Stars
Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.
Cosmos-Reason2 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning.
Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
Cosmos-Predict2.5 is the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
This repository contains scripts for converting an ALOHA HDF5 dataset into the Lerobot, RMB and GR00T formats.
Isaac Lab - Arena is a robotics simulation framework that enhances NVIDIA Isaac Lab by providing a composable, scalable system for creating diverse simulation environments and evaluating robot lear…
A Unitree simulation environment built on Isaac Lab.
This repository implements teleoperation of the Unitree humanoid robot using XR devices.
NVIDIA Isaac Sim™ is an open-source application on NVIDIA Omniverse for developing, simulating, and testing AI-driven robots in realistic virtual environments.
WeiYu (微舆): a multi-agent public opinion analysis assistant for everyone. It breaks information cocoons, reconstructs the full picture of public sentiment, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.
🔥 The first open-sourced diffusion vision-language-action model.
GeRM: A Generalist Robotic Model with Mixture-of-Experts for Quadruped Robot https://songwxuan.github.io/GeRM/
Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.
LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥]
Native Multimodal Models are World Learners
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
An egocentric humanoid manipulation benchmark.
A humanoid dataset for learning.
Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model
Intrinsic Image Diffusion for Single-view Material Estimation
Isaac Sim/Lab in AWS, Azure, Google Cloud, Alibaba Cloud

