We trained a humanoid with 22-DoF dexterous hands to assemble model cars, operate syringes, sort poker cards, and fold/roll shirts, all learned primarily from 20,000+ hours of egocentric human video with no robot in the loop. Humans are the most scalable embodiment on the planet. We discovered a near-perfect log-linear scaling law (R² = 0.998) between human video volume and action prediction loss, and this loss directly predicts real-robot success rate.
Humanoid robots will be the end game, because they are the practical form factor with minimal embodiment gap from humans. Call it the Bitter Lesson of robot hardware: the kinematic similarity lets us simply retarget human finger motion onto dexterous robot hand joints. No learned embeddings, no fancy transfer algorithms needed. Relative wrist motion + retargeted 22-DoF finger actions serve as a unified action space that carries through from pre-training to robot execution.
Our recipe is called "EgoScale":
- Pre-train GR00T N1.5 on 20K hours of human video, mid-train with only 4 hours (!) of robot play data with Sharpa hands. 54% gains over training from scratch across 5 highly dexterous tasks.
- Most surprising result: a *single* teleop demo is sufficient to learn a never-before-seen task. Our recipe enables extreme data efficiency.
- Although we pre-train in 22-DoF hand joint space, the policy transfers to a Unitree G1 with 7-DoF tri-finger hands. 30%+ gains over training on G1 data alone.
The scalable path to robot dexterity was never more robots. It was always us.
- Website: https://lnkd.in/gxzgeP-2
- Paper: https://lnkd.in/g7PJdz_8
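
To make the scaling-law claim in the post above concrete, here is a minimal sketch of how a log-linear fit and its R² could be computed. It is not the authors' code: the function name `fit_log_linear`, the base-10 logarithm, and every data point are assumptions for illustration. The losses are synthetic values generated from an exact line, so the fit is perfect by construction and says nothing about the real EgoScale measurements.

```python
import math

def fit_log_linear(hours, losses):
    """Least-squares fit of loss = a + b * log10(hours); returns (a, b, r2)."""
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(losses) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses))
    b = sxy / sxx                  # slope: change in loss per 10x more video
    a = mean_y - b * mean_x        # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, losses))
    ss_tot = sum((y - mean_y) ** 2 for y in losses)
    r2 = 1.0 - ss_res / ss_tot     # coefficient of determination
    return a, b, r2

# Synthetic placeholder data (NOT results from the paper): points on an exact line.
hours = [1_000, 2_000, 5_000, 10_000, 20_000]
losses = [1.0 - 0.1 * math.log10(h) for h in hours]

a, b, r2 = fit_log_linear(hours, losses)
print(f"loss = {a:.3f} + {b:.3f} * log10(hours), R^2 = {r2:.4f}")
```

With real measurements, an R² close to 1 on such a fit would mean that each tenfold increase in video hours lowers the loss by a roughly constant amount.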
Best Practices for Humanoid Robot Training Programs
Summary
Best practices for humanoid robot training programs focus on methods and approaches that help robots learn complex skills and adapt to real-world environments by mimicking human behavior, using simulation, and incorporating diverse data sources. These training programs aim to build robots that not only imitate actions but develop a deeper understanding of how the world works, allowing them to handle new tasks and unpredictable conditions.
- Expand training data: Use a variety of human videos, vision-language datasets, and cross-robot demonstrations to build robots that can generalize their skills and adapt to unfamiliar situations.
- Utilize simulation environments: Create diverse, realistic virtual scenarios with changing physics, terrain, and obstacles to prepare humanoid robots for the unpredictability of the real world.
- Streamline learning pipelines: Design training systems that allow for rapid iteration and parallel simulation, helping robots learn complex tasks like walking, grasping, and decision-making in a fraction of the time it would take in real life.
-
Closing the sim-to-real gap in humanoid robotics requires massive simulation throughput and high-fidelity physics validation. WPP recently detailed their engineering pipeline, showing how they reduced reinforcement learning cycle times for complex humanoid locomotion from 24 hours down to less than 60 minutes.
The hardware architecture relies on Google Cloud’s new G4 VMs (powered by NVIDIA RTX PRO 6000 Blackwell GPUs) running NVIDIA Isaac Sim, integrated closely with DeepMind’s MuJoCo physics engine.
The mechanics: The team mapped raw human mocap data (over 200 degrees of freedom) down to a constrained 29-DOF OpenUSD digital twin. By leveraging a P2P GPU topology to bypass central processing bottlenecks, the infrastructure executed over 3 billion simulations in under an hour. The virtual environment continuously introduced physical micro-variances (simulated pushes, shifting floor friction, and momentum changes) to train the model against the chaos of the real world.
The resulting reinforcement learning model was condensed into a highly efficient ONNX policy and deployed directly to the physical robot. This edge policy processes live IMU and joint telemetry to output immediate, stabilized motor commands. Reaching this scale of simulation volume is the precise engineering mechanism that allows control policies to handle unstructured physical deployment.
To support the research, Unitree has open-sourced the underlying RL code on GitHub.
Blog post: https://lnkd.in/g4-gWzTP
#Robotics #PhysicalAI #ReinforcementLearning #MuJoCo #GoogleCloud #IsaacSim #Engineering
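
The "physical micro-variances" described in the post above are commonly implemented as domain randomization: each simulated episode draws its own physics parameters. The sketch below shows the idea only. It is not WPP's or Unitree's pipeline, and the `EpisodeParams` fields, the `sample_episode_params` helper, and all numeric ranges are hypothetical values chosen for illustration.

```python
import random
from dataclasses import dataclass

@dataclass
class EpisodeParams:
    floor_friction: float   # friction coefficient applied to the floor
    push_force_n: float     # magnitude of one external push, in newtons
    push_time_s: float      # time within the episode when the push happens

def sample_episode_params(rng, episode_len_s=10.0):
    """Draw one randomized physics configuration (all ranges are assumed)."""
    return EpisodeParams(
        floor_friction=rng.uniform(0.4, 1.2),
        push_force_n=rng.uniform(0.0, 150.0),
        push_time_s=rng.uniform(0.0, episode_len_s),
    )

# Each parallel environment would receive its own draw before a rollout.
rng = random.Random(0)
batch = [sample_episode_params(rng) for _ in range(4)]
for params in batch:
    print(params)
```

A policy trained across many such draws cannot rely on one exact friction value or push timing, which is the usual argument for why it holds up better on the physical robot.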
-
HUMANOID ROBOTICS: Where Simulation Begins and Learning Awakens
Deep Learning in 3D Simulation is not a lab exercise. It is the moment we begin to teach machines how to exist. Not to repeat motions. Not to merely follow code. But to learn, adapt, balance, reason, and act with purpose. In my project we are not just building robots. We are building a new class of intelligence that experiences the world before it ever touches reality.
In these simulation environments, gravity does not remain constant. Terrain does not always cooperate. Obstacles change shape. Sensors lie. Friction shifts. And the humanoid must still stand, walk, grasp, adjust, optimize, and choose its next step. Domain randomization, reinforcement learning, hierarchical policies, and graph neural dependencies no longer sound like academic theory. They become survival tools. Machines begin to develop strategies. They learn how to carry payloads across unstable rubble. They learn energy discipline when battery is low and temperature is high. They learn trajectory planning not as geometry, but as survival logic.
When you combine photorealistic environments from Isaac Sim, contact-perfect physics in MuJoCo, embodied navigation in Habitat, and emergent behavior in Unity, you begin to see something different. You see machines build experience. You see memory. You see policy retention. You see adaptation. You see the beginning of abstract perception where simulation is not just testing, but education. It is the difference between teaching a robot how to walk and letting it discover how to navigate a collapsing environment with intelligence and intent.
This is where humanoid robotics becomes human oriented. Robots that can open doors without templates. Carry supplies without pre-programmed routes. Coordinate convoys. Assist in evacuation. Make real-time physical decisions aligned with mission objectives, not static instructions.
Simulation gives us time compression. We can give a single humanoid what would have taken humans years of trial. We can compress thousands of failures into one informed policy. This is how we transform capability. Not automation. Cognitive autonomy. Not motion planning. Motion intelligence. Not digital twins. Learning twins. We are building humanoids that do not just survive the environment. They learn from it.
If you are in advanced simulation, deep learning pipelines, physics engines, reinforcement learning, biomechanics, embodied cognition, ROS2, Isaac Sim, MuJoCo, Omniverse, Habitat, Unity, Unreal, LLM integration, perception or policy optimization… Then we should not be working apart. We should be building this together. And for those ready to build the next generation of thinking humanoids, Singularity Systems is now accepting collaborators, researchers, engineers, architects, and visionaries. Let’s teach machines how to exist.
#changetheworld #3D #unity
-
Tired of waiting hours for humanoids to learn to walk? Our new technical report shows how to train sim-to-real humanoid locomotion in 15 minutes with FastSAC and FastTD3! The full pipeline is open-source in the newly released Holosoma codebase. Project page: https://lnkd.in/dxiytcs9
The original FastTD3 showed strong off-policy RL potential, but only on a 12-DoF T1 humanoid with a "frozen" upper body. We scale up FastSAC and FastTD3 to full-body humanoid locomotion trained in 15 minutes, significantly outperforming PPO.
With careful design choices and minimalist reward functions, FastSAC and FastTD3 enable rapid end-to-end training of humanoid locomotion. Robots learn to walk in any direction and stay robust to pushes, all from just 15 minutes of end-to-end training on a single RTX 4090.
Our results go beyond locomotion: we demonstrate the sim-to-real deployment of whole-body tracking with off-policy RL algorithms. FastSAC can complete a full dancing sequence that lasts more than 2 minutes!
We see clear signs of scalability: performance improves with more parallel simulation and more gradient steps. Even better, the Holosoma codebase fully supports multi-GPU and multi-node training.
We kept things intentionally simple, and every implementation is available in the Holosoma repo. Can’t wait to see what the community builds from our recipe!
Work done at Amazon FAR with Younggyo Seo, Juyue Chen, Guanya Shi, Rocky Duan, and Pieter Abbeel. Arxiv link: https://lnkd.in/dYeGjtNe
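
For readers unfamiliar with the TD3 family that FastTD3 builds on, the sketch below shows two textbook TD3 ingredients: the clipped double-Q bootstrap target and target policy smoothing. This is the standard formulation, not the Holosoma implementation, and the report's actual design choices may differ. The function names, the scalar action, and the default hyperparameters are assumptions for illustration.

```python
import random

def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of two critic estimates.

    `done` is 1.0 when the episode ended at this transition, else 0.0.
    """
    bootstrap = min(next_q1, next_q2)
    return reward + gamma * (1.0 - done) * bootstrap

def smoothed_target_action(action, rng, noise_std=0.2, noise_clip=0.5, limit=1.0):
    """Target policy smoothing: add clipped Gaussian noise, then clamp to the action range."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(-limit, min(limit, action + noise))

rng = random.Random(0)
next_action = smoothed_target_action(0.9, rng)   # scalar action for simplicity
target = td3_target(reward=1.0, done=0.0, next_q1=5.0, next_q2=4.0, gamma=0.5)
print(next_action, target)
```

Because the target depends only on stored transitions, the same replayed data can feed many gradient steps. That reuse is what distinguishes off-policy methods like these from on-policy PPO.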
-
Humanoid robots need to adapt to different tasks, like moving around, handling objects while walking, and working on tables, each requiring a unique way to control the robot’s body. For instance, moving around focuses on tracking how fast the robot's base is moving, while working at a table relies more on controlling the robot's arm movements. Many current methods train robots with specific controls for each task, making it hard for them to switch between tasks smoothly. This new approach suggests using whole-body motion imitation to create a common base that can work for all tasks, helping robots learn general skills that apply to different types of control. With this idea, researchers developed HOVER (Humanoid Versatile Controller), a system that combines different control modes into one shared setup. HOVER allows robots to switch between tasks without losing the strengths needed for each one, making humanoid control easier and more flexible. This approach removes the need to retrain the robot for each task, making it more efficient and adaptable for future uses. The diverse team of researchers that developed HOVER come from: NVIDIA, Carnegie Mellon University, University of California, Berkeley, The University of Texas at Austin, and UC San Diego. 📝 Research Paper: https://lnkd.in/eMatAxMu 📊 Project Page: https://lnkd.in/eY4gzmme #robotics #research
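
One way to picture "different control modes in one shared setup" is a single command vector in which each mode activates only some fields, with a binary mask telling the policy which ones to track. The sketch below illustrates that idea and is not taken from the HOVER code. The field names, the mode table, and the `build_policy_input` helper are hypothetical.

```python
# Hypothetical command layout shared by every control mode.
COMMAND_FIELDS = ("base_velocity", "arm_joint_targets", "whole_body_keypoints")

# Which fields each mode asks the policy to track (assumed for illustration).
MODE_MASKS = {
    "locomotion": {"base_velocity"},
    "tabletop": {"arm_joint_targets"},
    "loco_manipulation": {"base_velocity", "arm_joint_targets"},
    "motion_imitation": {"whole_body_keypoints"},
}

def build_policy_input(mode, command):
    """Zero out inactive command fields and append a mask marking the active ones."""
    active = MODE_MASKS[mode]
    values, mask = [], []
    for field in COMMAND_FIELDS:
        vec = command[field]
        on = field in active
        values.extend(vec if on else [0.0] * len(vec))
        mask.extend([1.0 if on else 0.0] * len(vec))
    return values + mask

command = {
    "base_velocity": [0.5, 0.0],
    "arm_joint_targets": [0.1, 0.2, 0.3],
    "whole_body_keypoints": [1.0],
}
print(build_policy_input("locomotion", command))
print(build_policy_input("tabletop", command))
```

Since the input has the same length in every mode, one network can serve all of them, and switching tasks amounts to changing the mask rather than retraining.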