The moment AI learned to think in 3D just happened. AI at Meta's V-JEPA 2 isn't just another model release; it's the dawn of physical intelligence. While everyone's been obsessing over language models, Meta quietly solved something far more fundamental: teaching AI to understand the physics of reality.

Here's what blew my mind:
• Trained on 1M+ hours of raw video (no labels needed).
• Achieves 65-80% success on robot tasks with just 62 hours of robot data.
• 30x faster than NVIDIA's competing model.
• Enables zero-shot robot control in completely new environments.

But here's the real breakthrough: V-JEPA 2 understands that a ball will fall when dropped, objects don't vanish when hidden, and actions have consequences. It's basically giving AI the intuitive physics that every child develops by age 2.

Why this changes everything for builders:
→ No more massive datasets: Skip the "astronomical amounts" of training data that robotics usually demands.
→ True generalization: Deploy in new environments without retraining.
→ Physics-first reasoning: AI that plans based on how the world actually works.
→ Open source advantage: Meta released everything: models, code, benchmarks.

I've been building with agentic AI systems, and this represents the missing piece: spatial intelligence that rivals human intuition.

The applications are staggering:
• Warehouse robots that adapt to new products instantly.
• Autonomous vehicles with genuine spatial awareness.
• Manufacturing systems that predict and prevent failures.
• AR experiences that feel truly grounded in reality.

The best part? They made it completely open source. The research community can now iterate on world models without starting from zero. This isn't just about robotics; it's about AI that finally understands the 3D world we live in.

What are you going to build with physics-aware AI?
🔗 Interactive demo: https://ai.meta.com/vjepa/
📜 Paper: https://lnkd.in/dscUusY7
📚 GitHub: https://lnkd.in/dkkaSyG9
🔬 Try it: https://lnkd.in/d6gEuz43
📹 Video: https://lnkd.in/d2kHkRfQ
#AI #PhysicsAI #WorldModels #Robotics #AgenticAI #Innovation #Meta #OpenSource #FutureOfAI
Robotics Research Progress at Meta
Explore top LinkedIn content from expert professionals.
Summary
Robotics research progress at Meta highlights breakthroughs in artificial intelligence that allow robots to understand and interact with the physical world more intuitively, using models trained from raw video and images. This work aims to build AI systems that can reason about physics, plan actions, and improve themselves—making robotics more adaptable, scalable, and accessible for real-world tasks.
- Explore open-source tools: Take advantage of Meta’s released models and code to experiment with robotics applications without needing massive datasets or specialized skills.
- Streamline 3D creation: Use new AI-powered tools to quickly turn everyday videos, photos, or text prompts into usable 3D environments for simulation and testing.
- Embrace self-improving AI: Look into recent advances where robots can modify their own learning process, enabling constant improvement and adaptability across different tasks.
-
This week's defining shift for me is that creating 3D data is getting much simpler. New tools are turning everyday inputs like smartphone video, single photos, and text prompts into usable 3D environments and assets. This lowers the barrier to building the scenes, objects, and spaces that robotics, simulation, and immersive content rely on. It also shifts 3D creation from a specialized skill to something all teams can generate quickly and at the scale modern spatial systems require.

This week’s news surfaced signals like these:

🤖 Parallax Worlds raised $4.9 million to turn standard video into digital twins for robotics testing. The platform turns basic walkthrough videos into interactive 3D spaces that teams can use to run their robot software and see how it performs before sending anything into the field.

🪑 Meta introduced SAM 3D to reconstruct objects and people from single images, producing fully textured meshes even when subjects are partly hidden or shot from difficult angles. The models were trained using real-world data and a staged process to improve accuracy.

🌏 Meta unveiled WorldGen, a research tool that generates full 3D worlds from text prompts. It produces complete, navigable spaces that can be used in Unity or Unreal and shows how AI can create environments without manual modeling.

Why this matters: Faster 3D pipelines expand who can build, test, and refine spatial ideas. They turn 3D creation from a bottleneck into a regular part of development, which opens the door to more experimentation and better decisions earlier in the process.

#robotics #digitaltwins #simulation #VR #AR #virtualreality #spatialcomputing #physicalAI #AI #3D
-
This is one of the most interesting papers on self-improving agents this year. (bookmark this one)

Most self-improving AI systems hit the same wall: the mechanism that generates improvements is fixed and can't improve itself. This new work from Meta and collaborators breaks through this limitation.

They introduce Hyperagents, self-referential agents where the self-improvement process itself is editable. The DGM-Hyperagent (DGM-H) combines a task agent and a meta agent into a single modifiable program, enabling metacognitive self-modification. It autonomously discovers innovations like persistent memory and performance tracking, and these meta-improvements transfer across domains and compound across runs.

Why does it matter?
- On paper review, DGM-H improved from 0.0 to 0.710 test accuracy.
- On robotics reward design, it went from 0.060 to 0.372.
- Transfer hyperagents achieved 0.630 on Olympiad-level math grading in a domain they were never trained on.

This is a step toward AI systems that don't just find better solutions but continuously improve how they search for improvements.
-
BREAKING: Meta’s Chief AI Scientist Yann LeCun just dropped LeWorldModel, a new world model that trains stably end-to-end from raw pixels.

That matters because most people are watching the next Claude release while a much deeper infrastructure problem is getting solved: can AI learn a usable model of the physical world without a giant pile of training tricks?

LeWorldModel’s pitch is unusually concrete:
• only two loss terms
• tunable loss hyperparameters cut from six to one
• just 15M parameters
• trains 48x faster than foundation-model-based world models

The core claim is the important part: LeWorldModel is the first JEPA that trains stably end-to-end from raw pixels using a next-embedding prediction loss plus a regularizer that pushes latent embeddings toward a Gaussian distribution.

Translation for builders: this is an attempt to make world models simpler, smaller, and far less brittle. If that holds up, the win is not “one more model.” It’s a cleaner recipe for systems that can predict what should happen next and flag physically implausible events when the world state stops making sense.

Yann LeCun has been pushing this direction for years. When the 2018 Turing Award winner says the future may depend less on next-token prediction and more on learning world models, papers like this are the receipts.

The signal here is speed + stability + simplicity in one package. That’s how research jumps from demo to usable system 🔥
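To make the "two loss terms, one hyperparameter" idea from the post above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's code: the function names, the `lam` weight, and the toy inputs are hypothetical, and the regularizer is a simple moment-matching stand-in (per-dimension mean near 0, variance near 1) rather than whatever Gaussian regularizer LeWorldModel actually uses.

```python
from statistics import fmean, pvariance


def prediction_loss(predicted, target):
    """Mean squared error between predicted and actual next embeddings."""
    total, count = 0.0, 0
    for p_vec, t_vec in zip(predicted, target):
        for p, t in zip(p_vec, t_vec):
            total += (p - t) ** 2
            count += 1
    return total / count


def gaussian_regularizer(embeddings):
    """Stand-in regularizer (an assumption, not the paper's): penalize each
    embedding dimension whose batch mean drifts from 0 or variance from 1."""
    dims = list(zip(*embeddings))  # transpose: one tuple of values per dimension
    penalty = 0.0
    for values in dims:
        penalty += fmean(values) ** 2 + (pvariance(values) - 1.0) ** 2
    return penalty / len(dims)


def world_model_loss(predicted, target, embeddings, lam=0.1):
    """Two terms, one tunable weight (`lam` is a hypothetical name)."""
    return prediction_loss(predicted, target) + lam * gaussian_regularizer(embeddings)
```

With a batch whose embeddings already have zero mean and unit variance per dimension, the regularizer contributes nothing and the loss reduces to the prediction error alone; a collapsed batch (all embeddings identical) is penalized, which is the failure mode such a term is meant to discourage.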
-
Meta AI Releases V-JEPA 2: Open-Source Self-Supervised World Models for Understanding, Prediction, and Planning

Meta AI has released V-JEPA 2, an open-source video world model designed to learn from large-scale unlabeled video data using a self-supervised joint-embedding predictive architecture. Trained on over 1 million hours of internet-scale video and 1 million images, V-JEPA 2 excels at motion understanding, action anticipation, and video question answering. It achieves state-of-the-art performance on benchmarks like Something-Something v2 and Epic-Kitchens-100, without requiring language supervision during pretraining. Its architecture scales to over 1B parameters, leveraging advanced pretraining strategies such as progressive resolution and temporal extension to enable robust video representation learning.

In addition to perception tasks, Meta introduces V-JEPA 2-AC, an action-conditioned extension trained on just 62 hours of robot interaction data. This version enables zero-shot planning and manipulation on real-world robotic arms, performing tasks like grasping and pick-and-place using visual goals alone. Compared to other models like Octo and Cosmos, V-JEPA 2-AC offers faster inference and higher task success rates, without task-specific tuning or rewards. Together, V-JEPA 2 and its variants showcase a scalable and efficient path toward general-purpose embodied AI.

🧲 Read full article: https://lnkd.in/gH2BTZa7
🎓 Paper: https://lnkd.in/gQEDYMMQ
🔥 Models on Hugging Face: https://lnkd.in/g_Gw9ZW9
💡 GitHub Page: https://lnkd.in/gwCQj8wc

#artificialintelligence #robotics #ai #opensource
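For readers wondering what "planning using visual goals alone" can look like in code, the sketch below shows one generic approach: sample candidate action sequences, roll each through a predictor in embedding space, and keep the sequence whose predicted end state lands closest to the goal embedding. This is a toy under loud assumptions, not the V-JEPA 2-AC implementation. `toy_predictor` is a hypothetical additive stand-in for a learned action-conditioned model, the embeddings are plain lists of floats, and the search is simple random shooting.

```python
import random


def toy_predictor(state, action):
    """Hypothetical stand-in for a learned action-conditioned predictor:
    the next embedding is just the current embedding plus the action."""
    return [s + a for s, a in zip(state, action)]


def l1_distance(a, b):
    """Distance between two embeddings; lower means closer to the goal."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rollout(state, actions, predictor):
    """Apply a sequence of actions through the predictor, step by step."""
    for action in actions:
        state = predictor(state, action)
    return state


def plan(start, goal, predictor, horizon=3, samples=500, seed=0):
    """Random-shooting planner: try `samples` random action sequences and
    return the one whose predicted final embedding is nearest the goal."""
    rng = random.Random(seed)
    best_actions, best_cost = None, float("inf")
    for _ in range(samples):
        actions = [[rng.uniform(-1.0, 1.0) for _ in start] for _ in range(horizon)]
        cost = l1_distance(rollout(start, actions, predictor), goal)
        if cost < best_cost:
            best_actions, best_cost = actions, cost
    return best_actions, best_cost


if __name__ == "__main__":
    start_embedding = [0.0, 0.0]   # would come from encoding the current camera image
    goal_embedding = [1.0, -0.5]   # would come from encoding the goal image
    actions, cost = plan(start_embedding, goal_embedding, toy_predictor)
    print("first action to execute:", actions[0], "predicted distance to goal:", cost)
```

In a real system the robot would typically execute only the first action, observe a new image, and replan; no task-specific reward is needed because the goal image itself defines the cost.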
-
Meta just unveiled V-JEPA 2, and it might be the next leap toward truly intelligent AI agents.

The idea? Teach AI to understand the physical world like humans (or at least pets) do. V-JEPA 2 is Meta’s new “world model,” trained on over a million hours of video, designed to help AI reason about real-world physics: how objects move, what happens next, and what actions make sense. It's how a child knows a ball will bounce, or how a robot might understand it's time to move eggs from a pan to a plate.

Meta claims V-JEPA 2 is 30x faster than Nvidia’s Cosmos model. That's a bold statement, though it's worth noting that benchmarks vary.

What stands out to me isn’t just the speed or scale. It’s this: we’re witnessing the rise of AI that doesn’t just respond to prompts, but begins to understand context, motion, and cause-effect dynamics. That opens the door to real-world robotics, smarter automation, and more intuitive machine interactions.

At KnubiSoft, we’re constantly thinking about how software integrates with the real world, whether it's test automation, AI-driven tools, or user experiences, and this pushes us to reimagine what’s possible.

The age of physical-world-aware AI isn’t science fiction anymore. It’s already knocking on the door.

How do you see “world models” changing the future of robotics or automation?

#AI #MachineLearning #VJEPA2 https://lnkd.in/e8dzn-GT
-
🤖 Meta just made a quiet move… But it signals something much bigger than robotics.

They acquired Assured Robot Intelligence (ARI), a team focused on teaching robots how to understand and adapt to humans in real environments. This isn’t a hardware play. It’s a learning loop play.

🧠 Most people will say: “Meta is getting into humanoid robots”
That’s not the interesting part. The real move is this: training AI in the physical world.
* Not just text
* Not just images
* Not just simulations
But real-world interaction, feedback, and adaptation.

⚠️ Why this matters
Today’s AI is trained on static data. Tomorrow’s AI will be trained on experience.
* Touch
* Movement
* Trial and error
* Environmental feedback
That’s how you move from “smart assistant” → general intelligence systems. And Meta knows it.

🏗️ What ARI actually brings
The founding team isn’t random.
* Xiaolong Wang (ex-Nvidia, UCSD)
* Lerrel Pinto (ex-NYU, founded Fauna Robotics)
They’ve been working on foundation models for physical behavior. Not just “move arm from A to B” but:
* predict human behavior
* adapt in dynamic environments
* learn continuously
That’s a different category.

🔑 The strategic shift
We’re watching AI move from:
* digital intelligence → embodied intelligence
* training on data → training on interaction
* copilots → physical agents
And here’s the uncomfortable truth: You don’t get to AGI sitting behind a screen.

💰 Why the market looks so messy
Estimates range from $38B → $5T for humanoid robotics. That’s not disagreement. That’s uncertainty about timing. Everyone agrees this is big. No one agrees when it actually works at scale.

💭 The takeaway
Meta isn’t chasing robots. They’re chasing the next training frontier. Because whoever owns that… Owns what comes after today’s models.

💭 Question: If intelligence is about interaction with the real world… which companies are actually positioned to train it?

#AI #Robotics #AGI #Meta #Innovation #TechStrategy #FutureOfWork