7 lessons from AirSim: I ran the autonomous systems and robotics research effort at Microsoft for nearly a decade, and here are my biggest learnings. Complete blog: https://sca.fo/AAeoC

1. The "PyTorch moment" for robotics needs to come before the "ChatGPT moment". While there is anticipation around foundation models for robots, the scarcity of technical folks well versed in both deep ML and robotics and the lack of resources for rapid iteration present significant barriers. We need more experts working on robot and physical intelligence.

2. Most AI workloads on robots can be solved with deep learning. Building robot intelligence requires simultaneously solving a multitude of AI problems - perception, state estimation, mapping, planning, control, and more. We are increasingly seeing deep ML succeed across the entire robotics stack.

3. Existing robotic tools are suboptimal for deep ML. Most of the tools originated before the advent of deep ML and the cloud and were not designed with AI in mind. Legacy tools are hard to parallelize on GPU clusters. Infrastructure that is data-first, parallelizable, and integrates the cloud throughout the robot's lifecycle is a must.

4. Robotic foundation mosaics + agentic architectures are more likely to deliver than monolithic robot foundation models. The ability to program robots efficiently is one of the most requested use cases and a research area in itself; it currently takes a technical team weeks to program robot behavior. It is clear that foundation mosaics and agentic architectures can deliver huge value now.

5. Cloud + connectivity trumps compute on the edge - yes, even for robotics! Most operator-based robot enterprises either discard or only minimally catalog their data due to a lack of data management pipelines and connectivity. Robotics is truly a multitasking domain - a robot needs to solve multiple tasks at once - so a cloud connection for data management, model refinement, and the ability to make several inference calls simultaneously would be a game changer.

6. Current approaches to robot AI safety are inadequate. Safety research for robotics is at an interesting crossroads. Neurosymbolic representation and analysis is likely an important technique that will enable the application of safety frameworks to robotics.

7. Open source can add to the overhead. As a strong advocate for open source, I have shared much of my work. While open source offers many benefits, there are a few challenges, especially for robotics, that are less frequently discussed: robotics is a fragmented and siloed field, and initially there will likely be more users than contributors. Within large orgs, the scope of open-source initiatives may also face limits.

AirSim pushed the boundaries of the technology and provided deep insight into R&D processes. The future of robotics will be built on the principle of being open. Stay tuned as we continue to build @Scafoai
Autonomous Robotics for Advanced Applications
Summary
Autonomous robotics for advanced applications refers to intelligent robots capable of performing complex tasks independently by leveraging AI, machine learning, and adaptive systems. These innovations are shaping industries from search-and-rescue to research and production, offering groundbreaking capabilities for dynamic and challenging environments.
- Focus on adaptability: Develop robots with multi-modal abilities, such as transforming their forms or behaviors, to navigate diverse environments like disaster zones or space exploration settings.
- Leverage data and AI: Use cloud connectivity to enable continuous learning and real-time decision-making, allowing robots to handle multitasking and optimize performance effectively.
- Invest in safety and scalability: Employ advanced frameworks, such as agentic architectures or self-driving platforms, to enhance robotics automation and reliability in real-world applications.
🚀 Advancing World Models with V-JEPA 2 - Excited to share the latest milestone from one of the research teams I work with at Meta FAIR Core Learning & Reasoning. V-JEPA 2 is a world model built to understand physical reality, anticipate outcomes, and plan efficient strategies.

We humans use our ability to predict how the world will respond to our actions - or the actions of others - all the time, especially when planning what actions to take and how best to navigate new situations.

V-JEPA 2 is a self-supervised world model that reasons in a learned latent space, not in token space like LLMs. The encoder and predictor are pre-trained through self-supervised learning on visual data, leveraging abundant natural videos (more than 1 million hours) from diverse sources to bootstrap physical world understanding and prediction.

As Yann LeCun highlighted today at VivaTech, this represents a fundamental shift toward AI systems that can reason, plan, and act through physical world models. To reach advanced machine intelligence, AI must go beyond perception and understand how the physical world works - anticipating dynamics, causality, and consequences. V-JEPA 2 does just that.

Robotics as a powerful example: with just 62 hours of robot interaction and no task-specific training, V-JEPA 2 enables zero-shot planning and achieves up to 80% task success, reducing latency from minutes to seconds. It's a leap toward practical, adaptable robots that can operate in dynamic, real-world environments.

The tech community is taking note. On Hacker News, one commenter summed it up well: "It's important to keep some perspective: there are zero robots in the wild, at the moment, that use a world model to work on tasks they weren't specifically trained on. This is cutting edge research ..." - ricardobeat.

💡 If you're passionate about physical world modeling, causal understanding, or building intelligent systems that interact with the real world - join us. More information, the paper, and a set of explainer videos here:
🔗 Site: https://ai.meta.com/vjepa
🔗 Blog: https://lnkd.in/gwccegvT
🔗 Paper: https://lnkd.in/gFGDi4xG
🔗 Meta Careers: https://lnkd.in/g-MaBMc5
#AI #WorldModels #VJEPA #MetaAI #FAIR #EmbodiedAI #Robotics #CausalReasoning #LeCun #MachineLearning
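For readers curious what "planning in a learned latent space" can look like mechanically, here is a minimal illustrative sketch in PyTorch. It is not Meta's V-JEPA 2 code: the `Encoder`, `Predictor`, dimensions, and the random-shooting planner are all hypothetical stand-ins, meant only to show the idea of rolling a world model forward in latent space and picking the action sequence whose predicted outcome lands closest to an encoded goal.

```python
# Minimal sketch of planning with a latent world model (MPC by random shooting).
# This is NOT V-JEPA 2 code; the Encoder/Predictor below are hypothetical stand-ins
# with random weights, used only to illustrate the control loop described above.
import torch
import torch.nn as nn

LATENT, ACT = 64, 4          # latent and action dimensions (illustrative)

class Encoder(nn.Module):
    """Maps an observation (e.g. an image flattened to a vector) to a latent state."""
    def __init__(self, obs_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, LATENT))
    def forward(self, obs):
        return self.net(obs)

class Predictor(nn.Module):
    """Predicts the next latent state given the current latent state and an action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT + ACT, 128), nn.ReLU(), nn.Linear(128, LATENT))
    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

@torch.no_grad()
def plan(encoder, predictor, obs, goal_obs, horizon=5, n_candidates=256):
    """Pick the first action of the candidate sequence whose predicted rollout
    ends closest (in latent space) to the encoded goal observation."""
    z0 = encoder(obs)                                   # (1, LATENT)
    z_goal = encoder(goal_obs)                          # (1, LATENT)
    actions = torch.randn(n_candidates, horizon, ACT)   # random-shooting candidates
    z = z0.expand(n_candidates, -1)
    for t in range(horizon):
        z = predictor(z, actions[:, t])                 # roll the world model forward in latent space
    cost = ((z - z_goal) ** 2).sum(dim=-1)              # distance of final latent to goal latent
    best = cost.argmin()
    return actions[best, 0]                             # execute only the first action (MPC style)

if __name__ == "__main__":
    enc, pred = Encoder(), Predictor()
    obs, goal = torch.randn(1, 128), torch.randn(1, 128)
    print("next action:", plan(enc, pred, obs, goal))
```

A real system would replace the random-weight modules with pretrained video encoders and a far stronger planner, but the plan-in-latent-space loop has roughly this shape.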
-
A single robot that can drive like a car, stand upright to get a better view, crawl over tricky terrain, and even take off like a drone - all by adjusting the same four "limbs." That's what the M4 Morphobot from Caltech accomplishes. Each wheel can swivel and fold into different positions: as standard wheels for rolling, as "legs" to step over uneven ground, or as propellers for flight. In doing so, the machine sidesteps the limitations that often come with single-purpose designs.

How does it work? The M4 carries sensors and an onboard AI processor (NVIDIA Jetson Nano) that help it monitor its surroundings and plan routes in real time. For instance, it uses SLAM (Simultaneous Localization and Mapping) to build a map of the area on the fly, then relies on path-planning algorithms (like A*) to pick the best way forward. If it meets a gap or obstacle that rolling wheels can't handle, it can switch modes - standing up to get a better look or converting into a drone to fly over the blockage.

In real-world situations like search-and-rescue, one type of movement isn't always enough. Think about collapsed buildings, rugged wilderness, or areas struck by natural disasters. A robot with this kind of adaptability could roll quickly across clear ground, crawl under rubble, and then lift off to reach otherwise inaccessible places - all without specialized add-ons or multiple machines. For space exploration, a "rover-drone hybrid" could tackle rocky planetary surfaces, then take flight to jump over craters or cliffs. NASA's interest in multi-modal designs hints at a future where one shape-shifting robot might replace several single-mode explorers.

What do you think about the future of multi-modal robots with the power of AI?

#innovation #technology #future #management #startups
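The post mentions A* as the path-planning workhorse; for anyone unfamiliar with it, here is a minimal grid-based A* sketch in Python. It is illustrative only - the occupancy grid, start, and goal are made up, and the M4's actual planner operates on maps produced by SLAM from live sensor data.

```python
# Minimal A* grid planner sketch - illustrative only, not the M4's onboard code.
# The occupancy grid, start, and goal below are made-up examples.
import heapq

def astar(grid, start, goal):
    """A* over a 4-connected occupancy grid (0 = free, 1 = blocked).
    Returns a list of (row, col) cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    open_set = [(h(start), 0, start, [start])]                # (f, g, cell, path)
    best_g = {start: 0}
    while open_set:
        f, g, cell, path = heapq.heappop(open_set)
        if cell == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nbr = (cell[0] + dr, cell[1] + dc)
            if 0 <= nbr[0] < rows and 0 <= nbr[1] < cols and grid[nbr[0]][nbr[1]] == 0:
                ng = g + 1
                if ng < best_g.get(nbr, float("inf")):
                    best_g[nbr] = ng
                    heapq.heappush(open_set, (ng + h(nbr), ng, nbr, path + [nbr]))
    return None   # no drivable route - this is where a robot like the M4 would switch modes

if __name__ == "__main__":
    grid = [[0, 0, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0],
            [0, 1, 1, 0]]
    print(astar(grid, (0, 0), (3, 3)))
```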
-
The AI landscape is evolving beyond traditional models. We're now entering the Agentic AI era, where autonomous agents don't just respond to queries but plan, coordinate, and execute complex workflows - bringing true intelligence to automation.

Agentic AI refers to AI systems composed of multiple autonomous agents that can:
• Decompose complex tasks into subtasks
• Collaborate through structured workflows
• Leverage external tools & data for enriched decision-making
• Self-optimize based on feedback & environmental changes

Unlike standard AI models, Agentic AI doesn't wait for human prompts - it takes initiative, makes decisions, and dynamically adjusts its actions based on real-time data.

How Agentic AI Works:
➟ The Actor (Initiator) - The system or user triggering the workflow.
➟ The Supervisor (Orchestrator) - Manages tasks, delegates work, and monitors execution.
➟ AI Agents (Executors) - Autonomous units that perform tasks, collaborate, and adapt based on outcomes.
➟ External Tools & Data (Enhancers) - Includes RAG (Retrieval-Augmented Generation), search engines, computational resources, and APIs to augment knowledge and improve results.

Think of it as an AI-powered assembly line, where different agents specialize in specific jobs, ensuring efficiency and scalability.

Why Does Agentic AI Matter?
➟ Scalability - AI agents work in parallel, handling multi-step processes efficiently.
➟ Adaptability - They adjust dynamically to changing inputs, making them more reliable than static AI models.
➟ Autonomous Decision-Making - Unlike traditional AI that waits for instructions, Agentic AI actively solves problems and suggests improvements.
➟ Enhanced Productivity - By integrating external knowledge sources like RAG, search, and APIs, Agentic AI learns in real time and delivers more accurate results.

Real-World Applications of Agentic AI:
➟ AI-powered ETL Pipelines - Automating data extraction, transformation, and loading with autonomous workflow execution.
➟ AI-Driven Research Assistants - Multi-agent systems retrieving and synthesizing information from external sources.
➟ Autonomous Software Development - Agents writing, debugging, and deploying code without human intervention.
➟ Customer Support Automation - AI agents that dynamically adjust responses, perform transactions, and resolve issues without human escalation.

This is just the beginning of Agentic AI. As AI agents become more autonomous, we will see systems that:
➟ Self-improve by learning from failures and adapting to new challenges.
➟ Collaborate across different domains - AI agents working alongside humans in business, healthcare, finance, and tech.
➟ Expand reasoning capabilities through multi-modal data processing, integrating text, images, audio, and more.

Are you ready for the next wave of AI innovation?
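To make the Actor / Supervisor / Agents breakdown concrete, here is a minimal Python sketch of the orchestration pattern. Everything in it is hypothetical - the agent names, the fixed two-task decomposition, and the lambda handlers stand in for real LLM calls, RAG retrieval, and tool use - but it shows how a supervisor delegates subtasks to specialized executors and collects their results.

```python
# Minimal sketch of the Actor -> Supervisor -> Agents pattern described above.
# Everything here is hypothetical scaffolding (no real LLM, RAG, or API calls);
# it only illustrates how a supervisor might decompose a request and delegate it.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Task:
    name: str
    payload: str

class Agent:
    """Executor: handles one kind of task and reports a result."""
    def __init__(self, name: str, handler: Callable[[Task], str]):
        self.name, self.handler = name, handler
    def run(self, task: Task) -> str:
        return self.handler(task)

class Supervisor:
    """Orchestrator: decomposes a request into tasks, delegates, and collects results."""
    def __init__(self, agents: Dict[str, Agent]):
        self.agents = agents
    def decompose(self, request: str) -> List[Task]:
        # A real system would use an LLM here; this is a fixed toy decomposition.
        return [Task("retrieve", request), Task("summarize", request)]
    def handle(self, request: str) -> List[str]:
        results = []
        for task in self.decompose(request):
            agent = self.agents[task.name]          # delegate to the matching executor
            results.append(f"{agent.name}: {agent.run(task)}")
        return results

if __name__ == "__main__":
    agents = {
        "retrieve": Agent("Retriever", lambda t: f"found 3 documents about '{t.payload}'"),
        "summarize": Agent("Summarizer", lambda t: f"summary of '{t.payload}'"),
    }
    # The "Actor" is whoever triggers the workflow - here, this script.
    for line in Supervisor(agents).handle("warehouse robot incident reports"):
        print(line)
```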
-
Where will AI and digital technologies take R&D in the long run? That question comes up often in my conversations with R&D leaders. The near-term benefits are clear - improved data access, streamlined lab workflows, AI copilots - but the 5-year horizon still feels foggy.

One compelling direction is the rise of Self-Driving Laboratories (SDLs). SDLs sit at the intersection of AI, robotics, and materials science - beautifully captured in the Venn diagram from Low et al.'s recent ACS Omega Perspective (https://lnkd.in/eMUTfAzJ).

Here's how the pieces come together:
🔹 Software: AI plans experiments, analyzes outcomes, and adapts strategies in real time.
🔹 Hardware: Robotics executes those experiments with speed, precision, and around-the-clock repeatability.
🔹 Materials Science: Domain experts embed knowledge of feasibility and safety to keep the system grounded.

Together, they form a closed loop that allows chemistry labs to run autonomously and accelerate discovery cycles by orders of magnitude.

Of course, SDLs are still emerging. Devices remain specialized, robot coordination is complex, data systems are siloed, and AI planning needs refinement. Yet the architecture is perfectly poised to ride the next wave of AI, digital twins, and advanced robotics.

What's your take on SDLs or their role in your R&D program? Let's compare notes.
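As a rough picture of the closed loop described above (software plans, hardware executes, results feed the next plan), here is a minimal Python sketch. The simulated "experiment", the temperature range, and the greedy sample-near-the-best planner are all made-up placeholders - a real SDL would drive actual instruments and use a proper experiment planner such as Bayesian optimization.

```python
# Minimal sketch of an SDL-style closed loop: plan -> run -> analyze -> adapt.
# The "instrument" below is a made-up stand-in for a robot-executed experiment,
# and the planner is simple greedy sampling around the best result so far.
import random

def run_experiment(temperature_c: float) -> float:
    """Stand-in for a robot-executed experiment: returns a noisy measured yield."""
    true_optimum = 72.0
    return max(0.0, 1.0 - ((temperature_c - true_optimum) / 40.0) ** 2) + random.gauss(0, 0.02)

def propose_next(history):
    """Planner: sample near the best condition seen so far (toy 'adapt' step)."""
    if not history:
        return random.uniform(20.0, 120.0)             # initial exploration
    best_t, _ = max(history, key=lambda h: h[1])
    return min(120.0, max(20.0, best_t + random.gauss(0, 5.0)))

if __name__ == "__main__":
    random.seed(0)
    history = []                                       # (condition, measured outcome)
    for _ in range(20):                                # closed loop, no human in it
        t = propose_next(history)                      # software plans the experiment
        y = run_experiment(t)                          # hardware executes it
        history.append((t, y))                         # results feed the next plan
    best = max(history, key=lambda h: h[1])
    print(f"best condition found: {best[0]:.1f} C with yield {best[1]:.3f}")
```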