✨Introducing 𝗥𝗢𝗠𝗔𝗡: A New Approach to 𝗥𝗼𝗯𝗼𝘁 𝗟𝗼𝗰𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻
Mason Peterson, a PhD student at Massachusetts Institute of Technology, has introduced 𝗥𝗢𝗠𝗔𝗡 (𝗥obust 𝗢bject 𝗠ap 𝗔lignment A𝗻ywhere), an innovative method for 𝘃𝗶𝗲𝘄-𝗶𝗻𝘃𝗮𝗿𝗶𝗮𝗻𝘁 𝗴𝗹𝗼𝗯𝗮𝗹 𝗹𝗼𝗰𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻, now available as a 𝗥𝗢𝗦 𝟮 package. This work addresses the challenges of aligning maps in complex environments, especially when robots observe the 𝘀𝗮𝗺𝗲 𝘀𝗰𝗲𝗻𝗲 from 𝗼𝗽𝗽𝗼𝘀𝗶𝘁𝗲 𝘃𝗶𝗲𝘄𝗽𝗼𝗶𝗻𝘁𝘀.
---
♦️𝗪𝗵𝘆 𝗥𝗢𝗠𝗔𝗡?
Current visual SLAM approaches often struggle with loop closures in environments where robots face 𝗼𝗽𝗽𝗼𝘀𝗶𝗻𝗴 𝗱𝗶𝗿𝗲𝗰𝘁𝗶𝗼𝗻𝘀 or where scenes are observed from significantly 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗽𝗲𝗿𝘀𝗽𝗲𝗰𝘁𝗶𝘃𝗲𝘀. ROMAN addresses these issues by leveraging 𝗼𝗽𝗲𝗻-𝘀𝗲𝘁 𝗼𝗯𝗷𝗲𝗰𝘁 𝗺𝗮𝗽𝗽𝗶𝗻𝗴 and incorporating 𝚘̲𝚋̲𝚓̲𝚎̲𝚌̲𝚝̲ ̲𝚐̲𝚎̲𝚘̲𝚖̲𝚎̲𝚝̲𝚛̲𝚢̲, 𝚜̲𝚑̲𝚊̲𝚙̲𝚎̲, and 𝚜̲𝚎̲𝚖̲𝚊̲𝚗̲𝚝̲𝚒̲𝚌̲ ̲𝚎̲𝚖̲𝚋̲𝚎̲𝚍̲𝚍̲𝚒̲𝚗̲𝚐̲𝚜̲ into its data association process. By enabling robots to detect loop closures under such conditions, ROMAN significantly improves localization accuracy, making it particularly useful for multi-robot systems and large-scale collaborative tasks.
---
💬𝗛𝗼𝘄 𝗱𝗼𝗲𝘀 𝗶𝘁 𝘄𝗼𝗿𝗸?
ROMAN consists of three main components:
1️⃣ 𝗠𝗮𝗽𝗽𝗶𝗻𝗴: Tracks object segments across RGB-D images, building detailed segment maps.
2️⃣ 𝗗𝗮𝘁𝗮 𝗔𝘀𝘀𝗼𝗰𝗶𝗮𝘁𝗶𝗼𝗻: Aligns maps by combining semantic attributes, shape geometry, and a gravity prior, ensuring robust matching even in complex scenes.
3️⃣ 𝗣𝗼𝘀𝗲 𝗚𝗿𝗮𝗽𝗵 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Optimizes robot trajectories using loop closures and visual-inertial odometry (VIO).
This pipeline results in improved global localization and better trajectory estimation, even in challenging environments.
---
📽️𝗗𝗲𝗺𝗼 𝗢𝘃𝗲𝗿𝘃𝗶𝗲𝘄
To better understand ROMAN’s capabilities, I ran the demo following the instructions provided in their GitHub repository. The demo uses a subset of the 𝗞𝗶𝗺𝗲𝗿𝗮 𝗠𝘂𝗹𝘁𝗶 𝗗𝗮𝘁𝗮𝘀𝗲𝘁 and showcases ROMAN’s open-set object mapping and object-based loop closure. Watch the attached video to see the demo in action.
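To make the data association idea above more concrete, here is a minimal Python sketch of how two object maps might be scored against each other using semantic embeddings and a shape cue. This is an illustration, not ROMAN's implementation: the dictionary fields (`embedding`, `volume`), the function names, and the choice to multiply the two scores are all assumptions, and the gravity prior is not modeled.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def ratio_similarity(a, b):
    """Smaller value divided by larger value; 1.0 means identical size."""
    if a <= 0.0 or b <= 0.0:
        return 0.0
    return min(a, b) / max(a, b)


def object_similarity(obj_a, obj_b):
    """Hypothetical score in [0, 1] combining semantic and shape agreement."""
    semantic = max(0.0, cosine_similarity(obj_a["embedding"], obj_b["embedding"]))
    shape = ratio_similarity(obj_a["volume"], obj_b["volume"])
    return semantic * shape


def candidate_associations(map_a, map_b, min_score=0.5):
    """Return (index_a, index_b, score) for pairs scoring >= min_score, best first."""
    pairs = []
    for i, obj_a in enumerate(map_a):
        for j, obj_b in enumerate(map_b):
            score = object_similarity(obj_a, obj_b)
            if score >= min_score:
                pairs.append((i, j, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Because the score depends only on per-object attributes and not on where the camera was, the same pair of objects scores the same from opposite viewpoints. A full pipeline would still need a geometric consistency check across the candidate pairs before accepting a loop closure.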
---
𝗘𝘅𝗽𝗹𝗼𝗿𝗲 𝗥𝗢𝗠𝗔𝗡
ROMAN is open-source and ready for integration into your robotics projects:
🔗 𝗦𝗼𝘂𝗿𝗰𝗲 𝗖𝗼𝗱𝗲: https://lnkd.in/dpJKUyV8
🔗 𝗥𝗢𝗦 𝟭 𝗣𝗮𝗰𝗸𝗮𝗴𝗲 (𝗪𝗿𝗮𝗽𝗽𝗲𝗿): https://lnkd.in/dGPCMBat
🔗 𝗥𝗢𝗦 𝟮 𝗣𝗮𝗰𝗸𝗮𝗴𝗲 (𝗪𝗿𝗮𝗽𝗽𝗲𝗿): https://lnkd.in/dZbDh7Rn
🔗 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗣𝗮𝗽𝗲𝗿: https://lnkd.in/dpgVd9eF
🔗 𝗠𝘆 𝗚𝗶𝘁𝗛𝘂𝗯: https://lnkd.in/d5Y3Kpve
This work represents an important step forward in global localization and multi-robot collaboration. Great job to Mason Peterson on this excellent contribution to the robotics community!
#Robotics #ROS2 #Localization #SLAM #Mapping #VIO #VisualSLAM
Latest Localization Technology in Robotics
Summary
The latest localization technology in robotics refers to advanced methods that help robots determine their exact position and navigate complex environments, even when traditional satellite-based systems like GPS are unavailable or unreliable. These new approaches use a mix of sensors, data fusion, and high-precision mapping to keep robots aware of their surroundings and ensure accurate movement.
- Embrace sensor fusion: Combine inputs from cameras, LiDAR, inertial sensors, and magnetometers to create a richer, more reliable picture of a robot’s location.
- Leverage 3D mapping: Build and reference detailed 3D maps of the environment so robots can localize themselves even in areas where GPS signals are blocked or degraded.
- Prioritize real-time corrections: Use onboard computing and collaborative networks to apply corrections and adapt to changing conditions for consistent navigation performance.
-
We’ve invested deeply in building the Bee’s positioning system from the ground up: not just plugging in off-the-shelf GPS, but rethinking how a device on the move should understand exactly where it is. Here’s a look at what’s happening under the hood to make the Bee’s localization both precise and resilient:
Signal-level GNSS access for superior sensor fusion
1. The Bee directly taps into raw GNSS signals rather than relying solely on pre-processed latitude/longitude outputs. By accessing low-level pseudorange, carrier-phase, and Doppler measurements, the Bee tightly integrates satellite data with its onboard IMU and magnetometer. This tight coupling produces consistently more accurate positioning than standard, loosely coupled sensor fusion approaches.
2. Satellite fault detection and exclusion. We already know that our GNSS module uses some satellites that it shouldn’t, because their signals are reflected, non-line-of-sight, etc. With signal-level access we can decide for ourselves which satellites to use, and thus improve positioning.
3. The Bee employs advanced Factor Graph Optimization (FGO) methods, combining multiple sensor inputs (including GNSS, IMU, and visual data) into a unified solution. FGO techniques have proven effective in high-precision positioning competitions, and the Bee leverages these methods to achieve improved location accuracy, especially in challenging urban environments.
Crowdsourced Corrections
The Bee benefits from a distributed global network of sensors, enabling it to perform self-generated ionospheric corrections. Instead of purchasing corrections from scattered base stations, the Bee uses its own extensive sensor network to directly measure and mitigate ionospheric interference. This approach provides accurate atmospheric corrections even in areas lacking traditional GNSS corrections infrastructure.
Visual positioning for GNSS-denied areas
By recognizing known visual features (aka map features), the Bee localizes itself even in GNSS-denied scenarios such as tunnels or dense urban settings. Over time, these mapped features further refine accuracy, creating a positive feedback loop for enhanced positioning performance.
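The satellite fault detection and exclusion step described in this post can be illustrated with a small residual-screening sketch in Python. It is a simplified stand-in, not the Bee's actual method: it assumes pseudorange residuals (measured minus predicted range, in metres) have already been computed against some prior position estimate, and the function name, the threshold, and the 1-metre scale floor are all hypothetical choices.

```python
import statistics


def exclude_faulty_satellites(residuals_m, threshold=3.0, min_scale_m=1.0):
    """Split satellites into (kept, excluded) using a robust z-score.

    residuals_m maps a satellite id to its pseudorange residual in metres.
    A satellite is excluded when its residual lies more than `threshold`
    robust standard deviations away from the median residual.
    """
    values = list(residuals_m.values())
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    # 1.4826 * MAD approximates the standard deviation for Gaussian noise.
    scale = max(1.4826 * statistics.median(deviations), min_scale_m)

    kept, excluded = [], []
    for sat_id, residual in residuals_m.items():
        if abs(residual - median) / scale > threshold:
            excluded.append(sat_id)
        else:
            kept.append(sat_id)
    return sorted(kept), sorted(excluded)


# Example: G19 carries a large multipath-like error and should be rejected.
residuals = {"G01": 0.8, "G07": -1.1, "G12": 0.3, "G19": 42.0, "G23": -0.5}
kept, excluded = exclude_faulty_satellites(residuals)
```

The median and MAD are used instead of the mean and standard deviation so that a single reflected signal cannot inflate the scale and hide itself.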
-
We're all familiar with the bag of words (BoW) approach commonly used in natural language processing to represent text as an unordered collection of words. But did you know that researchers have now extended this concept to 3D point clouds for real-time loop closure detection in LiDAR-based simultaneous localization and mapping (SLAM)?
In a recent robotics class, I learned about BoW3D, an innovative bag of words framework tailored for 3D LiDAR point clouds. The core idea behind BoW3D is to construct the vocabulary using LinK3D, an efficient and pose-invariant 3D point cloud descriptor that facilitates precise point-to-point matching. By representing the 3D features as words in a vocabulary and indexing them using a hash table, BoW3D enables quick retrieval of previously visited locations.
The true potential of BoW3D becomes apparent when it is integrated into a LiDAR odometry system. It not only efficiently detects loop closures but also calculates the complete 6-DoF pose transformation between the current and matched historical scans in real time. This loop correction serves as a vital constraint for pose graph optimization, helping to minimize drift and ensure global consistency.
Rigorous testing on the KITTI dataset has shown that BoW3D surpasses state-of-the-art methods in both place recognition accuracy and computational efficiency. With an impressive average processing time of a mere 48 ms per scan, BoW3D exhibits significant potential for enabling robust, large-scale 3D mapping in real-world scenarios.
As 3D sensors become more and more common in robotics and autonomous systems, being able to detect loop closures and correct drift in real time is going to be absolutely essential. That's why I'm so excited about BoW3D. Learning about this framework in class has really sparked my curiosity, and I can't wait to see how it evolves and shapes the future of 3D perception. :)
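The hash-table retrieval idea in this post can be sketched in a few lines of Python. This is only an illustration of an inverted index with voting, not BoW3D itself: the `WordIndex` class, its method names, and the representation of a "word" as any hashable value (for example a tuple of quantized descriptor entries) are assumptions, and no 6-DoF pose is estimated here.

```python
from collections import Counter, defaultdict


class WordIndex:
    """Inverted index: each word maps to the set of frames it was seen in."""

    def __init__(self):
        self._table = defaultdict(set)

    def add_frame(self, frame_id, words):
        """Register every word observed in a LiDAR frame."""
        for word in words:
            self._table[word].add(frame_id)

    def query(self, words, min_votes=2):
        """Return the stored frame sharing the most words, or None.

        Each distinct query word casts one vote for every frame that
        contains it; a candidate needs at least `min_votes` votes.
        """
        votes = Counter()
        for word in set(words):
            for frame_id in self._table.get(word, ()):
                votes[frame_id] += 1
        if not votes:
            return None
        frame_id, count = votes.most_common(1)[0]
        return frame_id if count >= min_votes else None


# Example with made-up quantized descriptors standing in for words.
index = WordIndex()
index.add_frame(0, [(3, 1), (7, 4), (9, 2), (5, 5)])
index.add_frame(1, [(1, 1), (2, 8), (6, 3)])
loop_candidate = index.query([(7, 4), (9, 2), (5, 5), (8, 8)])
```

Lookup cost grows with the number of query words rather than the number of stored frames, which is why a hash table suits revisiting detection over long trajectories.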
-
Navigation Without GNSS: The New Operational Standard in Drone Warfare
The war in Ukraine has proven that the era of UAVs relying solely on GNSS is over. The battlespace is saturated with electronic warfare systems that disrupt satellite signals across multiple frequencies. In this environment, even advanced CRPA antennas with eight elements have become ineffective. Jamming now comes from multiple directions with overwhelming power, rendering traditional spatial filtering obsolete.
A recent case on the Sumy axis illustrates the shift. After a Superkam (Skat) UAV was shot down, investigators found a high-precision altimeter and an onboard microcomputer. This indicates the use of terrain-referenced navigation, specifically digital elevation models (DEMs) that allow a UAV to determine its position by comparing terrain profiles rather than relying on external signals. Once reserved for cruise missiles (like TERCOM), this technology has now been adapted for tactical drones.
This is no longer experimental. UAVs like the V2U have been operating with terrain-matching capabilities for over a year. In parallel, visual navigation using EO or IR cameras with SLAM algorithms is gaining traction. These systems allow drones to localize themselves by comparing live camera feeds to reference imagery, even in complete GNSS denial.
Inertial Navigation Systems (INS) provide short-term positional awareness using internal sensors. Though they suffer from drift, they are highly valuable when fused with other data sources (terrain, visual, or barometric). Advanced UAVs now rely on multi-sensor fusion: combining INS, altimeters, EO/IR imagery, and map data to create resilient, redundant navigation systems.
A growing trend is local radio-based navigation using pseudo-satellites, RF beacons, or LTE/5G triangulation. In combat zones, however, reliance on national infrastructure is impractical. Instead, tactical forces must create their own positioning grid, using UAVs or ground-based transmitters.
This evolution demands a new mindset. Enhancing GNSS resilience is no longer enough. The very architecture of navigation must be rethought. Resilience must come from independence, not reinforcement.
Key implications:
- All medium- and long-range UAVs must support GNSS-free navigation.
- Terrain and visual databases are now strategic assets.
- INS and onboard computing are essential, not optional.
- Command systems must assume operations in GNSS-denied environments as the norm, not the exception.
In modern warfare, the winner won’t be the one with the strongest signal, but the one who no longer needs it. Autonomous navigation in signal-denied environments will define next-generation UAV effectiveness. If you’re designing a drone today, the first question should be: How will it navigate when nothing works? Because that is the new baseline.
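The terrain-profile comparison mentioned in this post can be illustrated with a toy one-dimensional example in Python. It is a textbook-style sketch, not a description of any fielded system: the function name is hypothetical, the DEM is reduced to a list of elevations along a known track, and removing the mean from each window (to tolerate a constant altimeter bias) is an assumed design choice.

```python
def match_terrain_profile(dem_profile, measured_profile):
    """Find where a measured elevation profile best fits a stored DEM track.

    Both inputs are lists of elevations in metres sampled at the same
    spacing. Returns (offset, cost): the start index in dem_profile with
    the lowest mean squared difference after removing each window's mean.
    """
    n = len(measured_profile)
    if n == 0 or n > len(dem_profile):
        raise ValueError("measured profile must be non-empty and fit inside the DEM")

    mean_measured = sum(measured_profile) / n
    best_offset, best_cost = None, float("inf")
    for offset in range(len(dem_profile) - n + 1):
        window = dem_profile[offset:offset + n]
        mean_window = sum(window) / n
        cost = sum(
            ((w - mean_window) - (m - mean_measured)) ** 2
            for w, m in zip(window, measured_profile)
        ) / n
        if cost < best_cost:
            best_offset, best_cost = offset, cost
    return best_offset, best_cost


# Example: the measurement is a DEM segment shifted by a 2 m altimeter bias.
dem = [100.0, 102.0, 105.0, 110.0, 108.0, 103.0, 101.0, 104.0, 109.0, 115.0]
measured = [h + 2.0 for h in dem[3:7]]
offset, cost = match_terrain_profile(dem, measured)
```

The match is only as informative as the terrain: over flat ground every window looks alike and the cost minimum becomes ambiguous, which is one reason the post stresses fusing terrain matching with INS and other sources.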
-
🎥 GNSS-denied localization using 3D LiDAR with #HiveRobotics
I'm thrilled to share a new case study developed in collaboration with #HiveRobotics, which designs autonomous delivery robots operating in highly diverse environments such as dense urban areas, vacation resorts and private sites.
These use cases are particularly challenging from a localization standpoint:
🚫 GNSS-denied or GNSS-degraded environments caused by tall buildings in urban areas, but also dense tree canopies and narrow paths
🤖 Small ground robots, driving close to the surface with limited sensor height
🌄 Uneven and non-standard terrains, introducing additional noise and dynamics in perception and motion
To address these constraints, #HiveRobotics deploys Exwayz’s map-based localization approach, which relies on the creation of high-accuracy 3D #LiDAR maps of the operating environment and embedded real-time localization running fully onboard the robot. By leveraging stable 3D geometric features of the environment, this approach delivers robust and precise positioning. This enables reliable autonomous navigation in complex outdoor settings, including areas where satellite-based localization may be limited or unavailable.
The video below shows the robot’s real-time 3D LiDAR #localization overlaid on the map, demonstrating stable positioning in real operating conditions. I personally love the 3D map background, where we can barely see the ground because of the vegetation density and height 🌲
Many thanks to the #HiveRobotics team for their trust 🙏🏻 and congratulations to them for tackling such demanding real-world autonomy use cases. This is where robotics truly gets interesting 🤖
A detailed case study is available on our website 👉🏻 link in the comments for those who want to explore the technical aspects further!
#Robotics #AutonomousSystems #LiDAR #3DMapping #Localization #GNSSDenied #OutdoorAutonomy #Exwayz #HiveRobotics Hassan Bouchiba Antoine Plat Romain Bonjean
-
🚀 GenZ-LIO: Rethinking Degeneracy-Aware LiDAR-Inertial Odometry
Excited to share our latest work, GenZ-LIO, which extends our GenZ-ICP to a LiDAR-inertial odometry system! 🎉
📄 Paper: https://lnkd.in/gNUJHgDn
🔍 Motivation
A wide range of prior works have proposed strategies to improve robustness against geometric degeneracy in LiDAR-Inertial Odometry (LIO). Building upon this line of research, we revisit degeneracy-aware LIO from a unified perspective and extend these ideas toward a more generalizable framework, aiming for a system that is both robust and computationally efficient in practice.
💡 Key Insight
Inspired by GenZ-ICP, GenZ-LIO jointly addresses scale and geometric degeneracy through two tightly coupled mechanisms.
👉 It combines a PID-inspired adaptive voxelization that regulates point density across varying scene scales with a hybrid-metric optimization that balances point-to-plane and point-to-point constraints based on reliability.
👉 Together, these enable seamless indoor–outdoor transitions without requiring manual tuning or environment-specific adjustments.
📊 Results
We thoroughly compared GenZ-LIO with other state-of-the-art approaches in test environments that we believe are quite harsh. GenZ-LIO prevents divergence, while some baseline approaches diverge under extreme transitions.
Would love to hear your thoughts! 🙌
#Robotics #SLAM #LiDAR #LIO #Degeneracy #AutonomousSystems #StateEstimation #Localization #SensorFusion #Research
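To give a feel for what "PID-inspired adaptive voxelization" could mean in practice, here is a minimal Python sketch that adjusts a voxel size so the downsampled point count tracks a target. It is not the GenZ-LIO implementation: the class and function names, the gains, the size limits, and the multiplicative update rule are all assumptions made for illustration.

```python
import math


def voxel_downsample(points, voxel_size):
    """Keep the first point seen in each occupied voxel."""
    kept = {}
    for point in points:
        key = tuple(math.floor(coord / voxel_size) for coord in point)
        kept.setdefault(key, point)
    return list(kept.values())


class VoxelSizeController:
    """PID-style regulator that steers the point count toward a target."""

    def __init__(self, target_points, kp=0.3, ki=0.05, kd=0.1,
                 min_size=0.05, max_size=2.0):
        self.target_points = target_points
        self.kp, self.ki, self.kd = kp, ki, kd
        self.min_size, self.max_size = min_size, max_size
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, voxel_size, point_count):
        """Return the voxel size to use for the next scan."""
        # Positive error means too many points, so the voxel should grow.
        error = (point_count - self.target_points) / self.target_points
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        correction = self.kp * error + self.ki * self.integral + self.kd * derivative
        new_size = voxel_size * (1.0 + correction)
        return min(self.max_size, max(self.min_size, new_size))
```

In a loop, each incoming scan would be downsampled with the current size, and `update` would be called with the resulting point count. A narrow corridor (few distinct voxels) then drives the size down, while an open outdoor scene drives it up, without a hand-tuned size per environment.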
-
OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics https://lnkd.in/edzZmVFx Simultaneous Localization and Mapping (SLAM) is a foundational component in robotics, AR/VR, and autonomous systems. With the rising focus on spatial AI in recent years, combining SLAM with semantic understanding has become increasingly important for enabling intelligent perception and interaction. Recent efforts have explored this integration, but they often rely on depth sensors or closed-set semantic models, limiting their scalability and adaptability in open-world environments. In this work, we present OpenMonoGS-SLAM, the first monocular SLAM framework that unifies 3D Gaussian Splatting (3DGS) with open-set semantic understanding. To achieve our goal, we leverage recent advances in Visual Foundation Models (VFMs), including MASt3R for visual geometry and SAM and CLIP for open-vocabulary semantics. These models provide robust generalization across diverse tasks, enabling accurate monocular camera tracking and mapping, as well as a rich understanding of semantics in open-world environments. Our method operates without any depth input or 3D semantic ground truth, relying solely on self-supervised learning objectives. Furthermore, we propose a memory mechanism specifically designed to manage high-dimensional semantic features, which effectively constructs Gaussian semantic feature maps, leading to strong overall performance. Experimental results demonstrate that our approach achieves performance comparable to or surpassing existing baselines in both closed-set and open-set segmentation tasks, all without relying on supplementary sensors such as depth maps or semantic annotations. --- Newsletter https://lnkd.in/emCkRuA More story https://lnkd.in/eMFcEekQ LinkedIn https://lnkd.in/ehrfPYQ6 #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning #ComputerVision
-
ROLO-SLAM [1]: Rotation-Optimized LiDAR-Only SLAM in Uneven Terrain with Ground Vehicle.
ROLO-SLAM is a lightweight and robust LiDAR-based SLAM solution designed to improve the accuracy of pose estimation for ground vehicles in rough terrain. It incorporates several algorithmic innovations that reduce pose estimation drift, particularly in the vertical direction, which is commonly observed when navigating uneven terrain. The method includes forward location prediction to coarsely eliminate the location differences between consecutive scans, enabling separate and accurate localization and orientation determination. Additionally, ROLO-SLAM features a parallel-capable spatial voxelization for correspondence matching, along with a spherical alignment-guided rotation registration to estimate vehicle rotation. By incorporating motion constraints into the optimization process, the algorithm estimates LiDAR translation quickly and effectively. Extensive experiments conducted across diverse environments demonstrate that ROLO-SLAM consistently achieves accurate pose estimation and outperforms existing state-of-the-art LiDAR SLAM solutions, making it a reliable choice for ground vehicle localization in perceptually challenging environments.
ROLO requires an input point cloud of type sensor_msgs::PointCloud2.
ROLO-SLAM mitigates vertical pose drift by dividing the front-end into three modules: forward location prediction for coarse translation estimation, voxelization matching for precise rotation estimation, and continuous-time translation estimation for improved accuracy. The back-end integrates scan-to-submap alignment and global factor graph optimization to enhance overall localization performance in challenging terrain.
#ros #ros2 #opensource #robot #robotics #lidar #odometry #slam #groundvehicle #lightweight #robust #navigation
[1] https://lnkd.in/dE7x4AuS
-
cuVSLAM: Accelerating Real-Time Visual SLAM With GPU Power In this episode of Artificial Intelligence: Papers and Concepts, we explore cuVSLAM, NVIDIA’s GPU-accelerated solution for visual simultaneous localization and mapping (SLAM). Designed for real-time applications like robotics, AR/VR, and autonomous systems, cuVSLAM enables machines to understand their position and map their surroundings efficiently using visual input. We break down why SLAM has traditionally been computationally intensive, how GPU acceleration transforms performance and scalability, and what this means for deploying real-time spatial intelligence in production environments. If you’re interested in robotics, computer vision, or real-time AI systems, this episode explains why cuVSLAM represents a major step forward in making high-performance mapping and localization more accessible and efficient. Resources: Paper Link: https://lnkd.in/gAbBcx-t Interested in Computer Vision and AI consulting and product development services? Email us at contact@bigvision.ai or visit us at https://bigvision.ai