With recent advances in AI-based autonomous driving technology, accurate perception and stable decision-making based on spatio-temporal sensor data have become essential for mobile robots operating in complex and unstructured environments. However, the deep learning and spatio-temporal processing algorithms that enable these capabilities are dominated by memory-intensive and compute-intensive operations, and conventional general-purpose computing platforms struggle to satisfy these demands under strict real-time and power constraints. This paper presents two energy-efficient domain-specific processors for autonomous mobile robots developed through hardware–algorithm co-design: a semantic LiDAR SLAM processor (LSPU) and a multi-modal end-to-end driving processor (ABNP), both fabricated in 28-nm CMOS technology.

The first processor, the LSPU, is a dedicated accelerator for real-time semantic LiDAR SLAM that integrates spatial perception, localization, and mapping into a single hardware platform. To efficiently support the heterogeneous workloads of k-nearest neighbor (kNN) search, point neural network (PNN) inference, and non-linear optimization, the LSPU adopts a heterogeneous multi-core architecture composed of five specialized accelerators. A LiDAR-optimized spherical bin–based searching scheme, combined with spatio-temporal-aware computing, alleviates external memory bottlenecks in kNN operations, and a two-stage global point-level task scheduler improves PNN core utilization under irregular point distributions. The LSPU further incorporates a dynamic point removal–based keypoint extraction core and a reconfigurable non-linear optimization core that supports keypoint-level pipelining and parallel matrix computation. Consequently, the LSPU achieves real-time semantic LiDAR SLAM with a processing latency of 20.7 ms and an energy consumption of 17.48 mJ per frame.

The second processor, the ABNP, accelerates the complete end-to-end autonomous driving pipeline, from multi-modal sensor fusion to planning and control. To handle severe layer-wise sparsity variations in hybrid CNN/Transformer–based multi-modal driving models, the ABNP introduces a sparsity reasoning unit that exploits spatio-temporal correlations across past, current, and predicted future frames to dynamically reorganize heterogeneous sparse and dense cores. Moreover, a long-/short-term memory unit with progressive memory pruning significantly reduces the external memory overhead of temporal attention. As a result, the ABNP achieves real-time end-to-end driving at 10.3 frames per second with an energy consumption of 71.3 mJ per frame, demonstrating 218× higher energy efficiency than a state-of-the-art autonomous driving computing platform.
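The spherical bin–based search is only named above, so the following minimal Python sketch illustrates the general idea under stated assumptions: the bin resolutions (AZ_BINS, EL_BINS), the neighbor-search window, and all function names are hypothetical and make no claim to match the LSPU's actual dataflow. The sketch groups LiDAR points into azimuth/elevation bins so that each kNN query inspects only its own bin and adjacent bins rather than the full point cloud, which is the kind of access locality the abstract credits for reducing external memory traffic.

```python
import numpy as np

# Hypothetical bin resolutions; the real LSPU parameters are not given in the abstract.
AZ_BINS, EL_BINS = 360, 64

def spherical_bin_index(points):
    """Map 3-D LiDAR points to (azimuth, elevation) bin indices."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    az = np.arctan2(y, x)                                            # [-pi, pi]
    el = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))      # [-pi/2, pi/2]
    az_idx = ((az + np.pi) / (2 * np.pi) * AZ_BINS).astype(int) % AZ_BINS
    el_idx = np.clip(((el + np.pi / 2) / np.pi * EL_BINS).astype(int), 0, EL_BINS - 1)
    return az_idx, el_idx

def build_bins(points):
    """Group point indices by spherical bin so a query touches only nearby bins."""
    az_idx, el_idx = spherical_bin_index(points)
    bins = {}
    for i, key in enumerate(zip(az_idx, el_idx)):
        bins.setdefault(key, []).append(i)
    return bins

def knn_query(query, points, bins, k=5, window=1):
    """Approximate kNN: gather candidates from the query's bin and its neighbors."""
    qa, qe = spherical_bin_index(query[None, :])
    qa, qe = int(qa[0]), int(qe[0])
    candidates = []
    for da in range(-window, window + 1):
        for de in range(-window, window + 1):
            key = ((qa + da) % AZ_BINS, min(max(qe + de, 0), EL_BINS - 1))
            candidates.extend(bins.get(key, []))
    if not candidates:
        return np.array([], dtype=int)
    cand = np.asarray(candidates)
    dist = np.linalg.norm(points[cand] - query, axis=1)
    return cand[np.argsort(dist)[:k]]

# Example usage on a synthetic cloud:
# pts = np.random.randn(100_000, 3) * 10.0
# bins = build_bins(pts)
# neighbors = knn_query(pts[0], pts, bins, k=5)
```

Restricting each query to a small set of neighboring bins keeps the candidate list short and the memory accesses local; in a hardware realization this is what allows the working set to stay on-chip instead of repeatedly fetching the full point cloud from external memory.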
Publisher: Ulsan National Institute of Science and Technology