## I Introduction

Accurate real-time state estimation and mapping are necessary capabilities for mobile robots to perceive, plan, and navigate through unknown environments. LiDAR-based localization has recently become a viable option for many mobile platforms, including drones, due to lighter and smaller sensors. As a result, researchers have recently developed several new LiDAR odometry (LO) and LiDAR-inertial odometry (LIO) algorithms which often outperform vision-based approaches due to the LiDAR’s superior range and depth measurement accuracy. However, there are still fundamental challenges in developing reliable and accurate LO/LIO algorithms [2], especially for robots that execute agile maneuvers. In particular, using high-rate IMU measurements in LO/LIO pipelines—which can help correct warped scans during aggressive motions—can yield brittle algorithms that are hyper-sensitive to noise, calibration, or choice of parameters used by the backend solver. This work presents a hybrid LO/LIO framework that includes several algorithmic and architectural innovations to improve localization accuracy and map clarity by safely utilizing IMU measurements in a LiDAR-centric pipeline.

Combining LiDAR scans and IMU measurements is generally done in a loosely-coupled [21, 16, 10, 17, 3] or tightly-coupled [14, 18, 20, 9] manner.
In loosely-coupled methods (i.e., LO), measurements from a high-grade IMU are used as a prior for the point cloud alignment optimization or to correct motion-induced distortion in the LiDAR scan.
Due to their decoupled nature, LO algorithms are often quite robust [21].
Conversely, tightly-coupled methods (i.e., LIO) *explicitly* utilize IMU measurements in the point cloud alignment step resulting in more accurate trajectory estimates and maps.
Despite the improved accuracy, tightly-coupled methods are often hyper-sensitive to IMU noise and extrinsic calibration [14].
Real-time execution is also a challenge due to the large amount of data generated by the LiDAR and IMU.
A robust, real-time algorithm that combines the reliability of LO with the accuracy of LIO has yet to be developed.

In this work, we present Direct LiDAR-Inertial Odometry (DLIO), an accurate and reliable LiDAR-inertial odometry algorithm that provides fast localization and detailed 3D mapping (Fig. 1) through its hybrid LO/LIO architecture. There are four main contributions of this work. First, a novel interconnected architecture is proposed where an LO mapper and LIO pose graph fuser work in tandem to improve the performance and robustness of the other. Second, a high-fidelity motion model using high-rate IMU measurements is developed and incorporated into the proposed architecture to correct for motion-induced point cloud distortion, dramatically increasing the quality of localization and produced maps. Third, a significant reduction in computation is achieved through an efficient scan-matching approach for dense point clouds that utilizes high-rate state estimation to construct a full 6DOF prior, eliminating the common scan-to-scan alignment step. Finally, the efficacy of our approach is verified through extensive experimental results using benchmark, public, and self-collected datasets from around a university campus on consumer-grade and hobby-grade hardware and IMU sensors.

## II Related Work

Geometric LiDAR odometry algorithms rely on aligning point clouds by solving a nonlinear least-squares problem that minimizes the error across corresponding points and/or planes. To find point/plane correspondences, methods such as the iterative closest point (ICP) algorithm [1] or Generalized-ICP (GICP) [13] recursively match entities until alignment converges to a local minimum. Slow convergence is often observed when determining correspondences for a large set of points, so feature-based methods [21, 16, 14, 15, 11, 19, 9, 20] employ a precursor step that extracts only the most salient data points in a scan, e.g., edge or planar points, to decrease computation time. However, the efficacy of feature extraction is highly dependent on the specific implementation, and useful points are often discarded, resulting in less accurate estimates and maps. Conversely, dense methods [10, 17, 3, 18] directly align acquired scans but often rely heavily on aggressive voxelization, a process that can remove or alter important data correspondences, to achieve real-time performance.

LiDAR odometry approaches can also be broadly classified by how they incorporate other sensing modalities into the estimation pipeline. Loosely-coupled methods [21, 16, 10, 17, 3] process data sequentially; for example, IMU measurements are used to augment LiDAR scan registration by providing an optimization prior. These methods are often quite robust due to the precision of LiDAR measurements, but localization results can be less accurate since only a subset of all available data is used for estimation. Tightly-coupled methods [14, 18, 20, 9], on the other hand, can offer improved accuracy by jointly considering measurements from all sensing modalities, employing either graph-based optimization [14, 20, 9] or a recursive filtering framework, e.g., a Kalman filter [19, 18]. However, poor extrinsic calibration or high IMU noise can result in severe or catastrophic localization error due to bad scan placement in subsequent scan-matching iterations.

Incorporating additional sensors can also aid in correcting motion-induced point cloud distortion. For example, LOAM [21] estimated sensor velocity either through scan-matching-based techniques or a loosely-coupled IMU and modeled inter-sweep motion under a constant angular and linear velocity assumption; point clouds were subsequently deskewed via linear interpolation according to each point's timestamp. Similarly, LIO-SAM [14] formulated LiDAR-inertial odometry atop a factor graph to jointly optimize for body velocity; in its implementation, points were motion corrected by interpolating rotational movement via the ratio between each point's relative timestamp and the total sweep time. Other works have proposed potentially more accurate methods [4, 12] but have not incorporated them into a fully self-contained pipeline.

To address these shortcomings, DLIO proposes a dual LO/LIO architecture that targets the mapping and state estimation problems separately. DLIO retains the robustness of loosely-coupled approaches in that errors from fusing noisy IMU measurements do not propagate over time, which dramatically decreases the reliance on precise parameter calibration and instead leans on the accurate LiDAR measurements for map generation. Likewise, DLIO couples LiDAR scan-matching with IMU preintegration [5] through a factor graph [8] and IMU propagation for high-rate state estimation, providing the mapping module with up-to-date velocity and bias estimates for motion correction and scan-matching initialization. The key to our approach is combining the advantages of both loosely-coupled and tightly-coupled schemes to tackle the map building and state estimation problems in parallel. Moreover, this arrangement enables DLIO to perform more precise point cloud motion correction without any simplifying assumption, e.g., constant velocity, on the motion during the LiDAR scan.

## III Method

### III-A System Overview

DLIO is a hybrid LO/LIO architecture that generates geometric maps and robot state estimates through two main components (Fig. 2). The first is a fast LiDAR odometry scan-matcher which registers a dense, motion-corrected point cloud into the robot's global map by aligning it with an extracted local submap via a custom GICP-based optimizer initialized with a preconstructed prior. The second models state estimation as a factor graph with Gaussian noise, fusing the first component's output with a preintegrated IMU factor for state estimation at LiDAR rate (10-20Hz). This estimate is subsequently propagated through a high-rate IMU integrator for state estimation at IMU rate (100-500Hz), whose output contains the final pose, velocity, and biases. These terms are then used to correct subsequent IMU measurements and to initialize integration for the next iteration's motion correction and scan-matching.

### III-B Notation

Let the point cloud for a single LiDAR sweep started at time $t_k$ be denoted as $\mathcal{P}_k$. The point cloud is composed of points $p_k^n \in \mathbb{R}^3$ that are measured at a time $\Delta t_k^n$ relative to the start of the scan and indexed by $n = 1, \dots, N$, where $N$ is the total number of points in the scan. The world frame is denoted as $\mathcal{W}$ and the robot's base link as $\mathcal{B}$, with the convention that $x$ points forward, $y$ left, and $z$ up. The IMU sensor's coordinate system is denoted as $\mathcal{I}$ and the LiDAR's as $\mathcal{L}$, where $\mathcal{I}$ and $\mathcal{B}$ coincide translationally but not necessarily rotationally; $\mathcal{L}$ and $\mathcal{B}$ do not necessarily coincide. The robot's state vector $\mathbf{X}_k$ at time $t_k$ is defined as

$$\mathbf{X}_k = \left[\, \mathbf{p}_k^\top,\ \mathbf{q}_k^\top,\ \mathbf{v}_k^\top,\ \mathbf{b}_k^{a\,\top},\ \mathbf{b}_k^{\omega\,\top} \,\right]^\top \tag{1}$$

where $\mathbf{p}_k \in \mathbb{R}^3$ is the robot's position, $\mathbf{q}_k$ is the orientation encoded by a four-vector quaternion on $\mathbb{S}^3$ under Hamilton notation, $\mathbf{v}_k \in \mathbb{R}^3$ is the body velocity, $\mathbf{b}_k^a$ is the accelerometer's bias, and $\mathbf{b}_k^\omega$ is the gyroscope's bias. Measured body linear acceleration and angular velocity from an IMU are modeled as

$$\hat{\mathbf{a}}_t = \mathbf{R}_t^\top \left( \mathbf{a}_t - \mathbf{g} \right) + \mathbf{b}_t^a + \mathbf{n}_t^a \tag{2}$$

$$\hat{\boldsymbol{\omega}}_t = \boldsymbol{\omega}_t + \mathbf{b}_t^\omega + \mathbf{n}_t^\omega \tag{3}$$

where $\hat{\mathbf{a}}_t$ and $\hat{\boldsymbol{\omega}}_t$ are the raw sensor measurements with biases $\mathbf{b}_t^a$, $\mathbf{b}_t^\omega$ and white noise $\mathbf{n}_t^a$, $\mathbf{n}_t^\omega$, and $\mathbf{g}$ is the gravity vector. In this work, we address the following problem: given a new LiDAR scan $\mathcal{P}_k$ and IMU measurements $\hat{\mathbf{a}}_i$ and $\hat{\boldsymbol{\omega}}_i$, estimate the robot's current state $\hat{\mathbf{X}}_k$ and global map $\mathcal{M}_k$.

### III-C Preprocessing

The inputs to DLIO are a dense 3D point cloud collected by a modern 360° LiDAR, such as an Ouster or a Velodyne (10-20Hz), in addition to linear acceleration and angular velocity measurements from a 6-axis IMU at a much higher rate (100-500Hz). To minimize information loss, we do not preprocess the point cloud except for a 1m box filter around the origin which removes points that may belong to the robot itself. This distinguishes our work from others that either attempt to detect features (e.g., edges and corners) or heavily downsample the cloud through a voxel filter. On average, the point clouds used in this work contained 16,000 points. Note that for slower processors or at higher resolutions, voxelization may be beneficial for more consistent performance. No additional processing of IMU data was performed.
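As a rough illustration, the self-hit removal described above can be sketched as a simple crop around the sensor origin. This is a minimal sketch, not the paper's implementation: the function name, the tuple-based cloud representation, and the reading of "size 1m" as a 1m half-width are our assumptions.

```python
def box_filter(points, half_size=1.0):
    """Drop points inside an axis-aligned box centered at the sensor origin.

    `points` is an iterable of (x, y, z) tuples; `half_size` is the assumed
    box half-width in meters, used to remove returns from the robot itself.
    """
    return [p for p in points
            if not (abs(p[0]) <= half_size and
                    abs(p[1]) <= half_size and
                    abs(p[2]) <= half_size)]
```

Points on the robot body (within the box) are removed; everything else passes through untouched, preserving the dense cloud for scan-matching.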

### III-D Motion Correction

Point clouds from spinning LiDAR sensors suffer from motion distortion during movement since the rotating laser array collects points at different times. Rather than assuming a constant velocity model throughout a sweep and deskewing the point cloud via linear interpolation as previous works have done [21, 14, 11], we instead employ a more accurate motion model which corrects the point cloud by transforming each skewed point into a coordinate system that approximates its measured frame (Fig. 3).

Let $t_k$ be the clock time of the point cloud $\mathcal{P}_k$ received at discrete time $k$, which corresponds to the beginning of the sweep, i.e., the timestamp when the first point was acquired, and let $\Delta t_k^n$ be the timestamp of point $p_k^n$, which is relative to $t_k$. In a distorted scan, all points in the cloud are associated with the same frame, but each point is in fact measured somewhere along the trajectory between the beginning of the sweep and the end, and therefore should be compensated accordingly. To infer these measurement locations, we integrate the linear acceleration and angular velocity measurements between $t_k$ and $t_{k+1}$ such that

$$\hat{\mathbf{q}}_{i+1} = \hat{\mathbf{q}}_i \otimes \exp\!\left(\tfrac{1}{2}\hat{\boldsymbol{\omega}}_i \Delta t_i\right), \quad \hat{\mathbf{v}}_{i+1} = \hat{\mathbf{v}}_i + \hat{\mathbf{R}}_i \hat{\mathbf{a}}_i \Delta t_i, \quad \hat{\mathbf{p}}_{i+1} = \hat{\mathbf{p}}_i + \hat{\mathbf{v}}_i \Delta t_i + \tfrac{1}{2}\hat{\mathbf{R}}_i \hat{\mathbf{a}}_i \Delta t_i^2 \tag{4}$$

where $i = 0, \dots, M\!-\!1$ and $M$ is the number of IMU measurements between the two LiDAR scans. The frames are then defined as homogeneous transformations $\hat{\mathbf{T}}_i \in SE(3)$ that correspond to $\hat{\mathbf{p}}_i$ and $\hat{\mathbf{R}}_i$, the rotation matrix corresponding to $\hat{\mathbf{q}}_i$. Each $\hat{\mathbf{T}}_i$ has an associated clock timestamp $t_i$ from the corresponding IMU measurement's stamp, and point $p_k^n$ can be deskewed via $\hat{\mathbf{T}}_{i^*}$, the transformation with timestamp nearest the point's, such that

$$p_k^{n*} = \hat{\mathbf{T}}_{i^*}\, p_k^n \,, \qquad i^* = \underset{i}{\arg\min}\ \left|\, t_i - \left(t_k + \Delta t_k^n\right) \right| \tag{5}$$

for each point. Importantly, the accuracy of Eq. (5) depends on the accuracy of $\hat{\mathbf{v}}_0$, the initial estimate of body velocity before integration, and $\hat{\mathbf{b}}^a$ and $\hat{\mathbf{b}}^\omega$, the estimated IMU biases, at the time of motion correction. To enforce this, state estimation should occur as frequently as possible lest the velocity and bias terms become outdated during fast or sudden maneuvers, which would ultimately result in inaccurate or poor motion correction.
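The integration and deskewing steps above can be sketched in simplified planar (x, y, yaw) form. The 2D restriction, the function names, and the assumption that IMU samples are already bias-corrected are ours; the paper operates on full SE(3) transforms.

```python
import math

def integrate_planar(p0, yaw0, v0, imu_samples):
    """Planar sketch of Eq. (4): integrate (t, ax, ay, wz) IMU samples from
    the start of the sweep, returning a list of (t, x, y, yaw) poses.
    Samples are assumed bias- and gravity-corrected."""
    x, y, yaw = p0[0], p0[1], yaw0
    vx, vy = v0
    poses = [(imu_samples[0][0], x, y, yaw)]
    for (t0, ax, ay, wz), (t1, _, _, _) in zip(imu_samples, imu_samples[1:]):
        dt = t1 - t0
        # rotate body-frame acceleration into the world frame
        awx = math.cos(yaw) * ax - math.sin(yaw) * ay
        awy = math.sin(yaw) * ax + math.cos(yaw) * ay
        x += vx * dt + 0.5 * awx * dt * dt
        y += vy * dt + 0.5 * awy * dt * dt
        vx += awx * dt
        vy += awy * dt
        yaw += wz * dt
        poses.append((t1, x, y, yaw))
    return poses

def deskew_point(point, t_point, poses):
    """Planar sketch of Eq. (5): express a skewed point in the integrated
    pose whose timestamp is nearest the point's measurement time."""
    _, x, y, yaw = min(poses, key=lambda p: abs(p[0] - t_point))
    px = x + math.cos(yaw) * point[0] - math.sin(yaw) * point[1]
    py = y + math.sin(yaw) * point[0] + math.cos(yaw) * point[1]
    return (px, py)
```

For a robot translating at 1 m/s with no rotation, a point measured 0.1s into the sweep is shifted 0.1m along the motion direction, which is exactly the distortion this step removes.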

### III-E Direct LiDAR Odometry

Motion-corrected point clouds are aligned to a local submap using Direct LiDAR Odometry (DLO), a fast keyframe-based frontend LiDAR odometry solution derived from our previous work [3] but with several notable improvements. First, rather than just a rotational component as in [3], we additionally construct a translational prior through IMU integration to initialize the subsequent scan-to-map optimization. Scan-to-map optimization for an accurate translational estimate can be infeasible with dense scans due to the large number of points, so previous dense methods have relied on a "scan-to-scan" module prior to map registration [17, 10]. However, DLIO's ability to estimate velocity and sensor bias at a high rate eliminates this requirement for dense methods and further reduces overall overhead. Furthermore, we enhance DLO's adaptability by computing an additional point cloud "density" metric which scales our GICP solver's maximum correspondence distance and provides consistent performance in both small and large-scale environments without the need for parameter tuning. A third enhancement concerns our multithreaded implementation, which reduces spikes in computational load during significant environmental changes and is detailed in Section III-G2.

#### III-E1 Optimization Prior

Point cloud alignment via GICP is cast as a nonlinear optimization problem, and the process can be initialized with a transformation prior to increase convergence speed and decrease the chance of converging to a sub-optimal local minimum. This prior represents an initial guess of the pose before being refined by scan-matching. With DLIO, both rotational and translational components of the prior can be retrieved by integrating linear acceleration and angular velocity measurements from the IMU. The procedure follows similarly to Eq. (4) but with the notable difference of integrating measurements between $t_{k-1}$ and $t_k$ such that

$$\Delta\hat{\mathbf{q}}_{j+1} = \Delta\hat{\mathbf{q}}_j \otimes \exp\!\left(\tfrac{1}{2}\hat{\boldsymbol{\omega}}_j \Delta t_j\right), \quad \Delta\hat{\mathbf{p}}_{j+1} = \Delta\hat{\mathbf{p}}_j + \hat{\mathbf{v}}_j \Delta t_j + \tfrac{1}{2}\Delta\hat{\mathbf{R}}_j \hat{\mathbf{a}}_j \Delta t_j^2 \tag{6}$$

for $j = 0, \dots, L\!-\!1$, where $L$ is the number of measurements between $t_{k-1}$ and $t_k$, $\Delta\hat{\mathbf{p}}_0 = \mathbf{0}$ and $\Delta\hat{\mathbf{q}}_0$ is initialized as the identity quaternion, and the velocity at each step can be calculated as $\hat{\mathbf{v}}_{j+1} = \hat{\mathbf{v}}_j + \Delta\hat{\mathbf{R}}_j \hat{\mathbf{a}}_j \Delta t_j$, where $\hat{\mathbf{v}}_0$ is the estimated body velocity during the acquisition of the previous scan. Then the scan-to-map pose prior $\widetilde{\mathbf{T}}_k$ can be computed via $\widetilde{\mathbf{T}}_k = \hat{\mathbf{T}}_{k-1}\, \Delta\hat{\mathbf{T}}_k$, where $\Delta\hat{\mathbf{T}}_k$ is the matrix corresponding to $\Delta\hat{\mathbf{p}}_L$ and $\Delta\hat{\mathbf{q}}_L$, and $\hat{\mathbf{T}}_{k-1}$ is the output of scan-matching from the previous time step.
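A planar 3DOF sketch of constructing the scan-to-map prior under the same simplifications as before (2D instead of SE(3), bias-corrected samples assumed; the function name and argument layout are hypothetical):

```python
import math

def scan_match_prior(prev_pose, prev_vel, imu_samples):
    """Compose the previous scan-match pose with an IMU-integrated relative
    motion to form a scan-to-map prior (planar sketch of Eq. 6).

    prev_pose:   (x, y, yaw) output of the previous scan-to-map registration.
    prev_vel:    (vx, vy) body velocity estimate at the previous scan.
    imu_samples: list of (t, ax, ay, wz) bias-corrected samples between scans.
    """
    dx = dy = dyaw = 0.0
    vx, vy = prev_vel
    for (t0, ax, ay, wz), (t1, _, _, _) in zip(imu_samples, imu_samples[1:]):
        dt = t1 - t0
        # accumulate relative motion in the frame of the previous scan
        c, s = math.cos(dyaw), math.sin(dyaw)
        arx = c * ax - s * ay
        ary = s * ax + c * ay
        dx += vx * dt + 0.5 * arx * dt * dt
        dy += vy * dt + 0.5 * ary * dt * dt
        vx += arx * dt
        vy += ary * dt
        dyaw += wz * dt
    # compose the relative motion with the previous world pose
    x0, y0, yaw0 = prev_pose
    c, s = math.cos(yaw0), math.sin(yaw0)
    return (x0 + c * dx - s * dy, y0 + s * dx + c * dy, yaw0 + dyaw)
```

The returned pose seeds the GICP optimization; because both rotation and translation come from integration, no separate scan-to-scan stage is needed.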

#### III-E2 Scan-to-Map Matching

Since dead reckoning from an IMU is unreliable, we refine the prior $\widetilde{\mathbf{T}}_k$ by registering the current scan onto a locally extracted submap via a custom GICP-based plane-to-plane solver to be more globally consistent. Let $\widehat{\mathcal{P}}_k$ be the motion-corrected LiDAR scan and $\mathcal{S}_k$ be a local keyframe-based submap extracted via nearest neighbor and convex hull on keyframe positions as described in [3]. Then, the objective of the scan-to-map optimization is

$$\hat{\mathbf{T}}_k = \underset{\mathbf{T}_k}{\arg\min}\ \mathcal{E}\left(\mathbf{T}_k \widehat{\mathcal{P}}_k,\ \mathcal{S}_k\right) \tag{7}$$

where the GICP plane-to-plane residual error $\mathcal{E}$ is defined as

$$\mathcal{E}\left(\mathbf{T}_k \widehat{\mathcal{P}}_k,\ \mathcal{S}_k\right) = \sum_{c \in \mathcal{C}} d_c^\top \left( C_c^{\mathcal{S}} + \mathbf{T}_k C_c^{\mathcal{P}} \mathbf{T}_k^\top \right)^{-1} d_c$$

for a set $\mathcal{C}$ of corresponding points between $\widehat{\mathcal{P}}_k$ and $\mathcal{S}_k$ at timestep $k$, where $d_c = s_c - \mathbf{T}_k p_c$, $p_c \in \widehat{\mathcal{P}}_k$, $s_c \in \mathcal{S}_k$, and $C_c^{\mathcal{P}}$ and $C_c^{\mathcal{S}}$ are the estimated covariance matrices for point cloud $\widehat{\mathcal{P}}_k$ and local submap $\mathcal{S}_k$, respectively.

An important parameter that is often overlooked is the maximum distance at which corresponding points or planes should be considered in the optimization. This parameter is often hand-tuned by the user but should scale with the environmental structure for consistency and computational efficiency. For example, in small-scale environments (e.g., a lab room), points in the LiDAR scan are much closer together, so a small movement has a small effect on the displacement of a given point. In contrast, a point in a more open environment (e.g., on a distant tree outside) will be displaced farther by the same small rotational movement due to the larger distance, and therefore needs a greater search radius for correct correspondence matching. Thus, we set the GICP solver's maximum correspondence search distance between two point clouds according to the "density" of the current scan, defined as $D_k = \tfrac{1}{N}\sum_{n=1}^{N} d_n$, where $D_k$ is the average per-point density, $d_n$ is the average Euclidean distance to the $K$ nearest neighbors of point $p_k^n$, and $\alpha$ and $\beta$ are smoothing constants used to produce $\widetilde{D}_k = \alpha\,\widetilde{D}_{k-1} + \beta\,D_k$, the filtered signal set as the final maximum correspondence distance.
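The adaptive correspondence distance can be illustrated with a brute-force sketch. The nearest-neighbor search would use a kd-tree in practice, and the smoothing constant value is an assumption, not a parameter from the paper.

```python
import math

def scan_density(points, k=5):
    """Average distance to the k nearest neighbors, averaged over the scan.

    Brute-force O(n^2) sketch; a kd-tree would be used on real clouds.
    """
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q)
                       for j, q in enumerate(points) if j != i)
        total += sum(dists[:k]) / min(k, len(dists))
    return total / len(points)

def adaptive_corr_dist(points, prev_dist, alpha=0.95, k=5):
    """Low-pass filter the per-scan density into the GICP maximum
    correspondence distance (alpha is an assumed smoothing constant)."""
    return alpha * prev_dist + (1.0 - alpha) * scan_density(points, k)
```

Dense indoor scans drive the filtered distance down (tight correspondences, less computation), while sparse outdoor scans drive it up so distant points can still find their true matches.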

### III-F High-Rate State Estimation

DLIO's second component fuses the pose estimate from DLO with high-rate IMU measurements to estimate the robot's full state $\hat{\mathbf{X}}_k$. Sensor fusion is modeled as a maximum a posteriori (MAP) estimation problem with Gaussian noise through a pose graph. A node is added for every pose from DLO, and the graph's output is the robot's full state with pose, velocity, and sensor biases at an output rate matching the LiDAR frequency. Since accurate motion correction and scan-matching priors require the current best estimate of the state, a high-rate propagator is employed that integrates newly acquired linear acceleration and angular velocity measurements starting from the last output of the graph.

#### III-F1 Pose Graph Optimization

DLIO's graph consists of two types of factors used to generate the value $\hat{\mathbf{X}}_k$ which represents the robot's estimated state (Fig. 4). A new node is added to the graph for every acquired LiDAR pose using the loosely-coupled transformation $\hat{\mathbf{T}}_k$ from DLO as the LiDAR odometry factor. For the IMU preintegration factor, we reuse Eq. (6), in which linear acceleration and angular velocity measurements between $t_{k-1}$ and $t_k$ are integrated to obtain the relative pose and velocity of the body motion between two point cloud scans. The resulting relative position, orientation, and velocity are then transformed into $\mathcal{W}$ and added to the graph as the IMU factor. This IMU preintegration factor and the LiDAR odometry factor are then jointly optimized with the IMU biases to solve for the full state $\hat{\mathbf{X}}_k$. To maintain computational tractability, we marginalize the pose graph after a set number of nodes.

#### III-F2 IMU Propagation and Feedback

Since DLIO adds nodes for every iteration of scan-matching, the rate of state estimation output matches that of the LiDAR; however, this rate is not sufficiently high for fast or aggressive motions due to poor motion correction and scan-matching initialization. Therefore, we propagate new IMU measurements atop the pose graph optimization's low-rate output for high-rate, up-to-date state estimation. This is DLIO's final robot state, whose velocity and bias terms are fed back to correct subsequent IMU measurements and used to initialize the next iteration of deskewing and scan-matching. Additionally, for sufficiently high deviation between the factor graph's output and DLO's scan-matching pose, we reinitialize the graph to provide further resiliency against poor sensor fusion. This ensures that the posterior distribution is well-constrained, i.e., cannot grow unbounded from noisy measurements or accumulated numerical errors.
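The deviation check that triggers graph reinitialization might look like the following planar sketch; the threshold values, pose representation, and function name are assumptions, not details given in the paper.

```python
import math

def should_reinitialize(graph_pose, scan_match_pose,
                        trans_thresh=0.5, rot_thresh=0.3):
    """Flag pose-graph reinitialization when the fused pose drifts too far
    from the scan-match pose. Poses are planar (x, y, yaw); thresholds are
    in meters and radians and are assumed values."""
    dx = graph_pose[0] - scan_match_pose[0]
    dy = graph_pose[1] - scan_match_pose[1]
    # wrap the yaw difference into [-pi, pi] before comparing
    dyaw = abs((graph_pose[2] - scan_match_pose[2] + math.pi)
               % (2.0 * math.pi) - math.pi)
    return math.hypot(dx, dy) > trans_thresh or dyaw > rot_thresh
```

When the check fires, the graph is rebuilt from the scan-match pose, so noisy IMU fusion can never push the fused estimate arbitrarily far from the LiDAR-centric solution.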

### III-G Algorithmic Implementation

This section highlights two important implementation details of DLIO, namely sensor time synchronization and resource management for consistent computational load.

#### III-G1 Time Synchronization

Time synchronization is a critical element in odometry algorithms which utilize sensors that have their own internal clock.
This is necessary as it permits time-based data association to temporally align IMU measurements and LiDAR scans.
There are three clock sources in DLIO: one each for the LiDAR, IMU, and processing computer.
Hardware-based time synchronization—where the acquisition of a LiDAR scan is triggered from an external source—is not compatible with existing spinning LiDARs since starting and stopping the rotation assembly can lead to inconsistent data acquisition and timing.
As a result, we developed a software-based approach that compensates for the offset between the LiDAR (IMU) clock and the processing computer clock.
When the first LiDAR (IMU) packet is received, the processing computer records its current clock time $t_0^c$ and the time $t_0^s$ the measurement was taken on the sensor. Then, each subsequent measurement has a time with respect to the processing computer clock given by $t^c = t_0^c + (t^s - t_0^s)$, where $t^s$ is the time the measurement was taken on the sensor.
This approach was found to work well in practice despite its inability to observe the transportation delay of sending the first measurement over the wire.
The satisfactory performance was attributed to using the elapsed *sensor* time to compute the compensated measurement time since a sensor’s clock is generally more accurate than that of the processing computer.
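A minimal sketch of this offset-compensation scheme (the class and method names are ours, not from the paper):

```python
import time

class ClockSync:
    """Map sensor timestamps onto the processing computer's clock using the
    offset recorded when the first packet arrives."""

    def __init__(self):
        self.t0_host = None    # host clock time at first packet
        self.t0_sensor = None  # sensor clock time at first packet

    def to_host_time(self, t_sensor, host_now=None):
        if self.t0_host is None:
            # first packet: record both clocks to fix the offset
            self.t0_host = host_now if host_now is not None else time.time()
            self.t0_sensor = t_sensor
        # elapsed *sensor* time added to the recorded host time
        return self.t0_host + (t_sensor - self.t0_sensor)
```

Because elapsed time is measured on the sensor's (typically more accurate) clock, only the constant transport delay of the first packet remains unobserved.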

#### III-G2 Submap Multithreading

Fast and consistent computation time is essential for ensuring that incoming LiDAR scans are not dropped, especially on resource-constrained platforms. To this end, DLIO offloads work not immediately relevant to the current scan to a separate thread which minimally interferes with its parent thread as it handles further incoming scans. Thus, the main point cloud processing thread has lower, more consistent computation times. The secondary thread builds the local submap kdtree used for scan-matching and builds data structures corresponding to each keyframe which are needed by the submap (see details in [3]). This thread can finish at any time without affecting DLIO’s ability to accept new LiDAR scans. Additionally, it periodically checks if the main thread is running so it can pause itself and free up resources that may be needed by the highly-parallelized scan-matching algorithm. Crucially, the submap changes at a much slower rate than the LiDAR sensor rate, and there is no strict deadline for when the new submap must be built. Therefore, the effect of this thread—which is to occasionally delay the introduction of the new submap by a few scan iterations—has negligible impact on performance.
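A simplified sketch of this background submap-building pattern, using Python threading as a stand-in for the paper's implementation (the sorted list stands in for the kd-tree and keyframe data structures, and the pause/resume cooperation with the scan-matcher is omitted):

```python
import threading

class SubmapBuilder:
    """Build the scan-matching submap on a background thread so the main
    per-scan thread keeps a low, consistent latency (simplified sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._submap = None   # structure currently used by scan-matching
        self._pending = None  # finished result awaiting swap-in
        self._worker = None

    def request_rebuild(self, keyframes):
        # skip the request if a rebuild is already in flight
        if self._worker is not None and self._worker.is_alive():
            return
        self._worker = threading.Thread(
            target=self._build, args=(list(keyframes),), daemon=True)
        self._worker.start()

    def _build(self, keyframes):
        # stand-in for kd-tree construction over concatenated keyframe clouds
        built = sorted(p for kf in keyframes for p in kf)
        with self._lock:
            self._pending = built

    def current_submap(self):
        # the main thread swaps in a finished submap whenever one is ready;
        # at worst the new submap arrives a few scan iterations late
        with self._lock:
            if self._pending is not None:
                self._submap = self._pending
                self._pending = None
            return self._submap
```

The main thread never blocks on submap construction; it simply keeps matching against the old submap until `current_submap` hands it the new one.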

## IV Results

The performance of DLIO was evaluated using data from the KITTI benchmark dataset [6], public datasets provided by LIO-SAM [14], and data self-collected around the UCLA campus using our platform (Fig. 1A). Specific details about each dataset are described in their corresponding sections. We compare accuracy and efficiency against LIO-SAM [14] and FAST-LIO2 [18], two current state-of-the-art LiDAR-inertial odometry algorithms.
Default configuration and parameters for LIO-SAM and FAST-LIO2 at the time of writing were used in all experiments unless otherwise noted (results for FAST-LIO2 incorporate a recent fix that converts the point timestamps of the incoming cloud to the correct units). Point intensity color range was also slightly adjusted to improve visualization. Note that LIO-SAM required a minor modification to work with our 6-axis IMU.
All tests were conducted on an Intel i7-11800H (16 cores @ 2.30GHz) CPU, except for Section IV-E in which data was processed on a Raspberry Pi 4 Model B to showcase the potential of using DLIO on small single-board computers.

### IV-A Motion Correction Experiments

We first demonstrate the accuracy of our proposed point cloud deskewing method through two experiments that involve highly aggressive maneuvers (Fig. 5). In the first experiment, an indoor lab room was mapped with angular velocities reaching up to 360°/s. DLIO is able to compensate for the agile motions, resulting in fine map detail (e.g., ladder edges) and sign legibility. In the second experiment, we used the LIO-SAM Garden dataset, which contained a segment of motions up to 160°/s. DLIO can capture minute branch detail that is otherwise lost with simple or no motion correction.

**TABLE I:** Absolute pose error on KITTI 2011-09-30 Sequence #28.

| Method | Max APE [m] | Mean APE [m] | Std. APE [m] | RMSE [m] |
|---|---|---|---|---|
| LIO-SAM [14] (LC) | 59.87 | 5.73 | 6.29 | 8.51 |
| LIO-SAM [14] (no LC) | 65.16 | 14.67 | 9.35 | 17.40 |
| FAST-LIO2 [18] | 68.77 | 10.82 | 9.04 | 14.10 |
| DLIO | 58.78 | 5.27 | 3.99 | 6.22 |

**TABLE II:** End-to-end translational error [m] (first three data columns) and average computation time [ms] (last three) on the LIO-SAM datasets.

| Method | Garden (m) | Rooftop (m) | Rotation (m) | Garden | Rooftop | Rotation |
|---|---|---|---|---|---|---|
| LIO-SAM [14] (LC) | 0.0396 | 0.0466 | 0.4492 | 27.01 | 29.31 | 16.12 |
| LIO-SAM [14] (no LC) | 0.0413 | 0.0709 | 0.4563 | 26.75 | 29.13 | 16.29 |
| FAST-LIO2 [18] | 0.1375 | 0.2027 | 1.0712 | 14.73 | 17.71 | 12.35 |
| DLIO | 0.0274 | 0.0289 | 0.2414 | 10.85 | 18.66 | 13.79 |

### IV-B KITTI Benchmark Dataset

We evaluated DLIO, FAST-LIO2, and LIO-SAM using data from the KITTI [6] benchmark dataset. Specifically, we used Sequence 28 of the 2011-09-30 collection, a nine-minute traversal over a total trajectory length of 4204.47m. Ground truth was provided by GPS, which was converted into poses and aligned with the corresponding trajectories via evo [7]. Note that motion correction was not possible for any method since the dataset does not contain point-wise timestamps. The results can be seen in Table I, in which DLIO had the lowest root-mean-square error and the smallest max, mean, and standard deviation of the absolute pose error.

### IV-C LIO-SAM Dataset

Additional testing was conducted with the Garden, Rooftop, and Rotation datasets (Fig. 6) from LIO-SAM [14]. These datasets start and stop at the same spatial location, except for the Rotation dataset, which ends nearby but not exactly at the same spot; we included it to demonstrate DLIO's ability to track even through pure rotational movements (see [14] for more details; the corresponding column is not bolded for fairness). Table II provides the quantitative results, comparing the end-to-end error for all algorithms, including LIO-SAM with and without loop closures (LC), in addition to average per-scan time. Trajectory length is indicated in parentheses and was estimated by DLIO for reference.

DLIO provided the smallest end-to-end translational error for all three datasets, with less than 3cm for both the Garden and Rooftop experiments. While LIO-SAM (both with and without loop closures) ended with accuracies similar to DLIO's, its average per-scan computation time was significantly higher than DLIO's—even with feature extraction and voxelization (neither of which DLIO employs)—with an average difference of 9.71ms across all datasets. FAST-LIO2 had faster computation times for Rooftop and Rotation, but it produced substantially greater errors and slightly blurrier maps for all three datasets.

### IV-D UCLA Campus Dataset

Finally, we collected four large-scale datasets around the UCLA campus for additional comparison (Fig. 7). These datasets were gathered by hand-carrying our aerial platform (Fig. 1) over 2261.37m of total trajectory with an average of 565.34m per dataset. Our sensor suite included an Ouster OS1 (10Hz, 32 channels recorded with a 512 horizontal resolution) and a 6-axis InvenSense MPU-6050 IMU located approximately 0.1m below it. We note here that this IMU can be purchased for $10, proving that LiDAR-inertial odometry algorithms need not require expensive IMU sensors that previous works have used. We additionally tested using the Ouster’s internal 6-axis IMU with similar results but have not included them for brevity. For this dataset, we also compared against DLO [3], DLIO’s predecessor, for reference and comparison against a purely frontend solution; this was not possible for previous datasets due to DLO’s initial stationary calibration procedure. End-to-end translational error and average per-scan time for all algorithms across this dataset are reported in Table IV.

**TABLE III:** Raspberry Pi 4 experiments: end-to-end translational error and average per-scan computation time on the LIO-SAM Garden (D1) and UCLA Sculpture Garden (D2) datasets.

| Method | D1 Error [m] | D2 Error [m] | D1 Comp. [ms] | D2 Comp. [ms] |
|---|---|---|---|---|
| LIO-SAM [14] (LC) | 0.0346 | 0.0453 | 605.16 | 509.37 |
| LIO-SAM [14] (no LC) | 0.0342 | 0.0304 | 564.61 | 506.18 |
| FAST-LIO2 [18] | 0.1130 | 0.4200 | 65.91 | 88.83 |
| DLIO (0.25 voxel) | 0.0331 | 0.0155 | 74.24 | 112.02 |
| DLIO (0.50 voxel) | 0.0437 | 0.0208 | 44.53 | 67.16 |

With the exception of one per-scan time, DLIO outperformed all others across the board in both end-to-end translational error and per-scan efficiency for this dataset. On average, DLIO was 49.58% more accurate than LIO-SAM with loop closures, 62.59% more accurate than LIO-SAM without loop closures, and 85.04% more accurate than FAST-LIO2. Regarding average processing time, DLIO was on average 53.02% faster than LIO-SAM with loop closures, 55.18% faster than LIO-SAM without loop closures, and 2.54% faster than FAST-LIO2. DLIO’s resulting maps also had a higher density of points that captured fine details in the environment; this can provide more environmental information (i.e., traversability) for autonomous mobile robots. The advantages of DLIO over its predecessor, DLO, are also clear: state estimation is more reliable (e.g., column B) and maps contain much higher fidelity than before while at a lower overall computational overhead.

**TABLE IV:** End-to-end translational error [m] and average per-scan computation time [ms] on the four UCLA campus datasets (A–D).

| Method | A Err. | B Err. | C Err. | D Err. | A Time | B Time | C Time | D Time |
|---|---|---|---|---|---|---|---|---|
| LIO-SAM [14] (LC) | 0.0216 | 0.0692 | 0.0936 | 0.0249 | 33.21 | 29.14 | 39.04 | 48.94 |
| LIO-SAM [14] (no LC) | 0.0283 | 0.1185 | 0.1312 | 0.0285 | 33.82 | 28.83 | 40.76 | 48.89 |
| FAST-LIO2 [18] | 0.2877 | 0.1284 | 0.0950 | 0.3502 | 16.69 | 17.62 | 17.21 | 19.05 |
| DLO [3] | 0.0216 | 1.2932 | 0.0375 | 0.0178 | 20.40 | 20.77 | 21.18 | 21.62 |
| DLIO | 0.0197 | 0.0285 | 0.0265 | 0.0102 | 17.14 | 15.95 | 16.78 | 18.88 |

### IV-E Raspberry Pi Experiments

Tests on a Raspberry Pi 4 Model B (4-core, 8 GB) were conducted to demonstrate DLIO’s potential for running on lightweight and low-power single-board computers using the LIO-SAM Garden (D1) and the UCLA Sculpture Garden (D2) datasets (Table III). In these tests, DLIO was configured with more reasonable parameters for a hobby-grade processor. Specifically, point clouds were voxelized at 0.25m and 0.50m leaf sizes, and a smaller local submap and slightly relaxed convergence criteria for GICP were used. We observed long computation times from LIO-SAM leading to many dropped frames, and relatively poor accuracy from FAST-LIO2. DLIO dropped very few frames from the 10Hz LiDAR resulting in more accurate estimates and maps.

## V Conclusion

This work presented the Direct LiDAR-Inertial Odometry (DLIO) algorithm, a highly reliable LiDAR-centric odometry framework that yields accurate state estimates and detailed maps in real-time for resource-constrained mobile robots. The key innovation that distinguishes DLIO from others is its hybrid LO/LIO architecture that inherits the desirable attributes of both loosely-coupled and tightly-coupled approaches. This architecture also enables the use of a high-fidelity motion model for point cloud deskewing and a comprehensive scan-matching prior to produce a robust odometry algorithm. Experimental results demonstrated the improved accuracy, map clarity, and computational efficiency compared to the state-of-the-art using both consumer and hobby-grade hardware.
Future work includes closed-loop flight tests and adding loop closures.

Acknowledgements: The authors would like to thank Helene Levy for their help with data collection.

## References

- [1] (1992) Method for registration of 3-D shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, Vol. 1611, pp. 586–606.
- [2] (2016) Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Transactions on Robotics 32 (6), pp. 1309–1332.
- [3] (2022) Direct LiDAR odometry: fast localization with dense point clouds. IEEE Robotics and Automation Letters 7 (2), pp. 2000–2007.
- [4] (2021) LiDAR scan registration robust to extreme motions. In Conference on Robots and Vision.
- [5] (2016) On-manifold preintegration for real-time visual–inertial odometry. IEEE Transactions on Robotics 33 (1), pp. 1–21.
- [6] (2013) Vision meets robotics: the KITTI dataset. The International Journal of Robotics Research 32 (11), pp. 1231–1237.
- [7] (2017) evo: Python package for the evaluation of odometry and SLAM. https://github.com/MichaelGrupp/evo
- [8] (2012) iSAM2: incremental smoothing and mapping using the Bayes tree. The International Journal of Robotics Research 31 (2), pp. 216–235.
- [9] (2021) MILIOM: tightly coupled multi-input LiDAR-inertia odometry and mapping. IEEE Robotics and Automation Letters 6 (3), pp. 5573–5580.
- [10] (2020) LOCUS: a multi-sensor LiDAR-centric solution for high-precision odometry and 3D mapping in real-time. IEEE Robotics and Automation Letters 6 (2).
- [11] (2021) MULLS: versatile LiDAR SLAM via multi-metric linear least square. In IEEE International Conference on Robotics and Automation, pp. 11633–11640.
- [12] (2020) Increased accuracy for fast moving LiDARs: correction of distorted point clouds. In IEEE International Instrumentation and Measurement Technology Conference.
- [13] (2009) Generalized-ICP. In Robotics: Science and Systems, Vol. 2, pp. 435.
- [14] (2020) LIO-SAM: tightly-coupled LiDAR inertial odometry via smoothing and mapping. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5135–5142.
- [15] (2021) LVI-SAM: tightly-coupled LiDAR-visual-inertial odometry via smoothing and mapping. In IEEE International Conference on Robotics and Automation, pp. 5692–5698.
- [16] (2018) LeGO-LOAM: lightweight and ground-optimized LiDAR odometry and mapping on variable terrain. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4758–4765.
- [17] (2020) LION: LiDAR-inertial observability-aware navigator for vision-denied environments. In International Symposium on Experimental Robotics, pp. 380–390.
- [18] (2022) FAST-LIO2: fast direct LiDAR-inertial odometry. IEEE Transactions on Robotics.
- [19] (2021) FAST-LIO: a fast, robust LiDAR-inertial odometry package by tightly-coupled iterated Kalman filter. IEEE Robotics and Automation Letters 6 (2), pp. 3317–3324.
- [20] (2019) Tightly coupled 3D LiDAR inertial odometry and mapping. In International Conference on Robotics and Automation, pp. 3144–3150.
- [21] (2014) LOAM: LiDAR odometry and mapping in real-time. In Robotics: Science and Systems, Vol. 2, pp. 1–9.