UAVs have become a valuable platform for specific tasks such as inspection, mobile manipulation, surveillance, aerial photography and mapping due to their dexterity while flying [kocer2019inspection, mahmoud2014linear]. In 2017, the Land Transport Authority (LTA) in Singapore has researched intensively on the use of UAV technology for more efficient and flexible tunnel inspections [lta_drones]. To perform these tasks, it is possible to leverage model-based [kocer2019aerial] and/or model-free control approaches [hady2019real]. However, the localization plays a more crucial role in determining the current position and the orientation of the aerial robot with respect to a reference frame. In recent years, various algorithmic approaches have been presented for the mobile robots’ localization, combining different sensor configurations, such as visual-based localization such as a stereo camera and optical flow sensor, laser-based localization and GPS [Skog2016]. Outdoor localization can be easily achieved using GPS while indoor localization mainly relies on a stereo camera, ultra-wideband technology, radio waves or beacons configuration [hening20173d, zhen2019estimating]. However, these methods have difficulties for navigation in a tunnel environment due to the following reasons: poor light condition, featureless environment, echo interference, and GPS-denied.
For a potential localization in a tunnel environment, simultaneous localization and mapping (SLAM) technique can be a viable option. Some available approaches aim to explore the use of thermal and visual camera information [khattak2019visual]. With an onboard illumination and semi-autonomous setting, the use of laser scanners together with cameras are used in [ozaslan2017autonomous]. In a similar setting, the combination of stereo cameras and laser scanners is proposed recently in [quenzel2019autonomous] for a chimney inspection. Specifically, SLAM using cameras is referred to visual-based SLAM (vSLAM) which is based on visual information only while SLAM using the LIDAR sensor is referred to as laser-based SLAM which relies on laser scan information [Taketomi2017]
. The laser-based SLAM technique might be superior to vSLAM in an indoor environment (e.g. deep tunnel system) where the ambient light condition is not optimum. Hence, this paper presents a comparison of potential laser-based SLAM techniques considering our desired tolerance of 20cm with a two-dimensional (2D) LIDAR sensor as the main perception input. For the vertical axis, the system is endowed by TFMini, which is a 1D Lidar sensor. An onboard IMU is used to further improve both reliability and accuracy of pose estimation[henawy2019accurate] by eliminating unusable laser scan (caused by rolling and pitching).
Three potential approaches, namely Hector SLAM, Gmapping, and Cartographer, are implemented and configured to be tested on the UAV platform. The mapping and localization performances are then compared with ground truth data in the motion capture lab, and the pose estimation error of both approaches are evaluated and discussed in the paper. The evaluations from this study might potentially determine which approach is more suitable for aerial robot localization in an indoor environment such as a deep tunnel system.
Ii-a Localization and Mapping
Localization of a mobile robot is required to determine the pose information with various sensor configurations within an environment based on an algorithm. For example, LIDARs, ultrasonic sensors, stereo cameras are common configurations for the localization. For autonomous systems, SLAM is introduced as the most widely researched topic. It can be useful for creating and updating maps within an unknown environment, while keep tracking the position of the mobile robot instantaneously. Therefore, there has been extensive research into SLAM algorithms, with reliably working solutions for typical indoor and outdoor environments using particle filters as in Gmapping. Most of them are being available as open software for individual and collaborative development [Santos2013, Li2014, Kohlbrecher2011, Tardioli2014, Nikolov2017, Peel2018].
Ii-B LIDAR Selection
LIDAR is a remote sensing method that uses light in the form of a pulsed laser beam to measure ranges (variable distance). The capabilities of the LIDAR sensor are critical in our project since it serves as the main sensing unit for the localization algorithm. Therefore, some potential LIDAR sensors are shortlisted in Table I, presenting their characteristics including detecting range, the field of view (FoV), and scanning frequency.
|Model||Range (m)||FoV (deg)||Frequency (Hz)|
|RPLidar A1||[0.15, 12]||360||10|
|RPLidar A2||[0.20, 12]||360||10|
|Hokuyo URG-04LX||[0.06, 4]||240||10|
|Hokuyo UTM-30LX||[0.10, 30]||270||36|
In this study, an omnidirectional laser scan is desired to detect the features in a tunnel environment. The detecting range of the sensor must be able to reach the width of the sewerage tunnel with 6m wide, similar to the case in [tan2018smart]. After some comparisons and considerations, RPLidar A1 was selected because it is a low-cost 360-degree laser scanner with a scanning rate of 10Hz.
Iii Potential 2D SLAM Techniques
This section discusses the characteristic of chosen SLAM techniques, the configuration and fine-tuning of them to perform seamlessly with our LIDAR platform. To accomplish this goal, a personal laptop is used to perform software implementation, with the following specifications: (i) Intel Core i5-4210U CPU@1.7GHz quad-core; (ii) 8GB RAM; and (iii) NVIDIA 840M GPU.
Iii-a Hector SLAM
Hector SLAM111www.ros.org/wiki/hector_slam incorporates with 2D LIDAR sensor to generate a map from the laser scan. In contrast to other SLAM techniques (e.g. Gmapping), Hector SLAM does not require any auxiliary odometry sensor (e.g. wheel encoders) which directly measures the travel distance of a land-based robot, but only relies on the information from the laser scan matching approaches. Therefore, the Hector SLAM is more suitable for aerial vehicles. The Hector SLAM takes advantage of the low distance measurement noise and high sampling rates of LIDAR for a fast scan-matching method [Kohlbrecher2011]. Another advantage of the Hector SLAM is its capability to generate multi-resolution grid maps to avoid singularity during scan matching.
A map can be generated by the Hector SLAM according to the endpoints of the laser beams hit onto the walls. Then, the transformation of the current scan is determined by the Gauss-Newton approach, which finds the best alignment of the current scan to the map generated previously.
Gmapping222www.ros.org/wiki/gmapping is a laser-based SLAM algorithm, which uses a Rao-Blackwellized Particle Filter SLAM approach. It is one of the most widely used SLAM methods in robotics, especially for land-based mobile robots. In general, the particle filter family of the algorithm requires high sampling particles to obtain accurate results, therefore it might have relatively increased computational complexity. Also, the depletion problem associated with this method decreases the algorithm accuracy. This arises when the elimination of a large number of particles from a sample set during the resampling step. In this context, an adaptive resampling technique has been developed to minimize the depletion problem since the resampling process is only performed when it is required. Moreover, this approach takes into account not only the movement of the mobile robot but also the most recent sensor observation with odometry information; therefore decreasing the uncertainty for the robot’s pose in the particle filter’s prediction step. As a result, the number of particles required is significantly reduced since the uncertainty is lower, due to the quality of the laser scan matching process. In our experiment, the number of particles used is set to the default value of 30.
is an active approach that provides real-time SLAM in 2D and 3D across multiple platforms and sensor configurations. It is an open-source library, developed by Google since 2016, which is also a state of art algorithm. Worth to mention, Google Cartographer does not require a particle filter algorithm for mapping. It overcomes the issue of error accumulation during long iterations by pose estimation against a recent submap.
In 2D SLAM, the Cartographer supports running the correlative scan matcher, which is used for finding loop closure constraints with a submap (at the best-estimated position) referred to as frames. In detail, scan matching occurs at a recent submap, therefore it only depends on the recent scans. After each submap is finished, there are no longer new scans that could be inserted; it automatically checks all submaps and scans for the loop closure. A scan matcher starts to find the scan in the submap if the scans and the submaps are close enough based on the current pose estimates [Yagfarov2018].
The conversion process from a scan into a submap is given in [Hess2016]
. The generated submaps are presented in the form of a probability grid point which contains all the endpoints of beams that are closest to that grid point. Whenever a scan is inserted into the probability grid; the hits and misses are computed. Cartographer uses the Ceres scan matching approach to increase the accuracy of the scan pose in the submap.
Iv Results and Discussion
All three potential SLAM techniques are implemented and tested through offline experiments. A hand-held experimental platform of the LIDAR sensor and IMU device is designed and the data collection is conducted to obtain the results for each approach. The indoor experiments are conducted in the Motion Analysis Laboratory which equipped with OptiTrack Motion Capture (Mocap) systems that allow vehicles to navigate with a less than centimeter accuracy. Mocap uses 8 off-board cameras to identify vehicle pose information (position and attitude) in a 3D space. It is an external system that tracks the position of reflective markers and provides its pose at 240 Hz. The ground truth data are received and recorded for the comparison.
There are two trajectories with two different speeds (normal walking speed and fast walking speed), with the starting point set as the origin of the coordinates system (center location), as described below.
Iv-1 Straight line trajectories
Starting from the origin, moving straight and then follow the rectangular path until it ends at the starting point. The heading of the LIDAR sensor is purposely remained facing forward (X-direction), thus avoid the potential noise due to the large yawing angle.
Iv-2 8-shape trajectories
Starting from the origin, moving along figure 8 with the heading aligned with the trajectory, ending at the starting point.
Iv-a Performance Analysis of Selected SLAM Approaches
A group of typical localization results in the indoor experiment is illustrated from the following aspects: (i) robot path; and (ii) yaw angle.
Iv-A1 Normal moving speed
Firstly, the results from the first case are illustrated in Fig. 1 and Fig. 2. All trajectories are plotted on the same graph to have a clearer comparison of the selected algorithms. In Fig. 1, the localization results from the Hector SLAM and Gmapping are comparably smoother than the outcome of the Cartographer which shows fluctuation and jerkiness. This result is possibly due to the characteristics of the Cartographer that fuses multiple sensor data, but there are only LiDAR sensors in our case. On the other hand, it is difficult to conclude that either the Hector SLAM or the Gmapping is more accurate only by this test.
According to our observation, the localization performance of the laser-based algorithm is affected by three variables, namely: the performance of the LIDAR sensor, the feature extraction within the environment and the matching algorithm.
From Fig. 2, a significant time delay is observed in the Gmapping algorithm, which also exhibits slightly discrete movement. This behavior is mainly because of the slower updating frequency of the Gmapping algorithm as compared to others. On the other hand, both Hector SLAM and Cartographer give reliable positioning results, but the latter algorithm shows fluctuation during each peak.
In the second case, the circular path is used to test the robustness of each SLAM technique when facing large rotating in the yaw angle. The results are shown in Fig. 3 and Fig. 4. Once again, Hector SLAM had better results as compared to the Cartographer which is wavy and the Gmapping which has a high time delay. Notably, the time shift of the Gmapping could be larger when the simulation time increases. We can also observe that there are two sudden peaks of the yaw estimation from both the Gmapping and the Cartographer packages.
Iv-A2 Fast moving speed
A similar experiment is conducted with a faster speed to test the robustness of each algorithm. As can be seen in Fig. 5, the Gmapping is not functioning properly. Hector SLAM has the best pose estimation where else the Cartographer generated a wobbly path. In Fig. 6, the ground-truth data update stops at some instants in the Mocap system due to the communication problems.
During the faster circular trajectory shown in Fig. 7, the Gmapping has failed to perform SLAM properly while the Cartographer generated a fluctuated data. Only the Hector SLAM’s trajectory has the closest trajectory as compared to the ground truth values.
Notably, the Cartographer experienced a sudden spike twice in determining the heading of our robot when the system rotates -120 degrees of yaw, shown in Fig. 8 (green line). There are also some stationary instants from ground-truth value during the experiment, this might be due to the blockade of reflective markers during the operation.
Iv-B Further Analysis of SLAM Performance
To further analyze the results, we use the root mean square formula (Eq. 1) to determine the accuracy of each approach compared to the ground truth.
where is the true displacement and is the estimated displacement.
To calculate the displacement between the true pose and the estimated pose:
where the subscript denotes the truth data, and the subscript denotes the estimated data. Henceforth, some conclusions on each technique are shown in Table II.
|Approach||Linear Nominal||Circular Nominal||Linear Fast||Circular Fast|
Firstly, the Hector SLAM algorithm relies only upon the laser scan matching without the use of odometry, which could be an advantage for our aerial robot platform. Apart from that, the Hector SLAM also provides accurate pose estimation, with an average error of 15.09cm in translation after taking average RMSE of all scenarios. On the other hand, the Gmapping shows its robustness only in slow motion situations, but the time delay accumulated over time. Most importantly, the Gmapping is malfunctioning in faster movements. Lastly, the Cartographer achieves the fastest computation time and had a decent average accuracy of 18.04cm in translation. Worth to mention, the Cartographer is designed for multiple sensors platform, therefore it is expected to obtain more accurate results in a multi-sensor based system.
Iv-C On-board Experiments
From the previous off-board experiments, the Hector SLAM is selected as an optimum SLAM algorithm considering our limitations for the sensor instrumentation. Except for 2D Lidar and onboard IMU, an additional distance sensor is needed to provide information for the altitude. Since it is tiny, low cost and consumes low power with a detecting range of 0.30m - 12m, TFmini Lidar is selected. It is configured with QGroundControl. Our UAV system is shown in Fig. 9, where Intel NUC serves as an on-board processing unit which receives laser scan data and utilizes the Hector SLAM algorithm meanwhile fusing the altitude (from TFmini) to generate real-time 6 DoF position information.
During real-time experiments, 4 scenarios are carried out:
Normal/fast speed circle path;
Normal/fast speed ‘8’ path.
The aerial robot is controlled via ROS for the desired trajectories. All the necessary nodes are running onboard. The recorded data are extracted and the displacement is calculated in 3D space, followed by the RMSE formula.
A summary of RMS errors in 3D translations for all cases is shown in Table III. Since the generated paths are similar to each other, only the 3D visualization of the normal speed circle path is given in Fig. 10.
Notably, the detecting range of TFmini is only from 0.30 to 12m, therefore any distance below 30cm will be considered as a minimum value of 0.30m. Since the placement of TFmini was 8cm offset below the CG of the drone, the effective detecting range of altitude is 0.38m to 12.08m. Henceforth, the limitation of this fusion method is the blind zone when altitude is under 38cm.
In summary, the 3D pose information is obtained with the fusion of the Hector SLAM and the TFmini sensor. According to the ground truth data, the RMSE has stayed below 20cm for all the cases.
In this work, three selected approaches were implemented on the aerial robot system endowed by a 2D Lidar for horizontal axes and 1D Lidar for the vertical axis. A set of offline experiments were conducted in different velocities and conditions to measure their tracking and mapping performances. From the results, it was concluded that the Hector SLAM package obtained a reasonable localization performance. Also, the on-board experiments showed that it was achieved to keep the RMSE below 20cm in 3D translation. At the same time, the Cartographer package is also preferable due to its potential with fusing different perception units.
In our future work, we intend to carry out the experiments in a real tunnel environment to obtain the actual environmental conditions. Some difficulties can be expected, for example, the rough surface and the humidity within a deep tunnel might not be optimum for a laser sensor. As a solution, the robustness of the laser-based algorithm can be improved by fusing multiple SLAM algorithms such as a visual-based SLAM that using a stereo camera to detect the features within an environment. This would potentially achieve the goal of autonomous UAV inspection in a deep tunnel system without human intervention. Moreover, different lidars (e.g., Velodyne and Hokuyo) are considered to be used in our research for further improvement.