A*SLAM: A Dual Fisheye Stereo Edge SLAM

by Guoxuan Zhang, et al.

This paper proposes the A*SLAM system, which combines two sets of fisheye stereo cameras and uses image edges as the SLAM features. The dual fisheye stereo camera sets cover the full environmental view around the SLAM system. From each fisheye stereo image pair, a panorama depth image can be extracted directly for initializing the SLAM features. Edges are illumination invariant features. The paper presents an edge-based simultaneous localization and mapping process that uses both normal and inverted images interchangeably.




I Introduction

In a complicated environment, every self-driving mobile system needs to be equipped with vision sensors. Therefore, if we can implement a SLAM and navigation system that depends purely on vision sensors, we can save a large amount of cost by excluding the expensive laser and/or LiDAR sensors.

A SLAM system is not only for mapping and localization; it also needs to support robust navigation. In practical applications, the question is no longer whether SLAM can be done but whether it can be done robustly. This short paper poses the question of which components are necessary for a robust visual SLAM system, and then introduces the A*SLAM system, which meets all of these conditions.

Fig. 1: (a) In a navigation scenario, the robot fails to verify the planned path because of a limited camera FoV. (b) A full surrounding view provided by a dual fisheye camera configuration.

II Requisites for Robust Visual SLAM

A robust visual SLAM system needs to satisfy at least the following four conditions.

II-A Stereo Vision

A monocular SLAM system can only build a map up to an unknown scale. Such a map is insufficient for a mobile system to perform meaningful work in the physical world, because motion control relies on metric information and movements are expressed in metric units. Stereo vision adds metric measurements to the map through the known baseline distance between the two cameras.

Another benefit of stereo vision is that a depth value can be estimated for each pixel by fast image matching along the predefined epipolar line. The depth information can be used for initializing the SLAM features, and also for obstacle detection in a navigation system.
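The metric role of the baseline can be illustrated with the standard relation for a rectified stereo pair, Z = f B / d. The sketch below assumes a rectified pinhole model; the paper does not state how depth is computed from the fisheye pair, and the focal length, baseline, and disparity values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Metric depth Z = f * B / d for a rectified (pinhole) stereo pair.

    This is the textbook relation, not the paper's fisheye pipeline.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 700 px focal length, 0.12 m baseline.
FOCAL_PX = 700.0
BASELINE_M = 0.12

for disparity in (70.0, 35.0, 7.0):
    depth = depth_from_disparity(disparity, FOCAL_PX, BASELINE_M)
    print(f"disparity {disparity:5.1f} px -> depth {depth:5.2f} m")
```

Because the baseline is known in meters, the depth is metric. With a single camera the same relation leaves the scale undetermined.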

Fig. 2: To map an area between A and B, a partial view camera needs to travel the area twice to form a complete map. When the robot makes a turn at the midpoint C, the map switching tends to fail due to the mapping error.

II-B Wide Field of View

A wide field of view (FoV) makes it possible to integrate more spatial information into the SLAM computation. Seeing a wider area usually improves the accuracy and robustness of feature tracking. Furthermore, SLAM features near the outer FoV converge easily because they benefit from the large parallax available in that area.
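The parallax argument can be checked with a small geometric sketch. It assumes the camera translates along its optical axis (an assumption made here for illustration, not stated in the paper) and computes the angle between the two viewing rays to a point at a given bearing. The function name and the numbers are hypothetical.

```python
import math


def parallax_deg(bearing_deg, distance_m, step_m):
    """Angle between the rays to one point before and after a forward step.

    The camera moves step_m along its optical axis (the x axis here). The
    point lies distance_m away at bearing_deg from that axis.
    """
    theta = math.radians(bearing_deg)
    # Ray from the first camera position to the point.
    px, py = distance_m * math.cos(theta), distance_m * math.sin(theta)
    # Ray from the second camera position (shifted forward) to the same point.
    qx, qy = px - step_m, py
    cross = px * qy - py * qx
    dot = px * qx + py * qy
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical scene: point 5 m away, camera advances 0.5 m.
for bearing in (0, 30, 60, 90):
    print(f"bearing {bearing:2d} deg -> parallax {parallax_deg(bearing, 5.0, 0.5):.2f} deg")
```

Under this assumption a point straight ahead produces no parallax, while a point at 90 degrees produces the largest parallax of the sampled bearings, which is consistent with the faster convergence of features near the outer FoV.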

In a navigation scenario, a limited FoV degrades the path planning performance. Figure 1(a) shows a robot (blue) blocked by an obstacle (gray). The robot has planned a path (red), but because the camera FoV (green) is not wide enough to verify the planned path, the robot cannot convert this path into motion immediately.

Adopting a fisheye camera is considered the simplest way to provide a wide FoV to a visual SLAM system.
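One way to see why a fisheye lens suits a wide FoV is to compare how far from the image center a ray lands under two projection models. The sketch below assumes the equidistant fisheye model r = f θ, a common idealization. The paper does not specify the lens model, and the focal length is hypothetical.

```python
import math


def pinhole_radius(theta_deg, focal_px):
    """Image radius under the pinhole model r = f * tan(theta)."""
    if theta_deg >= 90.0:
        return math.inf  # rays at or beyond 90 degrees never hit the image plane
    return focal_px * math.tan(math.radians(theta_deg))


def equidistant_radius(theta_deg, focal_px):
    """Image radius under the equidistant fisheye model r = f * theta."""
    return focal_px * math.radians(theta_deg)


FOCAL_PX = 300.0  # hypothetical focal length in pixels

for angle in (30, 60, 85, 90):
    print(
        f"{angle:2d} deg off-axis: pinhole {pinhole_radius(angle, FOCAL_PX):8.1f} px, "
        f"equidistant {equidistant_radius(angle, FOCAL_PX):6.1f} px"
    )
```

The pinhole radius grows without bound as the ray approaches 90 degrees off-axis, whereas the equidistant radius stays finite, so a full 180-degree view fits on a finite sensor.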

II-C Full View

The concept of full view is an extension of wide FoV, but it requires more than one camera. Figure 1(b) shows a full surrounding view configuration using two fisheye cameras. Compared to a partial camera view, the full view provides even more robustness to the SLAM system. In a full view system, occlusion is a much less serious problem than in a single-camera system.

Beyond the above-mentioned advantages, the main reason for using the full view configuration is that it can generate a complete map. A complete map means that a single scan gathers all 360-degree information into the map, so the same area does not need any further scan. As shown in Fig. 2, to map an area between A and B, a partial view camera needs to scan the area twice. After mapping, there exist two independent maps, A-B and B-A. When a robot starts from A and then makes a turn at the midpoint C, there is no guarantee that the robot can smoothly switch from map A-B to map B-A, due to the inevitable mapping error. For practical use, all visual SLAM systems should adopt the full view configuration in the future.
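The coverage claim can be made concrete with a minimal sketch that samples horizontal bearings around the robot and checks which ones fall inside at least one camera's FoV. The camera layouts and helper names below are illustrative assumptions, and only the horizontal plane is considered.

```python
def angular_diff_deg(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def is_visible(bearing_deg, cameras):
    """cameras is a list of (yaw_deg, fov_deg) tuples in the robot frame."""
    return any(
        angular_diff_deg(bearing_deg, yaw) <= fov / 2.0 for yaw, fov in cameras
    )


def coverage_fraction(cameras, step_deg=1):
    """Fraction of sampled horizontal bearings seen by at least one camera."""
    bearings = range(0, 360, step_deg)
    seen = sum(is_visible(b, cameras) for b in bearings)
    return seen / len(bearings)


# Hypothetical layouts: one 90-degree camera, one front fisheye, and a
# front plus back pair of 180-degree fisheye cameras.
narrow_front = [(0.0, 90.0)]
single_fisheye = [(0.0, 180.0)]
dual_fisheye = [(0.0, 180.0), (180.0, 180.0)]

for name, layout in (
    ("narrow front", narrow_front),
    ("single fisheye", single_fisheye),
    ("dual fisheye", dual_fisheye),
):
    print(f"{name:14s}: {coverage_fraction(layout):.2f} of the horizon covered")
```

Only the front plus back pair sees every sampled bearing, which is the property that lets a single pass record the scene in both travel directions.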

Fig. 3: Edges are illumination invariant features. In this image, the edges from the object contours preserve the shape information even under severe lighting changes (©matel.tv).

Fig. 4: A panorama depth image can be directly extracted from a pair of fisheye stereo images. The depth information is used for initializing the SLAM features and for obstacle detection in a navigation system.

II-D Illumination Invariant Feature

A SLAM map is static: it captures only the specific lighting condition present when the map was built. By comparison, the environment is dynamic, and its visual appearance changes frequently with lighting, weather, and season. Existing popular SLAM methods, such as [1] and [2], all use pixel values to compose the SLAM features. Accordingly, the performance of tracking the SLAM features degrades greatly when the lighting condition differs from that at registration time.

Edges are illumination invariant features. They are formed by abrupt color transitions on a surface or by the intersection of heterogeneous geometric structures in 3D space. As shown in Fig. 3, edges can be robustly detected at the same location even under severe changes in lighting conditions. Treating edges as purely geometric entities, without the support of pixel values, is a very challenging problem in the SLAM community. We are proud to announce that the problem of using edges as SLAM features is now solved.
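The invariance can be illustrated on a one-dimensional intensity profile. The sketch below is a toy step detector with a threshold relative to the strongest step. It is not the edge extractor used in A*SLAM, and it assumes the lighting change is a global gain and offset, which real scenes only approximate. Image inversion is the special case of gain -1.

```python
def edge_locations(row, rel_threshold=0.5):
    """Indices i where |row[i+1] - row[i]| reaches rel_threshold of the strongest step."""
    steps = [abs(b - a) for a, b in zip(row, row[1:])]
    strongest = max(steps)
    if strongest == 0:
        return []  # a flat profile has no edges
    return [i for i, s in enumerate(steps) if s >= rel_threshold * strongest]


# Hypothetical scanline with two intensity transitions.
row = [10, 10, 10, 200, 200, 200, 60, 60]
dimmed = [0.25 * v + 5 for v in row]   # darker scene: gain 0.25, offset 5
inverted = [255 - v for v in row]      # inverted image: gain -1, offset 255

print(edge_locations(row))
print(edge_locations(dimmed))
print(edge_locations(inverted))
```

The pixel values differ in all three profiles, but the detected edge positions are the same, so a map built from edge geometry alone does not depend on the absolute intensities.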

III A*SLAM

The A*SLAM system satisfies all four previously described conditions of a robust visual SLAM system. It combines two sets of fisheye stereo cameras and uses image edges as the SLAM features.

Of the two stereo camera sets, one looks forward and the other looks backward. Each stereo camera is equipped with a pair of 180-degree fisheye lenses. In this way, the A*SLAM system is able to cover a full environmental view. The fisheye images can also be used to generate a wide-angle depth image. Figure 4 shows an example of using our CaliCam® [3] stereo camera to generate a panorama depth image [4].

By abstracting image edges as purely geometric entities, the A*SLAM system enables a simultaneous localization and mapping process that uses both normal and inverted images interchangeably, as shown in Fig. 5 and in a demonstration video clip [5].

IV Conclusion

To the best of our knowledge, A*SLAM is the only system in the world that meets all of the challenging requisites of a robust visual SLAM system. We are now actively looking for potential domains in which to apply our robust A*SLAM system to maximize SLAM and navigation performance.

Fig. 5: The bottom left part shows two inverted images taken from the front camera and the back camera, respectively. The localization and mapping process can be conducted with both the normal and inverted images interchangeably in the A*SLAM system.