Micro Aerial Vehicles (MAVs) are being deployed in a variety of application domains, such as search and rescue, industrial inspection, and cinematography. In particular, these applications require MAVs to safely navigate and avoid obstacles in their environments. Successful autonomous navigation of MAVs using on-board sensors has been demonstrated in static environments [8, 24, 5] or in controlled dynamic environments with an overhead motion capture system [27, 28].
The presence of moving obstacles requires a fast and efficient obstacle detection and tracking strategy to perform obstacle avoidance in real time. Moreover, obstacle sensing and MAV state estimation uncertainties should be accounted for to achieve robust collision avoidance. In this paper, we present an on-board vision-based approach for robust navigation of MAVs in dynamic environments. Our approach builds upon, and extends, a vision-based obstacle detection and tracking algorithm and a model predictive controller (MPC) to generate feasible and probabilistically safe trajectories for the MAV.
I-A Related Work
There has been a large amount of work on vision-based autonomous navigation for MAVs in unknown environments. Some early works demonstrate autonomous navigation by abstracting simple obstacle representations from camera images and adopting reactive heuristic collision avoidance techniques. One approach maps obstacles as cylinders by detecting and tracking features of interest from images, then computes a reactive acceleration for collision avoidance. Similarly, another approach uses a monocular camera to avoid frontal obstacles by detecting relative size changes of image features and computing an avoidance velocity. However, image processing in those approaches is computationally heavy and requires an off-board ground computer. In other work, the authors filter 3D point clouds from depth images into planes, which are then used to compute the optimal collision avoidance direction with maximal open path length. In a further work, the authors segment obstacles, modelled as ellipses, from a disparity map and plan a set of waypoints along the edges of obstacles using a heuristic collision-checking algorithm. While these two approaches are shown to be fast, robot dynamics are not taken into account in the planner.
Recent works mainly rely on a similar pipeline in which an environment map is built from image data and then used to plan collision-free motions for MAVs. In one work, the authors incrementally build a 3D global occupancy map on-board the MAV and use the VFH+ algorithm for collision avoidance. Recent advances have led to map representations more efficient than the occupancy map, including the ESDF map, the k-d tree, the NanoMap, the FIESTA map, etc. After building those maps, two main categories of methods are used to plan collision-free motions. One is to use a library of pre-computed motion primitives or funnels and choose the best one from the library via collision checking. The other is to construct a collision-free flight corridor based on a planned path obtained from a discrete planner such as A* or JPS, followed by trajectory optimization to generate dynamically feasible trajectories for the MAV [20, 9]. While those approaches have demonstrated successful navigation of a MAV in a variety of environments, a common limitation is that they all assume static environments without moving obstacles. Moreover, obstacle sensing uncertainty and MAV state estimation uncertainty are generally neglected.
The main contribution of this paper is an integrated system for collision avoidance of moving obstacles in dynamic environments. Obstacles are detected and their position, velocity and size are estimated from depth images and the generated U-depth maps (Section III). Chance-constraints are then formulated to account for the measured MAV state estimation and obstacle sensing uncertainty. These chance constraints are integrated in a model predictive controller to generate dynamically feasible and robust trajectories that keep the probability of collision below a specified threshold (Section IV). Finally, we demonstrate the system in real-world experiments to show its effectiveness (Section V).
II System Overview
Given a goal point, the MAV is required to plan and execute safe collision-free trajectories to navigate towards the goal while avoiding moving obstacles in the environment, based on depth images from its on-board stereo camera.
Fig. 2 illustrates the proposed system for solving the problem, which consists of three main components: MAV state estimation, obstacle sensing and collision-free trajectory planning. In this paper, we focus on the last two components to achieve robust collision avoidance of moving obstacles. For state estimation, we rely on a visual-inertial odometry (VIO) method to obtain the MAV pose and its associated uncertainty. For obstacle sensing, we model obstacles as three-dimensional ellipsoids and adopt an efficient obstacle detection and tracking approach based on depth images to obtain the obstacle size, position, velocity and associated uncertainty. For collision-free trajectory planning, we formulate a chance-constrained model predictive controller (CC-MPC), taking into account the MAV state estimation and obstacle sensing uncertainty. The CC-MPC ensures that the collision probability between the MAV and each obstacle is below a specified threshold.
III Obstacle Detection and Tracking
In this section we describe our obstacle detection and tracking algorithm using depth images, as shown in Fig. 3. The algorithm is built on a previous approach in which planar static obstacles are considered; we extend it to three-dimensional scenarios with moving obstacles.
III-A Obstacle Detection using Depth Images
We model each obstacle as a three-dimensional box with a yaw angle such that it faces the camera of the MAV (Fig. 3a), and detect its position and size (length/thickness, width and height) from the camera depth image. The length/thickness, width and horizontal position are first derived from the U-depth map (Fig. 3c), and the height and vertical position of the obstacle are then derived directly from the depth image (Fig. 3b).
III-A1 U-depth map
A U-depth map is computed from the column depth-value histograms of the original depth image. Fig. 3b shows an on-board depth image when the MAV is facing a human obstacle in a lab space, and Fig. 3c is the corresponding generated U-depth map. When an obstacle is in front of the depth camera, the value of the corresponding bin in the U-depth map becomes larger. Based on this property, a bin of the histogram is considered a point of interest if its value is larger than a threshold determined by the camera focal length, a predefined threshold on the obstacle's height in space, and the depth corresponding to that bin in the histogram.
Those points of interest are then grouped with other candidate points in their neighborhood so that a bounding box can be extracted from the U-depth map, as shown in Fig. 3c.
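To make the construction concrete, the following is a minimal sketch of how a U-depth map can be computed from a depth image. The bin count, maximum range and array layout are our own illustrative choices, not the paper's implementation.

```python
import numpy as np

def u_depth_map(depth, n_bins=64, d_max=5.0):
    """Build a U-depth map: one depth-value histogram per image column.

    depth: (H, W) array of metric depths; invalid pixels are 0, negative or NaN.
    Returns an (n_bins, W) array whose entry [b, u] counts the pixels in
    column u whose depth falls into bin b (bins span [0, d_max)).
    """
    h, w = depth.shape
    umap = np.zeros((n_bins, w), dtype=np.int32)
    valid = np.isfinite(depth) & (depth > 0) & (depth < d_max)
    # Map each valid depth to a histogram bin index.
    bins = np.clip((depth / d_max * n_bins).astype(int), 0, n_bins - 1)
    for u in range(w):
        col_bins = bins[valid[:, u], u]
        np.add.at(umap[:, u], col_bins, 1)  # unbuffered increment per bin
    return umap
```

A nearby obstacle then shows up as a cluster of large-valued bins at the corresponding depth rows of the map, which is exactly what the thresholding step above exploits.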
III-A2 Obstacle detection
We represent obstacles as three-dimensional boxes of unknown, yet constant, size. The position of the center of a box obstacle carries a superscript indicating whether it is expressed in the world frame or in the MAV body (camera) frame, and the size of the box comprises its length/thickness, width and height.
Based on the bounding box found in the U-depth map (see Fig. 3c), represented by its top-left and bottom-right corners, we can obtain the obstacle's horizontal position and its length/thickness and width in the body frame.
Then we can further find a corresponding bounding box of the obstacle in the original depth image by grouping depth-image points whose horizontal indices lie within the horizontal extent of the U-depth bounding box and whose depth values lie within the detected depth interval. From the top-left and bottom-right corners of this bounding box, we derive the obstacle's vertical position and height in the body frame.
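The two-stage box extraction described above can be sketched as follows; the function name, the simple global threshold and the single-obstacle assumption are ours, for illustration only.

```python
import numpy as np

def obstacle_box(depth, umap, d_max, thr):
    """Illustrative single-obstacle extraction from a U-depth map.

    Step 1: threshold the U-depth map; the bounding box of the surviving
    bins gives the horizontal pixel range [u0, u1] and depth range [d0, d1].
    Step 2: depth-image pixels inside that range give the vertical
    pixel range [v0, v1].
    """
    n_bins, w = umap.shape
    poi = np.argwhere(umap > thr)  # rows are (depth_bin, column) points of interest
    if poi.size == 0:
        return None
    u0, u1 = poi[:, 1].min(), poi[:, 1].max()
    d0 = poi[:, 0].min() * d_max / n_bins
    d1 = (poi[:, 0].max() + 1) * d_max / n_bins
    # Group depth pixels consistent with the detected horizontal/depth extent.
    mask = np.zeros_like(depth, dtype=bool)
    sub = depth[:, u0:u1 + 1]
    mask[:, u0:u1 + 1] = (sub >= d0) & (sub <= d1)
    vs = np.argwhere(mask.any(axis=1)).ravel()
    v0, v1 = vs.min(), vs.max()
    return (u0, u1, v0, v1, d0, d1)
```

With camera intrinsics, the pixel ranges and depth interval returned here convert to the metric horizontal/vertical position and the length/thickness, width and height of the box.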
For a stereo depth camera, the range measurement error generally increases quadratically with the measured depth. In this paper, we adopt empirically determined detection uncertainty covariances for the obstacle position and size. The detected obstacle position, size and corresponding uncertainty covariance are then transformed into the world frame using the MAV's real-time pose, via the rotation matrix of the MAV's pose and a size transformation matrix that compensates for its pitch and roll angles.
III-B Obstacle Tracking and Prediction
To predict obstacle future positions within the prediction horizon, the obstacles detected in sequential frames are first associated by evaluating the Gaussian probability density of the current detection's state (position and size) and corresponding uncertainty covariance under the state and covariance predicted from the previous detection, assuming the obstacle moves at a constant speed. If the probability density is larger than a threshold, the two detections are determined to be the same moving obstacle, whose information is then fed to a Kalman filter.
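A minimal sketch of this association test, assuming a multivariate Gaussian density over the stacked position-and-size state (our own notation, not the paper's):

```python
import numpy as np

def association_density(z_new, z_pred, cov):
    """Multivariate Gaussian density of a new detection z_new evaluated at
    the prediction z_pred with combined covariance cov."""
    d = np.asarray(z_new, dtype=float) - np.asarray(z_pred, dtype=float)
    k = len(d)
    norm = np.sqrt((2 * np.pi) ** k * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def same_obstacle(z_new, z_pred, cov, thr):
    """Associate two detections if the density exceeds a threshold."""
    return association_density(z_new, z_pred, cov) > thr
```

In practice the threshold trades off missed associations against spurious ones; a detection that matches no predicted track would start a new track.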
The Kalman filter estimates the obstacle's position, velocity and size, each with an associated uncertainty covariance, at every time step. We omit the world-frame superscript for simplicity since, in the remainder of this paper, all variables are expressed in the world frame.
For collision avoidance of moving obstacles, we predict their future positions and uncertainty covariances using a constant-velocity model of the obstacle movement: the predicted position is propagated with the estimated velocity, and the position covariance grows with the velocity covariance at each prediction step.
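Such a prediction step can be sketched as below; the exact covariance growth used in the paper may differ, so treat this as an illustrative constant-velocity propagation under our own assumptions.

```python
import numpy as np

def predict_obstacle(p, v, P_p, P_v, dt, n_steps):
    """Constant-velocity prediction of obstacle positions and covariances.

    p, v: position / velocity estimates; P_p, P_v: their covariances.
    Under p_{k+1} = p_k + dt * v, the position covariance grows as
    P_{k+1} = P_k + dt^2 * P_v at each step.
    """
    positions, covs = [], []
    P = P_p.copy()
    for k in range(1, n_steps + 1):
        positions.append(p + k * dt * v)
        P = P + dt ** 2 * P_v
        covs.append(P.copy())
    return positions, covs
```

The growing covariance is what makes the chance constraints of Section IV progressively more conservative for predictions further into the horizon.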
We assume the size of the obstacle is constant, and its uncertainty is not considered in collision avoidance.
Since polygonal obstacles are ill-posed for online constrained optimization, where smooth shapes are preferred to avoid local minima, we enlarge the detected obstacle box into a bounding ellipsoid with semi-principal axes proportional to the box dimensions, together with a rotation matrix indicating the obstacle orientation (yaw) in the world frame.
IV Robust Collision Avoidance
In this section, we present the robust obstacle avoidance method using chance-constrained model predictive control (CC-MPC). The method is based on prior work for collision avoidance in a controlled environment with an overhead motion capture system. We extend it to an on-board vision-based system by further considering the camera's limited field-of-view constraints and yaw control of the MAV.
IV-A Model Predictive Controller
To formulate the MPC, we first consider the MAV's dynamics model, described by a stochastic nonlinear discrete-time equation. The state of the MAV (position, velocity and orientation) and the control input at each time step lie in the admissible state and control spaces, respectively. The initial state is obtained from a state estimator with a given mean and covariance, and the MAV's motion disturbances are modelled as Gaussian process noise. We refer to the cited work for details of the nonlinear dynamics model.
At every time step, for obstacle avoidance, we formulate and solve online a receding-horizon constrained optimization problem over a fixed number of time steps, with a planning horizon given by the number of steps multiplied by the sampling time.
The objective consists of a stage cost term at each time step plus a terminal cost, subject to state constraints described in detail in Section IV-C. The optimization is carried out over the mean of the MAV state at each time step, where a hat denotes the mean of a random variable; the covariance matrix of the MAV position at each step is extracted from the full state uncertainty covariance.
IV-B Cost Function
We now describe the components of the cost function presented in Eq. (9a).
IV-B1 Goal navigation
Given the goal position of the MAV, we minimize the displacement between its terminal position in the planning horizon and the goal. To this end, we define a terminal cost term on this displacement, scaled by a tuning weight coefficient.
IV-B2 Control input cost
The second cost term minimizes the MAV control inputs and is designed as a stage cost on the control input, scaled by a tuning weight coefficient.
IV-B3 Collision cost
To improve flight safety, we also introduce an obstacle potential-field cost based on the logistic function of the nominal distance between the MAV and each obstacle at each time stage. The cost is scaled by a tuning weight coefficient and shaped by a parameter defining the smoothness of the cost function and a tuning threshold distance between the MAV and the obstacle. The logistic function is chosen to obtain a smooth and bounded collision cost.
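A logistic collision cost of this kind can be sketched as follows; the parameter names and default values are our own placeholders, not the paper's tuned values.

```python
import math

def collision_cost(d, w=1.0, lam=5.0, d_thr=1.0):
    """Logistic obstacle potential-field cost (illustrative parameters).

    d: nominal MAV-obstacle distance; w: weight; lam: smoothness;
    d_thr: threshold distance. The cost is smooth, bounded in (0, w),
    approaches w as d -> 0 and decays towards 0 as d grows.
    """
    return w / (1.0 + math.exp(lam * (d - d_thr)))
```

Because the cost is bounded, it gently penalizes proximity without dominating the objective, while the hard safety guarantee is left to the chance constraints of Section IV-C.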
IV-B4 MAV yaw control
Since the MAV has a limited field of view, it is generally desirable to keep the camera axis, and hence the yaw orientation, aligned with the direction of motion. Instead of employing a velocity-tracking yaw control method as in previous work, which may generate infeasible yaw trajectories, we design a cost function that minimizes the deviation between the MAV's yaw and its motion direction angle, scaled by a tuning weight coefficient. To reduce computation time, we compute the motion direction angle based on the MAV's last-loop planned trajectory.
Finally, the overall stage cost of the formulated MPC is the sum of the above stage cost terms.
IV-C1 Collision chance constraints
For an obstacle, modelled as an ellipsoid with a given position and semi-principal axes, the MAV, at its own position and with a given radius, is considered to be in collision if its center lies inside the obstacle ellipsoid enlarged by the MAV radius.
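In notation of our own choosing (MAV position $p$ and radius $r$, obstacle center $p_o$, semi-principal axes $(a,b,c)$ and orientation $R_o$), such an ellipsoidal collision condition is commonly written as:

```latex
% Collision iff the MAV center lies inside the ellipsoid enlarged by r:
\left\lVert \Omega^{-1} R_o^{\top} \left( p - p_o \right) \right\rVert_2 \le 1,
\qquad
\Omega = \operatorname{diag}\!\left( a + r,\; b + r,\; c + r \right)
```

This is a sketch of the standard formulation, not necessarily the exact enlargement used in the paper.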
In this paper, we take into account the MAV state estimation uncertainty and obstacle sensing uncertainty. Hence, the collision avoidance constraints are satisfied in a probabilistic manner and formulated as chance constraints [27, 26], requiring that the probability of robot-obstacle collision stay below a specified threshold.
IV-C2 FOV Constraints
To ensure flight safety, the MAV's planned trajectory should lie within its current limited field of view (FOV) and limited depth sensing range. Given the MAV's current pose, its three-dimensional FOV with limited depth sensing range can be described by the intersection of five half-spaces, each defined by its normal and offset parameters. Hence, the FOV constraints require every planned MAV position to lie inside all five half-spaces.
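A containment check against such a half-space description can be sketched as follows; the sign convention (inside means n·x ≤ b for every half-space) is our own assumption.

```python
import numpy as np

def in_fov(p, normals, offsets):
    """Check that point p lies in the intersection of half-spaces
    {x : n_i . x <= b_i}, given the stacked normals (5, 3) and offsets (5,)."""
    return bool(np.all(normals @ p <= offsets))
```

In the MPC these five linear inequalities are imposed on the planned position at every step of the horizon, so the solver keeps the trajectory inside the sensed region.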
IV-D MAV State Uncertainty Propagation
Evaluating the collision chance constraints in Eq. (16) requires calculating the MAV state, in particular its position uncertainty covariance, at each time step. High-precision uncertainty propagation for nonlinear systems, as in Eq. (8), could be very computationally intensive. In this paper, to achieve fast real-time uncertainty propagation, we approximately propagate the MAV state uncertainty using an Extended Kalman Filter (EKF) based update, in which the state covariance is propagated through the linearized state transition matrix of the MAV and inflated by the process noise accounting for motion disturbances; the position uncertainty covariance is then extracted from the propagated state covariance. Note that the state transition matrix couples the robot states and control inputs, which would introduce additional variables into the optimization problem of Eq. (9) and greatly increase the computation time. We therefore propagate the MAV state uncertainty based on the last-loop trajectory and control inputs before solving the current-loop optimization problem.
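The EKF-style covariance recursion along the last-loop nominal trajectory can be sketched as below; the per-step transition matrices here are assumed to be pre-linearized on that trajectory, which is our reading of the scheme rather than a verbatim implementation.

```python
import numpy as np

def propagate_covariance(Sigma0, Fs, Q):
    """EKF-style covariance propagation along a nominal trajectory.

    Sigma0: initial state covariance; Fs: list of per-step linearized
    state-transition matrices; Q: process-noise covariance.
    Applies Sigma_{k+1} = F_k Sigma_k F_k^T + Q at every step.
    """
    covs = [Sigma0]
    for F in Fs:
        covs.append(F @ covs[-1] @ F.T + Q)
    return covs
```

Because the matrices are fixed before the optimization starts, the covariances become constants in the chance constraints rather than decision variables, which is what keeps the solver fast.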
In this section, we describe our implementation of the proposed approach and evaluate it in real-world experiments. A video showing the flight test results accompanies this paper and can be found at https://youtu.be/nZaR-8Z515s.
V-A Implementation and Hardware
Our experimental platform is the Parrot Bebop 2 quadrotor (https://www.parrot.com/us/drones/parrot-bebop-2) equipped with an NVIDIA Jetson TX2 module (https://developer.nvidia.com/embedded/jetson-tx2) and an Intel RealSense Depth Camera D435i (https://www.intelrealsense.com/depth-camera-d435i), as shown in Fig. 4. The Parrot Bebop 2 allows executing control commands sent via ROS (https://bebop-autonomy.readthedocs.io). The D435i camera is used both for visual-inertial odometry and for depth image sensing; it has a limited FOV and a 5 m depth sensing range. The TX2 performs all on-board computation and is connected to the Bebop 2 via WiFi.
We use a filtering-based stereo visual-inertial odometry algorithm, the S-MSCKF, for state estimation of the MAV, which runs at 15 Hz. The camera depth images are received at 60 Hz and the obstacle detection and tracking runs at frame rate. We rely on the ACADO toolkit to generate a fast C solver for our MPC, in which a sampling time of 60 ms is used and the prediction horizon is set to 1.5 s. The radius of the MAV is set to 0.4 m. The two closest detected obstacles are fed to the MPC for collision avoidance, with a specified collision probability threshold. To obtain quantitative results, in the lab scenarios we use an external motion capture system (OptiTrack) to measure the positions of the MAV and moving obstacles, which are used only as ground truth data.
V-B Obstacle Detection and Tracking Performance
We first evaluate the obstacle detection and tracking performance for moving obstacles. Fig. 1(a) shows a lab scene with two walking human obstacles. We placed the camera at the origin of the world frame and recorded a dataset in which the two humans were walking around at a speed of approximately 1.2 m/s. The positions and velocities of the two human obstacles are obtained using our obstacle detection and tracking algorithm. Table I shows the average position and velocity estimation errors of the two moving obstacles compared with ground truth measurements. It can be observed that the average position estimation error is around 0.3 m and the velocity estimation error can be up to 0.5 m/s, which indicates that the obstacle sensing uncertainty should be taken into account when planning robust collision-free trajectories for the MAV. In practice, the obstacle's velocity estimate may be very noisy and have a very large uncertainty covariance. In this case, we bound the corresponding covariance in Eq. (6) when predicting the obstacle's future positions and uncertainty covariances.
Table I. Average position (m) and velocity (m/s) estimation errors for each moving obstacle.
V-C Obstacle Avoidance in Dynamic Environments
We tested the system in a variety of flight tests; results from two typical scenarios are presented here.
V-C1 Scenario 1 (Flying in a confined lab space)
The MAV is required to navigate from a start point to an end point while avoiding two walking humans. Fig. 5 shows a series of snapshots and the MAV's on-board grayscale camera images taken during the experiment. We performed the experiment in this scenario five times. Fig. 5(a) shows the measured distance between the MAV and the two moving human obstacles over time in the five experiments. The distance is computed, based on ground truth measurements, as the closest distance between the MAV's position and the surface of the obstacle ellipsoid (with the enlarged semi-principal axes). In Fig. 5(b), we aggregate all the distance data. It can be observed that in all instances a minimum safe separation of 0.4 m was maintained, and collisions with the humans were therefore avoided. A maximal MAV speed of around 1.6 m/s was observed in this experiment.
The boxplots of the on-board runtimes in this scenario are shown in Fig. 5(c). For the obstacle detection and tracking, the 75th percentile of the runtime is always below 8 ms, which is fast enough to run at frame rate (60 Hz). For the MPC framework, the 75th percentile is always below 22 ms, indicating that the framework can run efficiently in real time.
V-C2 Scenario 2 (Flying in a long corridor)
The MAV is flying in a long narrow corridor where there are both static and moving obstacles. Fig. 7 shows a snapshot taken during the experiment. A maximum speed of around 2.4 m/s was achieved by the MAV in the experiment. Detailed results of the experiment can be found in the video accompanying this paper.
We presented an on-board vision-based obstacle avoidance approach for MAVs navigating in dynamic environments. Flight test results in a variety of environments with moving obstacles demonstrated the effectiveness of the proposed approach. We adopted a fast three-dimensional obstacle detection and tracking algorithm based on depth images, which can run at a frame rate of 60 Hz. We accounted for the obstacle sensing uncertainty by using a chance-constrained model predictive controller (CC-MPC) to generate robust local collision-free trajectories for the MAV. We implemented the approach on a quadrotor platform with limited computational power, on which the obstacle detection and tracking has a mean computation time of 8 ms and the MPC of 16 ms. In real-world indoor experiments, the MAV was shown to avoid walking human obstacles at a maximum speed of 2.4 m/s.
- (2009) Vision-based guidance and control of a hovering vehicle in unknown, GPS-denied environments. In 2009 IEEE International Conference on Robotics and Automation (ICRA), pp. 2643–2648.
- (2012) Depth camera based indoor mobile robot localization and navigation. In 2012 IEEE International Conference on Robotics and Automation (ICRA), pp. 1697–1702.
- (2018) A survey on aerial swarm robotics. IEEE Transactions on Robotics 34, pp. 837–855.
- (2017) Rapid exploration with multi-rotors: a frontier selection method for high speed flight. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2135–2142.
- (2019) How fast is too fast? The role of perception latency in high-speed sense and avoid. IEEE Robotics and Automation Letters 4 (2), pp. 1884–1891.
- (2018) NanoMap: fast, uncertainty-aware proximity queries with lazy search over local 3D data. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 7631–7638.
- (2012) Vision-based autonomous mapping and exploration using a quadrotor MAV. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4557–4564.
- (2019) Flying on point clouds: online trajectory generation and autonomous navigation for quadrotors in cluttered environments. Journal of Field Robotics 36 (4), pp. 710–733.
- (2018) Online safe trajectory generation for quadrotors using fast marching method and Bernstein basis polynomial. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 344–351.
- (2019) FIESTA: fast incremental Euclidean distance fields for online motion planning of aerial robots. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4423–4430.
- (2013) OctoMap: an efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots 34, pp. 189–206.
- ACADO toolkit: an open-source framework for automatic control and dynamic optimization. Optimal Control Applications and Methods 32, pp. 298–312.
- (2017) Planning dynamically feasible trajectories for quadrotors using safe flight corridors in 3-D complex environments. IEEE Robotics and Automation Letters 2, pp. 1688–1695.
- (2017) Aggressive 3-D collision avoidance for high-speed navigation. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 5759–5765.
- (2017) A review of uncertainty propagation in orbital mechanics. Progress in Aerospace Sciences 89, pp. 23–39.
- (2017) Funnel libraries for real-time robust feedback motion planning. International Journal of Robotics Research 36, pp. 947–982.
- (2013) First results in detecting and avoiding frontal obstacles from a monocular camera for micro unmanned aerial vehicles. In 2013 IEEE International Conference on Robotics and Automation (ICRA), pp. 1750–1757.
- (2015) Reactive avoidance using embedded stereo vision for MAV flight. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 50–56.
- (2017) Voxblox: incremental 3D Euclidean signed distance fields for on-board MAV planning. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1366–1373.
- (2016) Polynomial trajectory planning for aggressive quadrotor flight in dense indoor environments. In Robotics Research, Vol. 114, pp. 649–666.
- (2014) Robust and efficient volumetric occupancy mapping with an application to stereo vision. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 6102–6107.
- (2017) Robust stereo visual inertial odometry for fast autonomous flight. IEEE Robotics and Automation Letters 3, pp. 965–972.
- (2018) Autonomous flight. Annual Review of Control, Robotics, and Autonomous Systems 1, pp. 29–52.
- (2019) FASTER: fast and safe trajectory planner for flights in unknown environments. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1934–1940.
- (1998) VFH+: reliable obstacle avoidance for fast mobile robots. In 1998 IEEE International Conference on Robotics and Automation (ICRA), pp. 1572–1577.
- (2019) B-UAVC: buffered uncertainty-aware Voronoi cells for probabilistic multi-robot collision avoidance. In 2019 International Symposium on Multi-Robot and Multi-Agent Systems (MRS), pp. 162–168.
- (2019) Chance-constrained collision avoidance for MAVs in dynamic environments. IEEE Robotics and Automation Letters 4 (2), pp. 776–783.
- (2019) Distributed multi-robot formation splitting and merging in dynamic environments. In 2019 International Conference on Robotics and Automation (ICRA), pp. 9080–9086.