Risk Assessment and Planning with Bidirectional Reachability for Autonomous Driving

by   Ming-Yuan Yu, et al.
University of Michigan

Knowing and predicting dangerous factors within a scene are two key components during autonomous driving, especially in a crowded urban environment. To navigate safely in environments, risk assessment is needed to quantify and associate the risk of taking a certain action. Risk assessment and planning is usually done by first tracking and predicting trajectories of other agents, such as vehicles and pedestrians, and then choosing an action to avoid collision in the future. However, few existing risk assessment algorithms handle occlusion and other sensory limitations effectively. This paper explores the possibility of efficient risk assessment under occlusion via both forward and backward reachability. The proposed algorithm can not only identify where the risk-induced factors are, but also be used for motion planning by executing low-level commands, such as throttle. The proposed method is evaluated on various four-way highly occluded intersections with up to five other vehicles in the scene. Compared with other risk assessment algorithms, the proposed method shows better efficiency, meaning that the ego vehicle reaches the goal at a higher speed. In addition, it also lowers the median collision rate by 7.5x.



page 1

page 3

page 6


Occlusion-Aware Risk Assessment for Autonomous Driving in Urban Environments

Navigating safely in urban environments remains a challenging problem fo...

Pedestrian Emergence Estimation and Occlusion-Aware Risk Assessment for Urban Autonomous Driving

Avoiding unseen or partially occluded vulnerable road users (VRUs) is a ...

Risk Measurement, Risk Entropy, and Autonomous Driving Risk Modeling

It has been for a long time to use big data of autonomous vehicles for p...

Prediction-Based Reachability Analysis for Collision Risk Assessment on Highways

Real-time safety systems are crucial components of intelligent vehicles....

Deep Predictive Models for Collision Risk Assessment in Autonomous Driving

In this paper, we investigate a predictive approach for collision risk a...

Interpretable Goal Recognition in the Presence of Occluded Factors for Autonomous Vehicles

Recognising the goals or intentions of observed vehicles is a key step t...

Dynamic Lambda-Field: A Counterpart of the Bayesian Occupancy Grid for Risk Assessment in Dynamic Environments

In the context of autonomous vehicles, one of the most crucial tasks is ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Most recent autonomous vehicles are equipped with a full sensor suit, including radars, cameras, and LIDAR [kato2015open]. These sensors give the vehicle the ability to perceive the world and to assess the risk of taking a certain action. The quantified risk can then be passed to a planning algorithm to eventually take a action. However, it is still challenging for an autonomous vehicle to navigate through complex environments such as occluded urban intersections such as the one shown in Figure 1.

The primary challenge of safely navigation through these occluded scenarios is predicting the behavior of all possible incoming traffic. The challenge arises due to the size of the state space of all possible incoming traffic from every occluded region which can make computation intractable.

Traditionally only forward reachability is used for risk assessment. In other words, algorithms make predictions of all other agents’ future trajectories, and select an optimal action based on collision probability

[lee2017collision] and time-to-collision [ammoun2009real].

Fig. 1: The ego vehicle (blue box) intends to perform an unprotected left turn to the goal (yellow star) at an intersection. The green shaded region is the observable polygon and arises due to the 1) limited sensor range and 2) occlusions from another vehicle (red box) and buildings (gray regions.) The proposed algorithm quantifies and identifies the location of risk posed by other vehicles (red particles). Our approach identifies and focuses its attention on risk-inducing regions in the near future (e.g. 1.5 seconds) to ensure that it does not waste computational resources predicting everything in the occluded region. Both axes are in meters.

In contrast, human drivers can usually handle scenarios like the one in Figure 1

even with relatively limited sensing capabilities, such as a narrow field-of-view. We argue that this is because human drivers sense the world in a more efficient way: using semantic and prior information to focus sensing resources on portions of the scene that are most pertinent. For instance, when driving normally, drivers focus mostly on the current and future location of the ego vehicle. We only glimpse to check the rear view mirror when driving forward, but never focus our attention on it for an extended period of time. This is due to the fact that we have 1) a semantic understanding of the environment (e.g. road layout, speed limit, etc) and 2) prior knowledge such as the ability to estimate the reachability of not only for the ego vehicle but also other agents. As a result, we know to focus our limited sensing resources on only the regions where the ego vehicle can reach, and properly assess the risk to take an action.

This paper proposes an algorithm that leverages this intuition from human driving for probabilistic risk assessment and planning by utilizing not only forward but also backward reachability. Since collisions can only happen within the FRS (FRS) of a vehicle, it is unnecessary to predict everything in the occluded region. Instead, staring from a location in the FRS of the ego vehicle, one can backward propagate the motion of other vehicles and eventually get all possible initial conditions of risky agents. These initial conditions quantifies and spatially identifies risk-inducing regions and agents, as shown in Figure 1. With this information, we show how this quantified risk can be used for motion planning for safe navigation in urban environments.

This paper is organized as follows. Section II reviews related work in reachability-based planning and risk assessment under occlusion. Section III shows our problem formulation and proposed risk assessment and motion planning algorithm based on forward and backward reachability. Section IV

describes two baselines and our evaluation metrics. Section 

V describes the quantitative evaluation of the proposed algorithm and illustrates that the proposed method is a significant improvement over existing methods. Section VI concludes and discusses future directions of this work.

Ii Related Work

Risk assessment is the process of predicting and quantifying the risk of taking a certain action. The proposed method belongs to the category of methods that quantify the risk of having a collision with another agent in the scene as defined in the well-organized survey by lefevre2014survey.

Reachability-based motion planning is widely used for autonomous driving [manzinger2018tactical, orzechowski2018tackling, koschi2017spot, ahn2018reachability, vaskov2019guaranteed, vaskov2019towards, bajcsy2019efficient]. manzinger2018tactical assume cooperative agents and proposed an algorithm to negotiate conflicting motions. They try to solve a winner determination problem, which is NP-hard, by restricting the set of combinations to a tree structure. This makes the original problem computationally tractable. orzechowski2018tackling over-approximate the FRS of incoming traffic, including those who are occluded, by only considering the leading edge of the observable polygon and make sure that the ego vehicle does not collide with the FRS. koschi2017spot use forward reachability to predict future both the FRS of the ego vehicle and the ones of other traffic participants. It performs collision checking of the FRS and is evaluated on both urban intersection and highway scenarios. ahn2018reachability use both forward and backward reachability, just like the proposed method, to formulate a discrete decision making problem for autonomous driving. vaskov2019guaranteed propose RTD (RTD) algorithm for real-time trajectory planning in a static scene by showing how to utilize a FRS computed offline during safe online motion planning. vaskov2019towards later extended the idea [vaskov2019guaranteed] to scenes with dynamics agents.

Our approach differs in the following ways. First of all, we formulate the problem probabilistically, whereas the approaches developed by orzechowski2018tackling,ahn2018reachability,koschi2017spot,vaskov2019guaranteed,vaskov2019towards do not. The proposed algorithm captures the distribution of risk in both the action space and Cartesian space thus can be more flexible and gives one to potentially adjust the balance between aggressiveness and effectiveness. Secondly, the work by ahn2018reachability is intended to be used as a high-level planner for switching between behaviors and thus relies on a low-level planner to generate throttle and steering commands. By contrast, our approach is more like a controller and plans low-level commands directly. Finally, methods by ahn2018reachability,manzinger2018tactical,vaskov2019guaranteed,vaskov2019towards do not consider occlusions or other sensory limitations, whereas our approach is designed to operate even under occlusions.

The algorithm proposed in this paper builds upon our previous work [yu2019occlusion], which is also a risk assessment and motion planning algorithm. One weakness of our prior work is its poor scalability with respect to the size of the environment. The computational cost is proportional to the area of the occluded regions within the size of the map. The cost of the new approach proposed in this paper does not scale with the area of occluded regions and has a lower collision rate. Also, our previous approach does not identify where to focus the sensory resource.

The main contributions of this paper are as follows:

  • We propose a probabilistic risk assessment method which identifies the location of the risk-inducing agents and regions.

  • The proposed method utilizes both forward and backward reachability for efficient risk assessment.

  • We show how the quantified risk can be used for motion planning for navigating safely in urban environments.

  • We evaluate our method on real-world intersections and show quantitative improvement in terms of reduced collision rate when compared to the state of the art.

Iii Method

Fig. 2: Step-by-step illustration of different stages of the proposed algorithm. (a) An intersection with occlusion and with one other agent (red box) in the scene. The observable polygon is indicated by the green shaded region, and the goal location is indicated by the yellow star. (b) State 1: calculating FRS. The dots are projections of the particles in 2D. (c) Stage 2: calculating BRS. The dots are projections of the particles in 2D. (d) Stage 3: filtering with the observable polygon. The remaining red dots in the figure indicates spatially the source of risk-inducting regions.

We first define both forward and backward reachability in Section III-A. In Section III-B, we show how to use aforementioned reachability for efficient risk assessment. Finally, in Section III-C, we demonstrate using the quantified risk for motion planning.

Iii-a Bidirectional Reachability

Let be time, be the state space and be the action space. Let be the state and be the control input of a dynamic system at time . Assume the vehicle’s dynamics are described as an ODE (ODE):


where is a Lipschitz continuous function.

The FRS describes a set of all possible states of a system in the future. Given the dynamics in (1), an initial set and a fixed time horizon , the FRS is defined as


On the other hand, the BRS (BRS) describes all possible initial states that are able to reach a target set. Given the dynamics in (1), a target set and a fixed time horizon , the BRS is defined as


Normally, reachable sets are calculated for each observed agent [vaskov2019guaranteed, vaskov2019towards]. These methods usually perform well when only a limited number of agents are presented in a non-occluded scene. Scenes like urban intersections can still be challenging for these method since predicting everything, especially in a occluded environment, can computationally intensive in many cases. Efforts have been made to ease the calculation under occlusion [orzechowski2018tackling, yu2019occlusion], but the complexity still grows along with either the number of occluded regions [orzechowski2018tackling] or the total area of the occluded regions [yu2019occlusion]. For this reason, in Section III-B we introduce multiple stages of our algorithm that utilize both forward and backward reachability for efficient risk assessment under occlusion. The complexity of proposed method stays constant no matter how the number or size of the occluded regions grow.

Iii-B Risk Assessment

For autonomous driving, collisions happen only at the intersections of the FRS of different agents. In other words, if another agent (occluded or not) cannot reach where the ego vehicle can possibly be in the future, seconds for example, it does not induce risk to the ego vehicle. This observation leads to an important insight – it is unnecessary to predict the behavior of everything in a scene. Instead, we can focus resources on identifying only those agents who pose a risk to the ego vehicle.

The proposed algorithm, whose behavior is depicted in Figure 2, is performed iteratively every seconds, and has a forecast horizon of seconds. It can be broken down into three stages.

Iii-B1 Stage 1 – Frs of the Ego Vehicle

In the first stage, the FRS of the ego vehicle is calculated. In particular, we borrow the idea of using particles from our previous work [yu2019occlusion, jacobs2017real] to represent the FRS since it makes the FRS probabilistic and the calculation parallelizable.

Without loss of generality, let at the beginning of each iteration. Let the control inputs be parameterized by some parameters and such that


where the subscripts and indicate quantities tailored to the ego vehicle and other vehicles, respectively. Let the th particle, , be defined as


where the superscript indicates th sample of a variable, is ego vehicle’s initial state, is a fixed time horizon. and is the total number of particles.

In each iteration, we first sample particles from the following distributions:


where is the initial set of the ego vehicle, is a closed interval,

is an uniform distribution over a set, and

is the transition probability of a quantity between subsequent iterations. The transition probabilities enforces smoothness in the action space which reduces jittery motion. Since the smoothness is enforced on the distribution level but not directly on the optimal value, actions such as emergency breaking are still allowed. More details about determining the optimal control input from the distribution are described in Section III-C.

Now the probabilistic FRS of the ego vehicle can be re-written as


The set FRS can be easily calculated by solving IVP in parallel with a regular ODE solver. Note that FRS contains hypotheses of the future states of the ego vehicle, as shown in Figure (b)b.

Iii-B2 Stage 2 – Brs of Other Agents

In the second stage of the proposed algorithm, we calculate the BRS of other agents in a similar fashion by setting the target set to FRS such that


where and are invertible functions projecting the state spaces into Cartesian space. We use projections since collisions occur when a subset of the states (e.g. positions) overlap. In this case, collision happens at one point in the Cartesian space, while other states such as heading and velocity do not necessary have to be the same and should be assigned separately.

Unlike the FRS, the BRS set cannot be calculated directly with an ODE solver. Instead, we need to first reformulate the final value problem into an equivalent IVP. For Lipschitz continuous dynamic systems such as CTRV (CTRV) model, CTRA (CTRA) model, and the bicycle model, the initial value corresponding to particle can be obtained by solving the following IVP:


The set BRS can now be obtained by solving (9) in parallel with a regular ODE solver, and the solution is in (8), but reversed in time.

BRS represents a collection of risky initial conditions of other agents. Each particle corresponds to a possible collision between the ego vehicle and another agent by design, since the target set of BRS is chosen to be FRS. By doing so, this approach can handle occlusion in an efficient way. We no longer need to predict all possible agents in the scene. Instead, the set BRS originates only from and tells us directly where the potentially dangerous agents are, no matter whether they are occluded or not. This step is depicted in Figure (c)c.

Iii-B3 Stage 3 – Consistency with the Observation

BRS alone does not quantify the risk associated with a particle. We need to incorporate sensory systems, such cameras, radars and LIDAR, to identify risky control inputs. A combination of the aforementioned sensors generates a free space around the ego vehicle know as the observable polygon, , as shown in Figure (d)d.

If the entire trajectory of th particle, , ends in , it is considered to be collision-free; otherwise, it is potentially risky. Quantitatively speaking, a weight is assigned to each particle based on its entire trajectory.


where is an indicator function. As a results, the weight renders the likelihood of the corresponding action being collision-free or not.

Iii-C Motion Planning

Based on the weight from Section III-B it is possible to generate a set of safe actions by resampling. Resampling the particles with defined in (10) filters out actions with potential collision. However, it is insufficient to use just during resampling. For example, in many cases the safest action may be stopping completely before reaching the goal or getting up to the target speed. The ego vehicle can freeze and never reaches the goal. Hence we need another factor to promote forward motion of the ego vehicle.

Let be the desired control input when no risk is presented in the scene. The desired control input can be from a predefined behavior, such as slowing down gently when approaching an intersection, or maintaining the target speed, etc. Assume that is given, we assign another weight to each particle.




is a hyperparameter for scaling. The weight

promotes those control inputs which are closer to the desired one, and thus is able to drive the ego vehicle forward.

A naive way to utilize both for safety and for driving is to interpret them as likelihoods and take their product as the final weight, and resample the particles. However, safe actions might be suppressed when they are no where near the desired action. So taking the direct product may end up with an action that is still too risky.

Instead, we calculate the final weight in the following way:


where is a small number. We set it to in this paper.

To understand (12) intuitively, let us first consider an extreme case. Assume , indicating that all the trajectories are in the observable polygon ans thus being collision-free. The weight becomes proportional to , so the actions closer to the desired action will have higher final weights. In this case, resampling with makes the ego vehicle move according to the desired action.

On the other hand, if any of the trajectories shows slight risk (e.g. ), the final weight will be approximately proportional to . As a result, safer actions get promoted and the value has negligible effect in this case. In short, using (12) with a reasonably small forces the ego vehicle to prioritize safety and to move forward only when it is risk-free. Resampling with the weight from (12) gives us a distribution represented by a set of ”good” particles. The final step is to select an optimal control input from this distribution for the ego vehicle to execute.

In general, the distribution can be multi-modal, meaning that there may be multiple control inputs that are equally good. For instance, if there are two vehicles approaching an intersection at the same time, two possibly equally good actions for one of the vehicles can be either 1) being aggressive – accelerating and pass first or 2) being conservative – waiting until the other one passes. For this reason, we make use of a well-developed clustering algorithm from the machine learning and computer vision community –

DBSCAN (DBSCAN[ester1996density] – to identify multi-modal distributions and select the optimal action.

DBSCAN clusters particles in the action space based on the local density without specifying the number of clusters a priori. It identifies the number of cluster automatically, and the centroids of clusters are candidates for the optimal control input. We choose to prioritize safety and make the ego vehicle to be more conservative in this paper. So if there are multiple clusters, we pick the one with the most deceleration. The optimal control input is then executed until the next iteration.

Iv Evaluation

This section introduces the baselines for the proposed algorithm to be compared against. We also define the metrics for evaluation.

Iv-a Model

Section III-B only gives a general description of the pipeline for clarity, but to test and implement the proposed algorithm, we need to pick appropriate models for both the ego vehicle and other vehicles. Due to its simplicity and tractable dimension, we choose to use the train-like model, meaning that all the agents travels in their own curvilinear coordinates. A concrete example of the model is to have a vehicle traveling on a lane without lane changing. Although we assume no lane changing in this work, it is possible to relax this constraint by using physics-based model like the bicycle model or other behaviour-based models.

By working in curvilinear coordinates, the location of the ego vehicle can be parameterized by only two numbers: longitudinal position along the center line and the lateral offset with respect to the center line. In addition, the heading is completely determined by thus is not necessary to be included as a state. The dynamics of the ego vehicle can be written as


where is a constant acceleration along the center line. The dynamics of other vehicles is written in a similar way, but assumed to have a constant speed such that


If there are multiple valid lanes for non-ego vehicles, each lane should have its own dynamics.

Here we want to emphasize that more sophisticate models can be substituted in directly, but at the cost of higher computational complexity. We choose constant speed/acceleration models because their simplicity and reasonable performance in similar scenarios, as illustrated in our previous work [yu2019occlusion]. Moreover, since we work in curvilinear coordinates, both models, (13) and (14), can be approximated by CTRV and CTRA models, respectively, when the replanning time is small. Both CTRV and CTRA are shown to have comparable performance in prediction in urban environments [schubert2008comparison].

The desired action is designed to have a constant deceleration when approaching the stopline, and to switch to a saturating proportional controller to track a target speed.

Iv-B Baseline 1

The first baseline is an occlusion-aware algorithm by orzechowski2018tackling. We recreate and simulate two of the scenarios demonstrated in their paper. The map is a four-way unsignalized intersection partially occluded by a non-transparent static object.

The first scenario consists of no other agent, and the second one has one other agent coming from the left hand side of the ego vehicle. In both scenarios, the goal is to navigate safely through the intersection and to reach a target speed of . The geometries of the map are measured using AutoCAD, and the velocity of the other agent is estimated afterwards.

Iv-C Baseline 2

The second baseline is another occlusion-aware method from our previous work [yu2019occlusion]. We simulate the same set of random scenarios with up to five other agents for each scene on both the proposed method and the second baseline. Each scene is an four-way unsignalized intersection and the geometries are extracted from OSM (OSM). There are real-world intersections in total. In addition, there is also a synthetic layout where the roads are straight and perpendicular to each other.

These maps have significantly more severe occlusion comparing to those in the first baseline [orzechowski2018tackling]. Buildings are added to the map with a two-meter buffer around the driving surface, as shown in Figure 1 and 2. Speed of a non-ego agent is randomly set to a constant between to . For simplicity, we assume no collision between non-ego agents and they do not interact with other agents.

(d) Baseline 1 [orzechowski2018tackling]
(e) Proposed
(i) Baseline 1 [orzechowski2018tackling]
(j) Proposed
Fig. 3: Two scenarios in the first baseline [orzechowski2018tackling]. The first scenario (top row) is in a map with one static obstacle (gray) and no incoming traffic, and the second scenario (bottom row) is in the same map with one other vehicle (red) coming from the left. The first three columns (a-c, f-h) show key frames of the proposed method, and the other two columns (d,e,i,j) show the speed and acceleration profiles. In the first scenario, the baseline abruptly breaks at whereas the proposed method maintains a near constant deceleration, as shown in (d) and (e). In the second scenario, the baseline breaks at both and , whereas the proposed method only breaks once at , as shown in (i) and (j). In addition our method does not stop completely in the second scenario. Instead, it creeps forward between and to gather more information. This behavior results in a higher final speed and thus makes the ego vehicle faster to reach the goal. The fourth column is adopted directly from Figure 4 in [orzechowski2018tackling].

Iv-D Metrics

For the first baseline, we investigate the speed and acceleration profiles for both ride comfort and efficiency. Ride comfort is evaluated by the number of abrupt breaking and the maximum deceleration, whereas efficiency is indicated by the terminal speed of the ego vehicle.

Collision rate is evaluated for the second baseline on one synthetic and all the real-world intersections. This results in simulations in total for each method.

V Results

V-a Baseline 1

Figure 3 shows the behaviors under two scenarios recreated from the first baseline [yu2019occlusion]. Both the first baseline and the proposed method enable the ego vehicle to reach to the goal without collision in either scenario. The ego vehicle slows down when approaching the intersection, no matter if there is any incoming traffic.

However, they have three key differences. To begin with, in the first scenario the baseline [orzechowski2018tackling] has a sudden break at , while the proposed method stays roughly at a constant deceleration. In addition, the maximum breaking (i.e. deceleration) in the second scenario is lower in the proposed method. The reduced breaking and jerk indicate higher ride comfort. Finally, in the second scenario the ego vehicle creeps forward between and in order to get a clearer view, as shown in Figure (h)h and (j)j. Once it gets enough information and thinks it is risk-free, it then proceeds. This behavior leads to a higher efficiency, meaning that it obtains a higher speed at the end of the simulation all without sacrificing safety.

V-B Baseline 2

median collision rate # of zero-collision intersections
Baseline 2 [yu2019occlusion] out of
Proposed out of
TABLE I: Simulation results at intersections.

Among the intersections, the proposed method obtains a median collision rate of , which is a significant reduction compared to the of the second baseline [yu2019occlusion], as shown in Table I. Moreover, the proposed method achieves zero collision at not only the synthetic layout, but also real-world layouts. In contrast, the baseline [yu2019occlusion] has a minimum collision rate of at one of the intersections.

Vi Conclusions and Future Work

We propose a probabilistic risk assessment method for motion planning under occlusion in urban intersections. The proposed algorithm is able to quantify the distribution of risk in the action space, and can be used to generate low-level control inputs for an autonomous vehicle to navigate safely. It also identifies spatially where the risk-inducing regions are via both forward and backward reachability. The proposed method is compared quantitatively to previous work and shows significant improvement in terms of both efficiency and safety.

Future work include using the proposed method to redirect and focus sensory resources to further reduce impact on limited sensing, formulating and solving the problem analytically without sampling to obtain safety guarantees, and deploying it to hardware platforms for testing in the real-world.