Generating Comfortable, Safe and Comprehensible Trajectories for Automated Vehicles in Mixed Traffic

05/14/2018 ∙ by Maximilian Naumann, et al. ∙ FZI Forschungszentrum Informatik ∙ KIT

While motion planning approaches for automated driving often focus on safety and mathematical optimality with respect to technical parameters, they barely consider convenience, perceived safety for the passenger and comprehensibility for other traffic participants. For automated driving in mixed traffic, however, this is key to reach public acceptance. In this paper, we revise the problem statement of motion planning in mixed traffic: Instead of largely simplifying the motion planning problem to a convex optimization problem, we keep a more complex probabilistic multi agent model and strive for a near optimal solution. We assume cooperation of other traffic participants, yet being aware of violations of this assumption. This approach yields solutions that are provably safe in all situations, and convenient and comprehensible in situations that are also unambiguous for humans. Thus, it outperforms existing approaches in mixed traffic scenarios, as we show in our simulation environment.




I Introduction

In the last decades, tremendous progress has been achieved in the field of automated driving [1]. First experiments with close-to-production cars in real traffic have succeeded and led to more and more attention in public [2]. Recently, some companies even announced the large-scale deployment of hundreds of fully automated vehicles in public road traffic, i.e. vehicles that fulfill SAE level 5 [3].

Obviously, vehicles must operate safely under all encountered circumstances to be accepted by the public. Shalev-Shwartz et al. argue that the fatality rate should be reduced by three orders of magnitude, i.e. from $10^{-6}$ to $10^{-9}$ per hour of driving [4]. However, it is equally obvious that this safety improvement must not come at the price of a substantial loss of comfort and utility. Simply consider the advantage of railways over cars in safety: Although trains are safer by a factor of 50 (considering fatalities per passenger kilometer in Germany), many people still use their car [5]. Thus, one can assume that people will only use automated cars if they improve safety without losing too much comfort or utility.

In order to operate safely, an automated vehicle undoubtedly needs comprehensive, redundant perception. Further, it needs a motion planning module that is able to guarantee safety under some constraints concerning the perception accuracy and the behavior of other traffic participants. In order to also operate comfortably, the vehicle needs to analyze the situation and predict its future evolution. The latter is often provided by a module called situation prediction, which predicts the motion of other traffic participants and provides it to the motion planning module. In motion planning, the other traffic participants are then treated as obstacles that are to be avoided. This hierarchical design and the treatment as obstacles, however, do not account for the fact that the motion of the ego vehicle potentially affects the motion of other traffic participants (cf. Fig. 1).

(a) Scenario: Narrowing
(b) Classical Approach
(c) Cooperative Approach
Figure 1: Path-time-diagrams of the plan of the ego vehicle (black). Classical approaches treat other traffic participants as obstacles: Driving second (1) through the potential collision zone (grey) appears to be the solution with the least costs. The proposed approach treats other traffic participants as cooperative agents: Driving first (1) is globally optimal, also for different driver types (2), (3) and we have a comfortable fallback option (4).

In this work, we thoroughly review the problem statement of behavior generation and motion planning for automated vehicles. We pay special attention to the inevitable uncertainties that motion planning approaches have to deal with: the uncertainty in the perception and in the prediction of other traffic participants. With these considerations, as an extension to our previous work [6], we propose an integrated approach to generate comfortable, safe and comprehensible trajectories for automated vehicles in mixed traffic: We enhance our cost functional for trajectory ensembles, consisting of one trajectory per vehicle. Thus, we acknowledge the fact that not only the behavior of other traffic participants affects us, but also our behavior affects the others in a closed loop. The remainder of this paper is structured as follows: The problem statement and related work concerning optimization, uncertainty consideration and safety guarantees are presented in the following section. Subsequently, our approach is presented in Section III. It is evaluated in Section IV, using our simulation framework CoInCar-Sim [7].

II Problem Statement and Related Work

Presumably, the goal of every traffic participant is to travel to a certain destination in a convenient way. Having this in mind, we deduce the requirements considering optimality, uncertainty and safety. From these considerations, we derive guidelines for a convenient driving policy.

II-A Challenges

The task of finding an optimal plan from a start state to a desired goal state is called motion planning. The common approach to motion planning is the formulation of an optimization problem: A certain cost functional for state transitions is defined, along with some constraints. The plan with the lowest cost that does not violate any constraint is determined by the minimization of the cost functional. Analogously, a reward functional can be maximized.

Considering automated driving, the motion planning task is challenging due to several circumstances:

  1. cost formulation: comfort and perceived safety are difficult to express numerically

  2. uncertain perception: the perception of the status quo is subject to uncertainties

  3. uncertain prediction: the evolution of the scene is unknown and depends on the plans of other agents as well as on the ego plan

II-B Notation

In the remainder of this paper, we employ the following notation:

Considering the time $t$, we refer to

  • $t_0$ as the start time of the decision problem

  • $\Delta t$ as the time that actions (introduced below) are carried out in

  • $T$ as the planning horizon

  • $t_k = t_0 + k \, \Delta t$ as the (re)planning points in time

  • $N = T / \Delta t$ as the number of actions within the planning horizon.

Considering the decision problem itself, we refer to

  • $\mathbf{s}$ as the state vector, containing a finite number of state variables $s_i$ describing the current scene

  • $\mathbf{a}$ as the action vector, containing a finite number of actions that are performed and that influence the evolution of the state vector over time, from $t_k$ to $t_{k+1}$

  • $P(\mathbf{s}_{k+1} \mid \mathbf{s}_k, \mathbf{a}_k)$ as the transition probability, describing the probability that actions $\mathbf{a}_k$ at time $t_k$ in state $\mathbf{s}_k$ will lead to state $\mathbf{s}_{k+1}$ at time $t_{k+1}$. In case of deterministic transitions, the probability density function $P$ turns into an ordinary function $f$ with $\mathbf{s}_{k+1} = f(\mathbf{s}_k, \mathbf{a}_k)$. Either formulation relies on the Markov assumption that the effects of an action taken in a state depend only on that state and not on its prior history.

  • $c(\mathbf{s}_k, \mathbf{a}_k, \mathbf{s}_{k+1})$ as the cost function, describing the cost of a state transition from $\mathbf{s}_k$ to $\mathbf{s}_{k+1}$ taking action $\mathbf{a}_k$

In order to consider the problem as a classical Markov Decision Process (MDP), a single agent must be able to decide alone which action is taken. Actions performed by other traffic participants would not be modeled explicitly. In the following, we distinguish between the scalar $a$ for the action of one vehicle and the vector $\mathbf{a}$ for the actions of multiple vehicles. In case of a classical MDP, the actions of other traffic participants would be covered in the transition probability $P(\mathbf{s}_{k+1} \mid \mathbf{s}_k, a_k)$. The optimal policy $\pi^*$ can then be calculated for example via minimizing

  • the expected finite horizon costs

    $$J_{\mathrm{fh}} = \mathbb{E}\left[ \sum_{k=0}^{N-1} c(\mathbf{s}_k, a_k, \mathbf{s}_{k+1}) \right] \tag{1}$$

    where final costs can be added to account for the time not covered by the horizon,

  • or the expected cumulative discounted costs

    $$J_{\gamma} = \mathbb{E}\left[ \sum_{k=0}^{\infty} \gamma^k \, c(\mathbf{s}_k, a_k, \mathbf{s}_{k+1}) \right] \tag{2}$$

    with discount factor $\gamma \in [0, 1)$.
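
The two objectives listed above can be illustrated numerically. The following is a minimal sketch, assuming that the per-step costs of a single rollout are already available as a list; the function names and the idea of averaging several rollouts to approximate the expectation are illustrative assumptions, not part of the original formulation.

```python
from typing import Sequence


def finite_horizon_cost(step_costs: Sequence[float], final_cost: float = 0.0) -> float:
    """Sum of the step costs within the horizon plus an optional final cost."""
    return sum(step_costs) + final_cost


def discounted_cost(step_costs: Sequence[float], gamma: float) -> float:
    """Cumulative discounted cost of one (truncated) rollout."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return sum(gamma ** k * c for k, c in enumerate(step_costs))


def expected_cost(rollout_costs: Sequence[float]) -> float:
    """Monte Carlo estimate of the expectation: plain average over rollouts."""
    if not rollout_costs:
        raise ValueError("at least one rollout is required")
    return sum(rollout_costs) / len(rollout_costs)
```

The final cost argument plays the role of the term that accounts for the time not covered by the horizon; with the discounted variant, late steps contribute less and the sum stays bounded for bounded step costs.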

However, in road traffic, every traffic participant decides over its future state by its own action (assuming collision-free motion of every traffic participant). If we know the current state $\mathbf{s}_k$ including all objects and the environmental condition, and the actions of all objects $\mathbf{a}_k$, the transition to $\mathbf{s}_{k+1}$ can be considered deterministic, except for the environmental condition. The change of environmental conditions can be either neglected or considered as known throughout the decision problem, as the traffic participants do not influence it.

Thus, we model this problem as a Multi Agent Markov Decision Process (MMDP) with deterministic transitions: We choose a subset of the state vector that defines the state of the ego vehicle $\mathbf{s}^{\mathrm{ego}}$, one that defines the state of other traffic participants $\mathbf{s}^{\mathrm{obj}}$ and one that describes the environment $\mathbf{s}^{\mathrm{env}}$. Further, the action vector $\mathbf{a}$ contains one action $a^i$ per traffic participant $i$. However, we can only decide the action of the ego vehicle $a^{\mathrm{ego}}$.

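The sequence of actions $a^i_0, \dots, a^i_{N-1}$ of an object $i$ with initial state $\mathbf{s}^i_0$ can then be described by its trajectory $\xi^i(t)$, which is a mapping $\xi^i: \mathbb{R} \to \mathbb{R}^2$ and a flat output for a kinematic vehicle model, along with its time derivatives $\dot{\xi}^i(t), \ddot{\xi}^i(t), \dots$. For the sake of simplicity, we omit referring explicitly to the time derivatives in the following. The notation $\xi^i$ refers to the fully described trajectory:

$$\xi^i := \left( \xi^i(t), \dot{\xi}^i(t), \ddot{\xi}^i(t), \dots \right), \quad t \in [t_0, t_0 + T] \tag{3}$$

A trajectory ensemble $\Xi$ consists of one trajectory $\xi^i$ per traffic participant $i$. Consequently, a sequence of action vectors $\mathbf{a}_0, \dots, \mathbf{a}_{N-1}$ from initial states $\mathbf{s}_0$ can be described as such a trajectory ensemble $\Xi$.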

In contrast to the classical MDP formulation, this approach is not restricted to stationary policies of other agents. Further, it has the advantage that the action set of every traffic participant can be easily restricted or even discretized, not only the one of the ego vehicle. Instead of focusing on an estimate of the probability density function $P(\mathbf{s}_{k+1} \mid \mathbf{s}_k, a_k)$, we now focus on estimating the future actions of other traffic participants. Our solution to this MMDP is presented in the next section.

II-C Optimality

In the following, we review related approaches to this problem from an MDP perspective.

Fixed, Independent Prediction

Many motion planning approaches solve this optimization problem using the following assumption: Other traffic participants perform actions that can be predetermined. Consequently, given an object’s initial state, all future states, i.e. its whole future trajectory, can be calculated by an upstream prediction module. In other words, they assume policies for other agents that only depend on the object’s own state $\mathbf{s}^{\mathrm{obj}}$ and the environment $\mathbf{s}^{\mathrm{env}}$. This assumption, that the ego state has no influence on the actions and thus future states of other traffic participants, largely simplifies the optimization problem. The expected costs no longer depend on all states and all actions, but only on the ego state $\mathbf{s}^{\mathrm{ego}}$ and the sequence of ego actions $a^{\mathrm{ego}}_0, \dots, a^{\mathrm{ego}}_{N-1}$. Thus, the expected costs only depend on the ego trajectory $\xi^{\mathrm{ego}}$ (cf. eq. (3)). The costs of a planned trajectory, often denoted as the integral over a cost function $L$, can be expressed as:

$$J(\xi^{\mathrm{ego}}) = \int_{t_0}^{t_0 + T} L\left(\xi^{\mathrm{ego}}(t), t\right) \mathrm{d}t \tag{4}$$

which corresponds to eq. (1) with $\Delta t \to 0$, where $\mathbf{s}^{\mathrm{obj}}$ and $\mathbf{s}^{\mathrm{env}}$ are assumed to be known for all $t$ and impose constraints on $\xi^{\mathrm{ego}}$. $\mathbf{s}^{\mathrm{ego}}$ is given by $\xi^{\mathrm{ego}}$ and its time derivatives.

The optimal solution can thus be found via the minimization of this integral:

$$\xi^{\mathrm{ego}*} = \arg\min_{\xi^{\mathrm{ego}}} J(\xi^{\mathrm{ego}}) \tag{5}$$

This approach has for example been applied by Ziegler et al. and has proven to work for a large variety of scenarios [2], [8]. However, as described by the authors, the trajectories were rather defensive and sometimes close to those of human learners taking driving lessons. This behavior can be explained mainly by the impact of the unknown evolution of the scene that depends on the own action $a^{\mathrm{ego}}$, respectively the own trajectory $\xi^{\mathrm{ego}}$. Further, the uncertain perception of the status quo was dealt with by using safety margins.

Single Agent MDP

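Shalev-Shwartz et al. [4] explicitly model the problem as a single agent MDP. For the sake of comparability, we use a cost-based instead of a reward-based formulation. They define the estimated finite horizon costs (cf. eq. (1)), called $Q$, regarding the ego action $a$:

$$Q(\mathbf{s}, a) = \min_{a_1, \dots, a_{N-1}} \sum_{k=0}^{N-1} c(\mathbf{s}_k, a_k, \mathbf{s}_{k+1}) \quad \text{s.t.} \quad \mathbf{s}_0 = \mathbf{s}, \; a_0 = a, \; \mathbf{s}_{k+1} = f(\mathbf{s}_k, a_k) \tag{6}$$

assuming a known, deterministic function $f$ for the state transition.

The first approach would be to seek the optimal policy $\pi^*$ via minimizing $Q$:

$$\pi^*(\mathbf{s}) = \arg\min_{a} Q(\mathbf{s}, a) \tag{7}$$

As, however, the state space quickly explodes using geometric trajectories, they introduce semantic actions and combine them to so-called options or meta-actions. The quality of performing an option is approximated by constructing geometrical trajectories and calculating the finite horizon costs

$$\frac{1}{T} \sum_{k=0}^{N-1} c(\mathbf{s}_k, a_k, \mathbf{s}_{k+1}) \tag{8}$$

for every option, normalized by the planning horizon $T$.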

This approach, together with the uncertainty considerations from the following subsection, resolves the challenge of uncertain perception. The authors propose to solve the uncertain future challenge by replanning with a high frequency in order to cancel out modeling errors in the dynamics of other agents. While this might suffice for minor errors in dynamics modeling, different decisions of other agents, such as "who goes first when desired paths cross" cause completely different future states. The latter is not addressed by the authors.

Zhan et al. [9] also model the problem as a single agent MDP. They do not assume deterministic state transitions regarding the state of other traffic participants. Instead, they focus on binary decisions on complementary events. These result from two maneuver options of the other vehicle, for example whether to yield or not at an intersection. The probabilities are computed using a logistic regression model. For every option, the future trajectory of the ego vehicle is computed using the approach of Ziegler et al. [8]. The motion plan for either option has to be equal for a certain (short) time horizon as long as the probability does not equal zero for one of the options. This corresponds to postponing the decision until one option has probability zero. Given the probabilities, the expected finite horizon costs are then calculated using the mean of the costs for the two plans, weighted with their probability. The optimal plan is found via minimizing the expected finite horizon costs. While this approach is safe, assuming that probability zero equals physical infeasibility of the option, it also does not account for the fact that the ego motion affects the motion of other traffic participants.

Multi Agent MDP

The approaches of Schulz et al. [10] and Hubmann et al. [11] model the problem as a multi agent MDP. Both works motivate their approach by referring to the interdependence of the ego motion and the motion of other traffic participants.

The work of Schulz et al. focuses on the estimation of collective maneuvers, using the concept of trajectory homotopy to semantically distinguish them. After eliminating infeasible and unlikely behavior, they calculate optimal solutions for every maneuver, using a mixed-integer quadratic program (MIQP) to minimize a quadratic global cost function. As they focus on maneuver estimation, safety considerations are not included in their work. Also, the uncertainties in the sensed states are not considered in the cost calculation. Further, the quadratic cost function that is required by the MIQP solver is a strong restriction regarding the goal of modeling human behavior.

Hubmann et al. model the problem as a partially observable MDP (POMDP). They regard the unknown goal destinations of other traffic participants as hidden variables. The state transition model is deterministic for the ego vehicle and the other vehicles, while the action is determined using a fixed motion and interaction model for other traffic participants. For the ego vehicle, the action is determined solving the POMDP using a particle-based solver and Monte Carlo simulation. With this approach, the ego vehicle can implicitly decide to postpone a semantic decision and anticipate the behavior of other agents, as long as they follow the fixed motion and interaction model. Due to the high computational cost of solving POMDPs, however, the possible action space of the ego vehicle is very limited to keep it online-capable. Further, safety guarantees are not included in the approach.

II-D Uncertainty

Concerning the sensing uncertainty, we borrow the use of Valiant's probably approximately correct (PAC) terminology from Shalev-Shwartz et al. [4]: As the exact state $\mathbf{s}$ cannot be expected to be sensed, they define a sensing function $\hat{\mathbf{s}}(x)$ that returns an approximate sensing state from raw sensor data $x$. The quality of this sensing state is assessed regarding its implication on the ego action: In short, $\hat{\mathbf{s}}$ is said to be probably (w.p. of at least $1 - \delta$) $\epsilon$-accurate, if $Q(\mathbf{s}, \pi(\hat{\mathbf{s}}(x))) \le Q(\mathbf{s}, \pi(\mathbf{s})) + \epsilon$.

However, there is also an uncertainty in the evolution of the scene. While the latter is not necessarily directly safety relevant, it is largely affecting driving comfort. Simply consider a narrowing with two vehicles approaching. This scenario might easily lead to an unintended deadlock, if the intention of the other traffic participant is neglected.

Hence, we propose to explicitly model the strategy of other traffic participants. This approach can drastically reduce the uncertainty in the future states of other traffic participants. Furthermore, it allows us to get an estimate of the uncertainty per traffic participant, that we can deploy in our driving policy.

II-E Safety Guarantees

In public road traffic, a single agent cannot ensure absolute safety only by its own behavior [4]. Consider the following scenario from the cited work: On a highway, one vehicle is enclosed by 4 vehicles to its front, back, left and right. In that situation, the car can neither ensure that none of the other cars will crash into it, nor escape this situation. For this reason, Shalev-Shwartz et al. introduce a concept called Responsibility-Sensitive Safety (RSS). In short, this concept promises, that self-driving cars will never cause an accident. This approach is also followed by Althoff et al. [12], [13]: They calculate reachable areas for other traffic participants, yet excluding behavior that would lead to the sole responsibility of the other traffic participant in case of an accident.

The RSS concept promises to ensure safety in a meaningful way, while allowing normal flow of traffic, i.e. not being overcautious. By deploying it, we are able to guarantee that we will not cause a collision within the next one or two seconds [4], [12], [13]. However, the behavior of self-driving vehicles shall not only be safe, but also comfortable for the passengers and comprehensible for other traffic participants. Thus, also the perceived safety plays an important role.

As mentioned in our previous work, approaches that predict cooperative behavior of other traffic participants potentially decrease safety, as the prediction might be wrong. This risk is particularly high when predicting that others yield. Obviously, a cooperative approach should not lead to a more risky driving policy. On the other hand, absolute safety cannot be guaranteed in road traffic. Thus, we follow the notion of the previously introduced RSS [4]. We propose to back up the cooperative strategy with an analytic risk assessment in order to guarantee that we do not cause a collision: we always keep a plan B which can be adopted. In most cases, this plan B is performing a full stop or swerving and braking simultaneously. We calculate safety distances to other traffic participants and potentially occluded areas, taking our own computation time and others’ reaction time into account, as described by [4]. These distances are computed for every new state received from the perception. Once the actual distance falls below the required safety distance, we instantly perform the respective plan B. With this approach, we can guarantee not to cause accidents according to the principle of RSS.
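
The distance check that triggers plan B can be sketched as follows. This is a minimal illustration, assuming the safe longitudinal distance for two vehicles driving in the same direction as formulated in the RSS work [4]: the rear vehicle may accelerate during its response time and then brakes at least moderately, while the front vehicle brakes as hard as possible. The function names and all parameter values used with it are hypothetical; occlusions and lateral distances are not covered.

```python
def safe_longitudinal_distance(
    v_rear: float,
    v_front: float,
    response_time: float,
    a_max_accel: float,
    a_min_brake: float,
    a_max_brake: float,
) -> float:
    """Minimal gap (m) such that the rear vehicle cannot hit the front vehicle.

    Worst case assumed: the rear vehicle accelerates with a_max_accel during
    response_time and then brakes with at least a_min_brake, while the front
    vehicle brakes with a_max_brake. All accelerations are positive magnitudes.
    """
    v_rear_after_response = v_rear + response_time * a_max_accel
    distance = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time ** 2
        + v_rear_after_response ** 2 / (2.0 * a_min_brake)
        - v_front ** 2 / (2.0 * a_max_brake)
    )
    return max(distance, 0.0)


def must_execute_plan_b(actual_gap: float, required_gap: float) -> bool:
    """Plan B is triggered as soon as the actual gap falls below the required one."""
    return actual_gap < required_gap
```

In such a sketch, the response time would bundle the ego computation time and, for other traffic participants, their reaction time; the check would be repeated for every new state received from the perception.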

III Probabilistic Global Optimum Approach

The main focus of this paper is to facilitate the generation of comfortable, safe and comprehensible trajectories. Using the MMDP model, we propose the following perspective: The goal of the ego agent is to minimize the expectation of its costs for the complete problem via taking an action $a^{\mathrm{ego}}_k$ at time $t_k$:

$$a^{\mathrm{ego}*}_k = \arg\min_{a^{\mathrm{ego}}_k} \mathbb{E}\left[ \sum_{j=k}^{\infty} c(\mathbf{s}_j, \mathbf{a}_j, \mathbf{s}_{j+1}) \right] \tag{9}$$

With this model, it is obvious that the optimal action cannot be found without imposing further assumptions or simplifications. The problem might even be ill-posed if the goal cannot be reached within a finite time, such that the expected costs diverge. Thus, instead of strictly optimizing or applying machine learning to a very simplified problem, we propose to strive for an approximate solution of a problem that is closer to the actual one. Having said that, safety must of course be guaranteed. Its separate treatment using analytical methods can be motivated by the very high cost that collisions impose.

In the following, we shortly describe the basic idea behind the approach, before explaining our way of dealing with the uncertain perception and the uncertain prediction along with giving safety guarantees.

III-A Basic Idea

Firstly, the horizon of the planning problem is largely reduced by using a state of the art navigation approach to compute the approximate path to be traveled, similar to a human using a navigation device.

Within this narrowed planning horizon, as motivated previously, our main goal is to plan safe trajectories. Obviously, we will never put safety at risk in order to gain comfort. Following the goal of minimizing our expected costs (cf. eq. (9)), our approach is to drive comfortably most of the time while accepting rare uncomfortable maneuvers. The latter might for example be a response to very unlikely behavior of other traffic participants. Safety is always guaranteed by following the RSS concept (cf. Section II-E), assuming a PAC sensing system (cf. Section II-D).

The basic idea behind our driving policy is to behave comprehensibly and thus to enable cooperation with other traffic participants. By doing so, we are able to anticipate the actions of others, and thus the evolution of the scene, with less uncertainty. In other words, we have less discrepancy between the estimated future states $\hat{\mathbf{s}}_{k+1}$ and the actual future states $\mathbf{s}_{k+1}$. As in game theory, optimizing jointly and behaving predictably can unveil solutions that are globally optimal. Hence, by and large, these solutions are better for all traffic participants.

III-B Refined Problem Statement

To achieve this, we explicitly model the strategy of other traffic participants. They are no longer treated as obstacles, but as other agents that one can cooperate with. In order to model the strategy of others, we assume that they also seek a comfortable and safe trajectory. Consequently, we investigate their costs for potential future states. Even though we can only directly control our ego vehicle, we seek a trajectory ensemble close to the global optimum.

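As presented in our previous work, the cost functional for this globally optimal (go) finite horizon approach is:

$$J_{\mathrm{go}}(\Xi) = \sum_{i} \left( J_{\mathrm{single}}(\xi^i) + \sum_{j \neq i} J_{\mathrm{pair}}(\xi^i, \xi^j) \right)$$

with singleton trajectory costs $J_{\mathrm{single}}(\xi^i)$ for vehicle $i$ and pairwise trajectory costs $J_{\mathrm{pair}}(\xi^i, \xi^j)$ for vehicle $i$ due to vehicle $j$.

The optimal trajectory ensemble $\Xi^*$ can then be found via minimizing $J_{\mathrm{go}}$. As stated previously, we propose to strive for an approximate solution of this highly non-convex problem and therefore apply sampling. The best solution out of $M$ samples $\Xi_1, \dots, \Xi_M$ is simply $\Xi_{m^*}$ such that

$$m^* = \arg\min_{m \in \{1, \dots, M\}} J_{\mathrm{go}}(\Xi_m)$$
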
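The sampling step can be sketched in a few lines. The following is a minimal illustration, assuming that each vehicle contributes a small set of candidate trajectories and that the ensemble cost is the sum of singleton costs plus pairwise costs between distinct vehicles; the function names, the exhaustive combination of candidates and the scalar toy trajectories in the usage note are hypothetical.

```python
from itertools import product
from typing import Callable, Sequence, Tuple, TypeVar

Trajectory = TypeVar("Trajectory")


def ensemble_cost(
    ensemble: Sequence[Trajectory],
    single_cost: Callable[[Trajectory], float],
    pair_cost: Callable[[Trajectory, Trajectory], float],
) -> float:
    """Singleton costs of every vehicle plus pairwise costs due to every other vehicle."""
    total = 0.0
    for i, xi_i in enumerate(ensemble):
        total += single_cost(xi_i)
        for j, xi_j in enumerate(ensemble):
            if j != i:
                total += pair_cost(xi_i, xi_j)
    return total


def best_ensemble(
    candidates_per_vehicle: Sequence[Sequence[Trajectory]],
    single_cost: Callable[[Trajectory], float],
    pair_cost: Callable[[Trajectory, Trajectory], float],
) -> Tuple[Trajectory, ...]:
    """Evaluate every combination of candidates and return the cheapest ensemble."""
    ensembles = product(*candidates_per_vehicle)
    return min(ensembles, key=lambda e: ensemble_cost(e, single_cost, pair_cost))
```

As a toy usage, a trajectory can be reduced to the time at which a vehicle enters the potential collision zone: with candidates `[[0.0, 2.0], [0.0, 2.0]]`, a singleton cost equal to the delay and a pairwise cost of `10.0` whenever two entry times are less than one second apart, the cheapest ensemble lets one vehicle go first and the other one wait. Exhaustive enumeration grows exponentially with the number of vehicles, so this only suits small scenes.
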
In this approach, we are also faced with uncertainties regarding the accuracy of our model: The sensed state $\mathbf{s}_0$, building the start of the future trajectories (cf. eq. (3)), is subject to uncertainties. Further, the action sequence of all traffic participants but the ego vehicle is only an expectation. If a traffic participant deviates from our globally optimal plan, obvious and possible explanations are that (a) they do not act according to a (stationary) policy $\pi$, (b) we estimated their costs or destination wrongly, or (c) they estimated our costs wrongly. For the sake of simplicity, the desired paths, i.e. the desired destinations of the traffic participants, are assumed to be known for now. This assumption is relaxed later on.

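Consequently, the optimal costs are only reached with a certain probability, while violations of the previous assumptions can cause higher costs. The refined, probabilistic (prob) cost expectation is:

$$J_{\mathrm{prob}} = P_{\mathrm{go}} \, J_{\mathrm{go}}(\Xi^*) + \left(1 - P_{\mathrm{go}}\right) J_{\neg\mathrm{go}} \tag{10}$$

containing two unknowns: the probability $P_{\mathrm{go}}$ that the globally optimal plan is followed and the expected costs $J_{\neg\mathrm{go}}$ in case the latter is not followed.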
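
A small numeric sketch may help to see how the two unknowns interact. It assumes that the expectation takes the usual two-outcome form: the costs of the globally optimal plan weighted with the probability that it is followed, plus the costs of the alternative weighted with the complementary probability. The function name and the numbers in the usage note are hypothetical.

```python
def probabilistic_cost(p_followed: float, cost_optimal: float, cost_otherwise: float) -> float:
    """Expected cost when the optimal plan is followed with probability p_followed."""
    if not 0.0 <= p_followed <= 1.0:
        raise ValueError("p_followed must be a probability")
    return p_followed * cost_optimal + (1.0 - p_followed) * cost_otherwise
```

For example, `probabilistic_cost(0.5, 2.0, 10.0)` evaluates to `6.0`: even a cheap optimal plan becomes unattractive once the alternative is expensive and the trust in the plan is only moderate.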

As neither humans nor this approach strive for strict optimality, the probability $P_{\mathrm{go}}$ is not the probability that the globally optimal plan $\Xi^*$ is exactly followed. Instead, it is the probability that any plan that is close to $\Xi^*$ is pursued. In order to define the term "close to" and to check how much the uncertainties in perception and prediction affect the probability $P_{\mathrm{go}}$, we additionally consider the following marginal cases: Every traffic participant can be a dynamic driver or a defensive driver, i.e. the cost parametrization is chosen accordingly. We check whether all combinations of the cost parametrizations yield the optimum in the same homotopy class, employing the concept of trajectory homotopy of [10] and [14]. An example of different homotopy classes can be seen in Figure 1(b), and one of equal homotopy classes in Figure 1(c). Further, the marginal cases of the PAC sensing system are included here in order to cover the worst combination of uncertain perception and uncertain prediction.
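
The marginal-case check can be sketched as an enumeration over driver types. The following is a minimal illustration, assuming a hypothetical callback `solve` that takes one driver type per traffic participant and returns a label for the homotopy class of the resulting optimum (for instance which vehicle passes the potential collision zone first); neither the callback nor the labels are part of the original work, and the marginal cases of the sensing system are left out.

```python
from itertools import product
from typing import Callable, Hashable, Set, Tuple

DRIVER_TYPES = ("dynamic", "defensive")


def marginal_homotopy_classes(
    n_participants: int,
    solve: Callable[[Tuple[str, ...]], Hashable],
) -> Set[Hashable]:
    """Homotopy classes of the optima for every assignment of driver types."""
    assignments = product(DRIVER_TYPES, repeat=n_participants)
    return {solve(assignment) for assignment in assignments}


def is_unambiguous(
    n_participants: int,
    solve: Callable[[Tuple[str, ...]], Hashable],
) -> bool:
    """True if all marginal cases agree on a single homotopy class."""
    return len(marginal_homotopy_classes(n_participants, solve)) == 1
```

With two participants, four assignments are evaluated; the number of optimizations doubles with every additional participant, which is one reason to restrict the check to the vehicles that share a potential collision zone.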

If the marginal cases yield optima in different homotopy classes, the optimal solution of the situation is ambiguous and thus, the probability $P_{\mathrm{go}}$ of our globally near-optimal plan is lowered. One way of dealing with this situation would be to force a solution in a certain homotopy class, by assuming that others will certainly try to avoid a collision, in the hope of increasing $P_{\mathrm{go}}$. However, the consequences of such behavior are hard to assess. Rather, we consider the second unknown of eq. (10), the expected costs of the non-optimal plan $J_{\neg\mathrm{go}}$, which becomes increasingly important. Instead of trying to force a behavior, we check the costs for a more defensive maneuver, such as refraining from entering a potential collision zone, and consider these as $J_{\neg\mathrm{go}}$.

If the costs $J_{\neg\mathrm{go}}$ are very low, for example because the potential collision zone is far away, we pursue $\Xi^*$ and do not act overcautiously. If, however, the costs $J_{\neg\mathrm{go}}$ are high, while the probability $1 - P_{\mathrm{go}}$ that the plan is not followed is also high, we no longer trust that all participants consider a solution in this homotopy class as the globally optimal plan $\Xi^*$. Hence, we change to the more defensive maneuver. When the homotopy class changes, we try to find the globally optimal solution again and build trust in this solution. Eventually, we will build trust in the new homotopy class, such that we switch back to $\Xi^*$ again, or the situation will be solved defensively, i.e. with less risk, e.g. because we drive slowly and $J_{\neg\mathrm{go}}$ is low.

Given an initial globally optimal plan $\Xi^*$ with one distinct homotopy class, $P_{\mathrm{go}}$ is set to a high value. From then on, we check on every state update whether the other traffic participants’ behavior lies within the expected range. Note that this is a violation of the Markov assumption. However, one could introduce additional state variables describing the trust or the non-compliance of an agent with our assumption. If we detect a violation of this explicable behavior, this also increases the uncertainty in our estimation of others’ driving strategy and thus in the estimates of their future states. This behavior might for example be due to a maneuver that was not anticipated, such as parking along the road or avoiding an unforeseen obstacle, or due to a very dynamic driving policy that exceeds the scope of our considerations. It can be detected through a rise in the costs of the globally optimal plan, through a violation of the state limits determined by the marginal cases, or through a shift of the time at which traffic participants enter and leave a potential collision zone. In case the deviation questions our homotopy choice, e.g. an agent that is supposed to go second accelerates, we decrease $P_{\mathrm{go}}$.

III-C Intention Consideration

At intersections, agents mostly have the choice between several routes. Neglecting emergency situations, this decision can be considered independent of the behavior of other agents, as it is made in an upstream navigation layer, also for human drivers [15]. Thus, this route estimation can also be made by an upstream module, as presented by Petrich et al. [16]. As input to the planner, we receive different route combinations $r$ for the ensemble of traffic participants along with their probabilities $P(r)$. With this input, we can relax the assumption of knowing the desired destination of the traffic participants, considering the cost expectation

$$J = \sum_{r} P(r) \, J_{\mathrm{prob}}(r) \tag{11}$$

with $J_{\mathrm{prob}}$ from eq. (10) and $\sum_{r} P(r) = 1$. Note that the ego trajectory must be identical for all $r$ up to the output time of the subsequent planning step. With this condition, analogous to Hubmann et al. [11], we are able to postpone decisions as long as multiple route hypotheses are evident. Similar to human drivers, instead of over-cautiously reacting to all possible predictions, we act according to the likely predictions and perform rather sharp maneuvers when we encounter an unlikely action.
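
The treatment of route hypotheses can be sketched as a weighted sum plus a consistency check on the ego plans. This is a minimal illustration, assuming that every route combination comes with a probability and an already computed cost expectation, and that ego plans are stored as lists of discrete actions; the function names and the data layout are hypothetical.

```python
from typing import Sequence, Tuple


def expected_cost_over_routes(hypotheses: Sequence[Tuple[float, float]]) -> float:
    """Weighted sum over (probability, cost) pairs, one pair per route combination."""
    total_probability = sum(p for p, _ in hypotheses)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("route probabilities must sum to one")
    return sum(p * cost for p, cost in hypotheses)


def shares_common_prefix(ego_plans: Sequence[Sequence[float]], n_common: int) -> bool:
    """True if all ego plans agree on their first n_common actions.

    This mirrors the condition that the ego trajectory must be identical for
    all route hypotheses until the next planning step delivers its output.
    """
    first = list(ego_plans[0][:n_common])
    return all(list(plan[:n_common]) == first for plan in ego_plans)
```

Keeping the first actions identical is what allows a decision to be postponed: the plans may branch later, once the observed behavior has made one route hypothesis unlikely.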

III-D Safety Consideration

As the previously introduced approach of solving a complex MMDP cannot give guarantees regarding safety or even finding a feasible solution, the approach needs to be backed up. Thus, we apply the RSS concept of analytically computing safety margins regarding the physically feasible and lawful motion of all other agents. As soon as we observe a violation of those safety margins, we react with the appropriate response (cf. [4]), which in most cases is a full deceleration, until the safety margin is satisfied again. This computationally cheap check can be done with a very high frequency, which allows us to use a small ego reaction time. The latter results in a smaller safety distance and thus leaves a larger scope for cooperative maneuvers.

IV Results and Evaluation

In this section, the approach is evaluated for two scenarios, passing an intersection and passing through a narrowing of the road (cf. Figures 2 and 4). The narrowing shows the benefits of the approach regarding scenarios where the right of way is not predefined. At the intersection, the benefits of the intention consideration are shown.

In order to evaluate our approach, we implemented it in our ROS based simulation framework CoInCar-Sim [7] using Python. Many cooperative scenarios, including the previously mentioned, have highly constrained driving corridors. Thus, we applied path-velocity-decomposition, as introduced by [17]. The paths were generated from a polygon depicting the centerline of the lane. Further, we used a naive jerk sampling approach to generate multiple trajectory candidates per traffic participant. In order to do so, we discretized the path in space () and determined potential collision zones. Jerk samples that violate the kinematic restrictions are replaced by the respective marginal jerk, thus implying a bias towards marginal cases. We assumed a planning deadtime of and chose accordingly. Thus, an action planned at time assumes its motion in to be known and is effective in the interval .
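
The naive jerk sampling with replacement by the marginal jerk can be sketched as follows. This is a minimal illustration, assuming piecewise constant jerk along a fixed path and acceleration bounds as the only kinematic restriction; the function names and the bound values in the usage note are hypothetical, and further restrictions such as non-negative velocity are omitted.

```python
from typing import List, Sequence, Tuple


def clamp_jerk(jerk: float, accel: float, dt: float, a_min: float, a_max: float) -> float:
    """Replace a jerk sample that would violate the acceleration bounds.

    The replacement is the marginal jerk that exactly reaches the violated bound.
    """
    accel_next = accel + jerk * dt
    if accel_next > a_max:
        return (a_max - accel) / dt
    if accel_next < a_min:
        return (a_min - accel) / dt
    return jerk


def rollout(
    s0: float, v0: float, a0: float,
    jerks: Sequence[float], dt: float, a_min: float, a_max: float,
) -> List[Tuple[float, float, float]]:
    """Integrate arc length, velocity and acceleration for a jerk sequence."""
    s, v, a = s0, v0, a0
    states = [(s, v, a)]
    for jerk in jerks:
        j = clamp_jerk(jerk, a, dt, a_min, a_max)
        # Closed-form integration for constant jerk over one step.
        s += v * dt + 0.5 * a * dt ** 2 + j * dt ** 3 / 6.0
        v += a * dt + 0.5 * j * dt ** 2
        a += j * dt
        states.append((s, v, a))
    return states
```

Because every infeasible sample collapses onto the same marginal value, candidates at the acceleration limits appear more often than others, which is the bias towards marginal cases mentioned above.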

We chose the planning horizon dynamically, such that the potential collision zone is passed by both traffic participants, but bounded by . For the cost computation, we consider the velocity, the normal and the tangential acceleration. For the perceived safety, we consider the time of zone clearance, i.e. the time that elapses between the first vehicle leaving the potential collision zone and the second vehicle entering this area. A more detailed overview is given in our previous work [6]. Instead of introducing end costs for unsafe final states with , we discarded those samples. In order to prove safety according to the RSS concept, we compute the safety margins analytically with a frequency of . We consider the following cases: If we can stop before entering the potential collision zone, our state is considered safe. If we cannot stop before entering the potential collision zone, but the safe longitudinal distance (Def. 2 of [4]) is satisfied, our state is considered safe. If we leave the collision zone according to our pursued plan before the other vehicle is able to enter it, considering its physical limits, our state is also considered safe.
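
The three cases of the safety check can be sketched as a single predicate. This is a minimal illustration, assuming constant deceleration for the stopping distance and neglecting the reaction time; the function names are hypothetical, and the result of the longitudinal distance check as well as the zone entry and exit times are taken as precomputed inputs.

```python
def stopping_distance(velocity: float, a_brake: float) -> float:
    """Distance needed to stop from velocity with constant deceleration a_brake > 0."""
    return velocity ** 2 / (2.0 * a_brake)


def state_is_safe(
    distance_to_zone: float,
    v_ego: float,
    a_brake_ego: float,
    safe_longitudinal_distance_ok: bool,
    t_ego_leaves_zone: float,
    t_other_earliest_entry: float,
) -> bool:
    """Apply the three cases in the order given in the text."""
    # Case 1: the ego vehicle can stop before the potential collision zone.
    if stopping_distance(v_ego, a_brake_ego) <= distance_to_zone:
        return True
    # Case 2: stopping is impossible, but the safe longitudinal distance holds.
    if safe_longitudinal_distance_ok:
        return True
    # Case 3: the ego vehicle has left the zone before the other vehicle can,
    # within its physical limits, enter it.
    return t_ego_leaves_zone < t_other_earliest_entry
```

If the predicate returns `False`, the fallback maneuver would be triggered immediately; since the check involves only a few arithmetic operations, it can run at a much higher rate than the sampling-based planner.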

The probability is initialized to be . At the beginning of each subsequent planning step, we check whether the other traffic participant behaved within the expected boundaries. This is done by checking whether the total costs rise by more than a factor of . If yes, is lowered by , if not, is lowered by . If the defensive maneuver after the subsequent planning step would impose high costs (in our case, the costs of a deceleration of more than ), we only continue the plan if its probability is high: . Otherwise, we change to the defensive maneuver.
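The bookkeeping described above can be sketched as follows. All numeric values and names here are placeholders and not the values used in the experiments. The adjustment is passed as a signed step, so the sketch does not assume a particular direction of the update, and the comparison operators are one possible reading of the text.

```python
def update_probability(p, cost_previous, cost_now, factor,
                       step_if_violated, step_otherwise):
    """Adjust the plan probability after observing the other traffic participant.

    The behavior counts as unexpected if the total costs rose by more than
    `factor`. Both steps are signed and hypothetical; the result is clipped to [0, 1].
    """
    violated = cost_now > factor * cost_previous
    step = step_if_violated if violated else step_otherwise
    return min(1.0, max(0.0, p + step))


def choose_plan(p, defensive_cost_later, high_cost, p_min):
    """Continue the cooperative plan unless bailing out later would be costly
    and the plan's probability is below the threshold p_min."""
    if defensive_cost_later > high_cost and p < p_min:
        return "defensive"
    return "cooperative"


if __name__ == "__main__":
    # Placeholder numbers, chosen only to exercise both branches.
    p = update_probability(0.5, cost_previous=10.0, cost_now=12.0, factor=1.1,
                           step_if_violated=-0.25, step_otherwise=0.125)
    print(p, choose_plan(p, defensive_cost_later=5.0, high_cost=3.0, p_min=0.6))
```

In this reading, a cheap fallback lets the vehicle keep pursuing the cooperative plan even at low probability, since switching to the defensive maneuver one step later remains affordable.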

Figure 2: Narrowing, without signposted right of way and with one predefined path per agent.
Figure 3: Solutions to the narrowing scenario (Fig. 2): Classical approaches that mutually predict the other traffic participant with constant velocity result in a deadlock (1) by stopping in front of the potential collision zone (grey). With the proposed approach the vehicles mutually include their behavior and thus act globally optimal (2). In (3) the right vehicle is driving ignorantly. The left vehicle, running our approach, detects this and reacts early by yielding.

IV-A Narrowing

At the narrowing, we consider three cases: two automated vehicles using a classical approach with a constant velocity prediction, as in [8]; two automated vehicles using our approach; and one vehicle using our approach together with an ignorant driver that does not consider us. The driven trajectories are visualized in Figure 3. Even though the situation seems obvious to humans, classical approaches lead to a deadlock, as they neglect the mutual influence of the vehicles on each other. Their optimization problem yields two local optima, neither of which is close to the actual global optimum. With the proposed approach, however, the mutual influence is considered. Thus, for this unambiguous situation, a global optimum is found and pursued by both vehicles. As claimed, this behavior is comfortable, also with respect to perceived safety, and comprehensible for other traffic participants. Further, it is provably safe. If we detect violations of our assumptions concerning the behavior of others, increases and we choose a more defensive plan. Hence, we even react comfortably to an ignorant driver, while other approaches would have to perform sharp maneuvers if their assumptions are violated.

IV-B Intersection

At the intersection, the route of the other vehicle is not known but estimated by an upstream module. The trajectories driven for different probabilities of the two routes are visualized in Figure 5. We are able to reproduce the key result of Zhan et al. [9] and Hubmann et al. [11]: We implicitly postpone a decision, while acting in order to minimize the expected costs. Still, we consider the mutual influence and obey the traffic rules without defining homotopy constraints.
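Acting to minimize the expected costs under an uncertain route can be illustrated with a small sketch. It is not the planner used here: the candidate names, cost values and route labels are hypothetical, and the costs per candidate and route are assumed to be given.

```python
def expected_cost(costs_per_route, route_probabilities):
    """Expected cost of one ego candidate over the route hypotheses."""
    return sum(route_probabilities[route] * cost
               for route, cost in costs_per_route.items())


def best_candidate(candidate_costs, route_probabilities):
    """Return the candidate with minimal expected cost."""
    return min(candidate_costs,
               key=lambda name: expected_cost(candidate_costs[name],
                                              route_probabilities))


if __name__ == "__main__":
    # Hypothetical costs of three ego candidates under both route hypotheses.
    candidate_costs = {
        "go":    {"straight": 10.0, "right": 1.0},
        "coast": {"straight": 4.0,  "right": 2.0},
        "yield": {"straight": 3.0,  "right": 3.0},
    }
    print(best_candidate(candidate_costs, {"straight": 0.1, "right": 0.9}))
    print(best_candidate(candidate_costs, {"straight": 0.9, "right": 0.1}))
```

Re-evaluating this choice at every planning step with updated route probabilities is what allows the decision to be postponed implicitly: no maneuver class has to be fixed in advance.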

Figure 4: Intersection: The other traffic participant (black) has the right of way and can drive straight on path or turn right on path . The path of the ego vehicle (blue) is known .
Figure 5: Solutions to the intersection scenario (Fig. 4): In both cases, the other vehicle drives straight on (potential collision zone in grey). In case (1), the planning module was fed the (wrong) information . Thus, it reacts "surprised" and brakes late. In case (2) it was fed . Thus, it pursues the globally optimal plan for from the beginning.

V Conclusions and Future Work

In this paper, we reviewed the problem statement of behavior generation and motion planning for automated vehicles. We propose to model the problem as an MMDP with deterministic state transitions. This model allows the prediction of other traffic participants to be incorporated in an integrated approach. Further, the consequences of wrong assumptions concerning the behavior of other traffic participants are explicitly considered. Estimates from upstream modules, such as a route prediction for other traffic participants, are also used. The resulting behavior of a vehicle following the proposed approach is comfortable, safe and comprehensible. Scenarios that are obvious to humans are solved in a human-like manner: for example, if one vehicle is closer to a narrowing than the other, it drives first. If we are confident that a vehicle that has priority will not intersect our path, e.g. because it turns right in front of us, we drive as if we did not have to give way, but still keep this option open and react more harshly in case our assumption was wrong. Traffic rules such as the right of way are modeled by regarding the time of zone clearance, instead of explicitly excluding certain homotopy classes.

The authors intend to pursue the approach further: In order to be able to stick to the cooperative plan as often as possible, an important next step is to improve our model of comfort and perceived safety in the cost functional. Further, we intend to investigate different directed sampling methods and port the algorithm to our probe vehicle "Bertha" in order to test the approach in real mixed traffic. In order to investigate the probability of the optimal plan and the expected costs of the suboptimal plan more profoundly, simulator studies with human drivers can be conducted.


We gratefully acknowledge support of this work by the Tech Center a-drive and by the Deutsche Forschungsgemeinschaft (German Research Foundation) within the Priority Programme “SPP 1835 Cooperative Interacting Automobiles”.


  • [1] K. Bengler, K. Dietmayer, B. Färber, M. Maurer, C. Stiller, and H. Winner, “Three Decades of Driver Assistance Systems - Review and Future Perspectives,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 4, pp. 6–22, 2014.
  • [2] J. Ziegler, P. Bender, M. Schreiber, H. Lategahn, T. Strauss, C. Stiller et al., “Making Bertha Drive - An Autonomous Journey on a Historic Route,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 2, pp. 8–20, 2014.
  • [3] SAE, “Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles - J3016_201609,” 2016.
  • [4] S. Shalev-Shwartz, S. Shammah, and A. Shashua, “On a formal model of safe and scalable self-driving cars,” arXiv preprint arXiv:1708.06374, 2017. [Online]. Available:
  • [5] A. Hütter, Verkehr auf einen Blick. Federal Statistical Office Germany, 2013.
  • [6] M. Naumann and C. Stiller, “Towards Cooperative Motion Planning for Automated Vehicles in Mixed Traffic,” in IEEE/RSJ Intl. Conf. Intelligent Robots and Systems Workshops, Sep 2017. [Online]. Available:
  • [7] M. Naumann, F. Poggenhans, M. Lauer, and C. Stiller, “CoInCar-Sim: An Open-Source Simulation Framework for Cooperatively Interacting Automobiles,” in IEEE Intl. Conf. Intelligent Vehicles, Jun 2018, (to appear).
  • [8] J. Ziegler, P. Bender, T. Dang, and C. Stiller, “Trajectory planning for Bertha - A local, continuous method,” in IEEE Intell. Vehicles Symposium Proc., June 2014, pp. 450–457.
  • [9] W. Zhan, C. Liu, C. Y. Chan, and M. Tomizuka, “A non-conservatively defensive strategy for urban autonomous driving,” in 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Nov 2016, pp. 459–464.
  • [10] J. Schulz, K. Hirsenkorn, J. Löchner, M. Werling, and D. Burschka, “Estimation of collective maneuvers through cooperative multi-agent planning,” in 2017 IEEE Intelligent Vehicles Symposium (IV), June 2017, pp. 624–631.
  • [11] C. Hubmann, J. Schulz, M. Becker, D. Althoff, and C. Stiller, “Automated driving in uncertain environments: Planning with interaction and uncertain maneuver prediction,” IEEE Transactions on Intelligent Vehicles, vol. 3, no. 1, pp. 5–17, March 2018.
  • [12] M. Althoff and J. M. Dolan, “Online verification of automated road vehicles using reachability analysis,” IEEE Transactions on Robotics, vol. 30, no. 4, pp. 903–918, Aug 2014.
  • [13] M. Koschi and M. Althoff, “Spot: A tool for set-based prediction of traffic participants,” in Proc. of the IEEE Intelligent Vehicles Symposium, 2017, pp. 1686–1693.
  • [14] P. Bender, Ö. Ş. Taş, J. Ziegler, and C. Stiller, “The combinatorial aspect of motion planning: Maneuver variants in structured environments,” in IEEE Intell. Vehicles Symposium (IV), June 2015, pp. 1386–1392.
  • [15] E. Donges, “A conceptual framework for active safety in road traffic,” Vehicle System Dynamics, vol. 32, no. 2-3, pp. 113–128, 1999.
  • [16] D. Petrich, T. Dang, G. Breuel, and C. Stiller, “Assessing map-based maneuver hypotheses using probabilistic methods and evidence theory,” in 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), Oct 2014, pp. 995–1002.
  • [17] K. Kant and S. W. Zucker, “Toward efficient trajectory planning: The path-velocity decomposition,” Int. J. Rob. Res., vol. 5, no. 3, pp. 72–89, 1986.