I Introduction and Motivation
The number of vehicles on the road is constantly increasing. According to Sousanis [sousanis2011world], the billion units mark was passed in 2010, and, with the current growth rate, the tendency appears to remain the same for years to come. With an increasing number of vehicles on the road, there is a rising interest and necessity of analyzing and planning trajectories in complex multi-modal traffic scenarios. One goal for trajectory planning is the implementation of vehicle active safety systems for collision avoidance and mitigation [muller2016statistical]. Globally, there are more than 1.2 million traffic-accident related fatalities per year [world2015global]. Another goal is to enable fall-back or self-supervising systems for autonomous vehicles.
In this paper, we define “complexity” as the combination of the following factors: an undetermined number of moving objects to be considered, a large and undefined number of available options or “trajectories” that these objects can follow, and an extensive variety of roads on which vehicles can drive. Because of this complexity, the amount of information that has to be acquired, processed, and assessed for autonomous driving and vehicle active safety systems is correspondingly large.
Machine learning methods are one possible solution. A major advantage of these methods is the ability to analyze huge amounts of data in short time periods. However, they also pose new challenges. First, a comprehensive dataset is required in order to train machine learning models. Compiling a database that is diverse and complete enough to cover each and every open street scenario is impractical because of the time and resources it would cost. Current work in this area focuses only on a limited set of scenarios [bojarski2016end]. Second, data labeling can be problematic if a supervised machine learning method is chosen. Automated processes might not have the required classification accuracy for automotive safety systems, and manual processes are very costly and time consuming. Third, most machine learning models are not interpretable. Rapidly evolving applications like cyber security profit greatly from methods like deep learning [McAfee]. Yet, the ethical and legal implications of false-positive actuations make machine learning methods inadequate for safety-critical applications. Research on algorithm validation for automated driving functions focuses on a limited set of scenarios, too.
Considering the previous points, we favor a model-based approach. Here, the biggest challenge is the processing of complex and costly computations, which results in long execution times. However, we will show that we can obtain execution times comparable to machine learning methods, but with deterministic outputs. Since all objects (vehicles and pedestrians) are mobile, no information from previous simulations can be reused: each traffic scenario has to be simulated from the ground up every time. Even when the objects can be tracked over time, the trajectories that they are able to follow change over time, altering the final outcome of the current traffic situation. Considering this and the runtime constraints of vehicle safety applications, it is imperative to design the algorithm for parallel processing.
In this paper, we introduce an approach that deals with the task of traffic analysis for criticality estimation and motion planning in a fully model-based fashion. As a result, a large number of possible trajectories for all traffic participants have to be computed and assessed. For this work, we define “criticality” as the probability that the current traffic situation will lead to a collision for the own (EGO-) vehicle. The major advantage of such an approach is that it can be easily tested, modified, and validated in simulation, all of which are key requirements for vehicle active safety algorithms. The challenge of this approach is the high computational cost of predicting trajectories in multi-object scenarios. However, we will show that such an approach, when executed in parallel, can run within tens of milliseconds on embedded hardware that is available for modern cars.
This paper makes the following contributions:
II Related Work
Trajectory planning and risk assessment are topics of continuous research in the automotive area. They overlap, but differ in their main task. The first one focuses on finding a route from A to B that fulfills certain criteria, which can include collision avoidance, but tends to neglect certain possible trajectories for the sake of efficiency. The second one focuses on estimating the criticality of a traffic situation, for which the future states of the objects are planned and/or predicted.
The work at hand is based on existing basic ideas like trajectory generation and collision recognition. It addresses key critical points of existing approaches that a) have been avoided by other authors because of their complexity or computational cost, b) have not been considered in combination until now, and c) noticeably impact the final criticality estimation. The algorithm also differentiates itself by its flexible, highly parallelizable architecture, which is a base requirement for GPU-parallel mapping. These critical points are addressed in the following by comparing the present work with related publications.
Broadhurst et al. [broadhurst2005monte] present a method for reasoning about how vehicles move in the presence of other traffic participants. The authors neglect several relevant aspects of the trajectory generation, such as mechanical latencies and limits. A purely geometrical model is used for describing vehicle motion, which generates non-driveable trajectories. This and the use of constant steering instead of a steering controller greatly shift the scenario criticality, as shown in Figure 7. The use of random components also complicates the algorithm validation for use in vehicle safety functions.
The work of Broadhurst et al. [broadhurst2005monte] is a fitting example of why it is necessary to approximate conditional probabilities, as we will show in Section IV-D2. As stated by Broadhurst et al. in Section II-F, the collision probability is calculated sequentially, which is time consuming. Their algorithm applied to a single object operates at Hz, while the algorithm presented in this work processes trajectory combinations of 4 objects (EGO + 3 CO) in the same time period.
Lefèvre et al. [lefevre2014survey] present a comprehensive survey on existing criticality estimation methods. They perform semantic classification and show trade-offs of the different classes. The results of this paper indicate that by using a smart, parallelizable algorithm structure, it is possible to obtain criticality estimations with a quality similar to that of interaction-aware motion models (Section IV-D), but using maneuver-based models (Section IV-B), which opens the door for real-time risk assessment.
Ziegler et al. [ziegler2014trajectory] and Ferguson et al. [ferguson2008motion] present real-world tested motion planning algorithms. Both rely on representing the motion planning as a constrained optimization task, where the trajectories are represented by cost functions that include a “no collision” constraint. The trajectory given by the optimized cost function is driven, even if it is the only collision-free trajectory. This implies that errors in the sensor information or actuator control, unexpected CO behavior, modeling inaccuracies, or wrong assumptions could lead to collisions. In very dense traffic situations, unnecessary risks might therefore be taken, since only a small subset of the physically feasible trajectories is considered. That a trajectory is unlikely to happen does not mean that it cannot happen.
Note that the goal of the present algorithm is not to completely substitute current path planning or collision recognition approaches. Redundancy is a best practice where humans can be harmed due to system malfunction. As stated in Section I, the present algorithm can serve as a fall-back, supervising or plausibility system for both autonomous and human-driven vehicles.
III Multi-Hypotheses Approach
For estimating the criticality of a traffic scenario, an algorithm that can model interactions between multiple objects present in an area of interest is necessary. Not knowing exactly which trajectory will be followed by each of these objects generates uncertainty about the future. However, this uncertainty can be modeled by generating multiple hypotheses for each object, which cover the possible motion options of the objects. For this reason, the algorithm was designed as a fully model-based multi-hypothesis algorithm.
A model-based approach has key benefits for passive and active vehicle safety systems. First, it enables the future integration of infrastructural elements (such as more lanes) and a larger number of static and dynamic objects in the computation. This could improve the precision, reliability and relevance of the obtained results. Second, this approach is based on domain knowledge and widely accepted mathematical models. This ensures deterministic outputs, allowing an easy validation of the algorithm. Finally, the model-based generation of statistical quantities for prediction tasks is important for automotive safety systems.
We define a “hypothesis” as the combination of a specific acceleration profile and a path that correlate over time; that is, one unique trajectory. A main advantage of the multi-hypothesis strategy is that the number of threads that are executed in parallel is easy to influence, as it depends on easily adjustable parameters such as the number of hypotheses to be simulated or the number of objects to be considered. With this in mind, it is important to note that the more hypotheses are simulated, the better the coverage of all possible states that an object might traverse in the near future. Therefore, the maximum possible accuracy of the algorithm can be scaled depending on the available parallel computing resources.
The algorithm is composed of four modules that have to be executed in a given order. Inside these modules, there are tasks that can be executed in parallel. The modules and the computational structure of the algorithm can be seen in Figure 1.
To manage the parallel execution of the algorithm, a set of matrices was structured to keep the information traceable and independent, and nested indexes had to be generated to allow the information to be retrieved. These indexes are module-specific, and it is precisely they that express the parallelization possibilities of the corresponding tasks. This will be explained in the corresponding modules.
IV Algorithm Process
In order to detect collisions between two objects, their pose (coordinates of the center of gravity and orientation) over time (trajectory) and their shape have to be known. This means that each and every prediction has to go through all four modules, which are explained in the following.
IV-A Street Data Processing
One of the constraints for the trajectory generation is the road infrastructure, meaning that the vehicles are bound to it under conventional traffic circumstances (e.g., not going off-road). Because of this, the trajectories are generated according to the lanes present on the driveable road. Considering that the focus of this work is not the street modeling, the assumption is made that the lane information is delivered from the sensors to the algorithm in the form of sets of three pairs of (x, y) coordinates. Modern cars provide this representation by means of exteroceptive sensors. Each one of these sets corresponds to a specific lane divider, and each lane is delimited by exactly two lane dividers. Adjacent lanes share one lane divider.
From each set, it is assumed that two coordinates correspond to the closest and farthest detected points of the lane divider, and the third one is any point in between. From these three points and using the Gauss-Jordan method [gaussjordan], a second degree equation is obtained, which corresponds to the mathematical representation of the associated lane divider. A graphical explanation of this can be seen in Figure 2.
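The fit described above can be sketched as follows. This is a minimal illustration, assuming the lane divider is written as y = a·x² + b·x + c and that the three points have distinct x coordinates; the function name `fit_lane_divider` and the sample coordinates are hypothetical and not taken from the original implementation.

```python
def fit_lane_divider(points):
    """Fit y = a*x^2 + b*x + c through three (x, y) points with Gauss-Jordan elimination."""
    # Augmented matrix [x^2, x, 1 | y], one row per point.
    m = [[x * x, x, 1.0, y] for x, y in points]
    n = 3
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("points must have distinct x coordinates")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot element becomes one.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row (Gauss-Jordan, not just Gauss).
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return tuple(row[3] for row in m)  # (a, b, c)


# Hypothetical divider points: closest, intermediate, and farthest detection.
a, b, c = fit_lane_divider([(0.0, 1.5), (20.0, 1.5), (40.0, 2.3)])
```

Since each lane divider is fitted independently of the others, one such call per divider maps naturally to one thread per divider.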
The algorithm is designed to consider up to three lanes: the own (EGO-), immediate left, and immediate right lanes. If the sensors do not deliver information for neighbouring lanes, they will not be considered, and no trajectory will be generated in the corresponding area. If the information of the EGO-lane is missing, a “virtual” lane will be generated according to the current vehicle state and applicable legislation [RAADeu].
Having the mathematical representation of the driveable road, the vehicles are associated with their corresponding lanes. The EGO-vehicle will always be placed on the EGO-lane. All other vehicles are associated with the lanes according to the location of their estimated center of gravity (more on vehicle parameters later). Pedestrians are not bound to the road infrastructure, so no association is performed for them.
For this module, up to 4 threads can be computed in parallel, one for each lane divider.
IV-B Trajectory Generation
Having the mathematical representation of the road (Section IV-A), many hypotheses for each object are computed by means of motion models as described in the following.
IV-B1 Motion Models for Vehicles
Under tractive driving (i.e., not exceeding the grip limits of the tires), longitudinal and lateral dynamics of the vehicles are coupled. To account for this, motion models are used for generating the trajectories of the vehicles.
are the mass and moment of inertia of the vehicle, and is the front steering angle at the tires.
For both motion models, once having the corresponding state variables, the pose of the vehicle is calculated by means of Euler integration as described by the following equations:
where and are the sine and cosine functions, subscript denotes a vehicle, is the time lapse between instances and , are the coordinates in global frame, and are the accelerations in vehicle coordinate frame. The input will be addressed in Section IV-B2.
The mass of the vehicles plays an important role in traffic accidents [iihs2009carsize] and it is a parameter for the OT model. Because of this, the COs are classified according to their dimensions to assign them a mass value. For this, it is assumed that the on-board sensors of the EGO-vehicle can deliver approximate information about the dimensions of the COs. If the height of the COs cannot be estimated, six classes are used: quadricycle, supermini, small family car, large family car, executive, and multi-purpose vehicle. Otherwise, two extra classes are used: off-roader and cargo. This accounts for the different length/width-to-weight ratios that taller vehicles have when compared to the other classes. The classes are chosen according to Van Miert [cartype] and average values are obtained from Heydinger et al. [vehicleinertia]. Typical values for tire parameters and vehicle geometry are obtained from Isermann [isermann2006fahrdynamik].
IV-B2 Variation for the Vehicle Trajectory Generation
The generation of multiple hypotheses requires influencing both the longitudinal and the lateral dynamics of the vehicles.
a) For the longitudinal motion, either the proper acceleration or the longitudinal slip is sampled. Two samples are the maximum and minimum of the corresponding range, and one is zero (vehicle is cruising). The remaining samples are equally distributed in the negative region to favor a reduction of the kinetic energy of the bodies. This is very robust in critical situations. The positive area is also sampled to address cases where accelerating would avoid an accident, like a rear-end collision. The ranges go from to (OT) and the longitudinal slip from to (TT). Furthermore, mechanical latencies and jerks are introduced to generate profiles that are present in road vehicles. This helps to maintain realistic trajectories. The number of profiles is equal for the EGO-vehicle and COs. The output of the profiles is , which is an input for the corresponding motion model. Examples of these profiles can be seen in Figure 3.
b) To get feasible and realistic trajectories, a controller is designed and implemented for the lateral dynamics. First, a reference trajectory is generated with a motion model and the expected driver behavior. That is, the vehicles keep driving with their current acceleration towards the middle of their current lane. Along this trajectory, the available lanes are sampled at three predefined instances of the prediction time (, and ). The sample points are equally distributed on each lane and perpendicular to the lane divider. Three samples are made on the own lane and two samples are made on each of the neighboring lanes. This reflects the expected behavior of the vehicles: it is more likely that they will stay on their current lane than steer towards the neighboring lanes. Out of these sample points, the path sections are generated. These sections go over the sample points and are parallel to the lane dividers (see Figure 2). Three of these path sections (one from each sampling instance) represent one complete path. The path sections are the input for the controller, and the output is the steering angle at the front wheels . This is then used as input for the corresponding motion model. It was decided to generate more complete paths for the EGO-vehicle than for the COs. This takes into account that one is able to influence the EGO-vehicle, but not the COs.
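The longitudinal sampling rule from item a) can be sketched as follows. This is only an illustration of the sample placement (maximum, minimum, zero, and the remainder equally spaced in the negative region); it leaves out the mechanical latencies and jerk limits, and the function name and the numeric range are hypothetical because the ranges used by the authors are not reproduced here.

```python
def sample_accelerations(a_min, a_max, n):
    """Return n target accelerations: a_min, a_max, zero, and the rest in (a_min, 0)."""
    if n < 3:
        raise ValueError("at least three samples are needed (min, zero, max)")
    if not (a_min < 0.0 < a_max):
        raise ValueError("the range must contain zero")
    k = n - 3                      # samples left after min, zero, and max
    step = -a_min / (k + 1)        # equal spacing between a_min and zero
    interior = [a_min + step * (i + 1) for i in range(k)]
    return [a_min] + interior + [0.0, a_max]


# Hypothetical range in m/s^2; six profiles, four of them decelerating.
targets = sample_accelerations(-8.0, 2.0, 6)
```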
The lateral controller has been optimized using a large number of simulations and is expressed mathematically by
IV-B3 Motion Model for Pedestrians
Research on pedestrian motion modeling has been done by Schlake [pedestrian1]. Factors like emergency situations and pedestrian density do affect the motion of pedestrians, but these motion models focus on in-building situations. In open-air environments, the movement of pedestrians is not bound to the road infrastructure. Because of this, the pedestrian motion is expressed by a kinematic model with no controller:
where subscript denotes a pedestrian.
For pedestrians, the samples are equally distributed from to for and from to for . The velocity is limited to assuming that pedestrian velocities on open roads range from slow to running, rather than sprinting [zkebala2012pedestrian]. The number of samples of is equal to the number of complete paths for COs and the number of samples of is equal to the number of profiles for vehicles.
IV-B4 Parallelization of Trajectory Generation
A key to computing so much information in parallel is to generate the trajectories so that they are completely independent from each other. In Section IV-B, the generation of trajectories by combining acceleration profiles and paths is explained. It is precisely this combination that allows the generation of a large number of independent trajectories.
The total number of trajectories that can be simulated in parallel is equal to
where is the total number of COs considered, is the number of acceleration profiles, is the number of complete paths for the COs, and is the number of complete paths for the EGO-vehicle.
In this paper, the number of trajectories for the EGO-vehicle is equal to . This results from the linear combination of the path sections which are also combined with acceleration profiles. As stated in Section IV-B2, one is not able to influence the COs. For this reason, no combination of the path sections is made for COs. This means that each CO has complete paths, which are combined with acceleration profiles to generate trajectories. For example, trajectory generations can be executed in parallel when .
It should be noted that the used motion models are represented by differential equations that are solved numerically for each time instance of each trajectory (Section IV-B1). This means that the current position has to be known in order to calculate the future position of an object. Following this train of thought, one limitation of parallelizing is that it is not possible to simulate all time instances of one single trajectory in parallel—they have to be calculated sequentially.
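The following sketch illustrates this constraint. It is not the OT or TT model of Section IV-B1: a simple kinematic point model with heading stands in for the motion model, and the function name, the inputs, the step size of 0.01 s, and the 200 steps are assumptions for illustration. Each Euler step needs the state of the previous step, so the loop over time is sequential, while the rollouts of different hypotheses do not depend on each other.

```python
import math


def rollout(x, y, psi, v, accel, yaw_rate, dt, steps):
    """Explicit Euler rollout of one hypothesis; returns the list of poses (x, y, psi)."""
    poses = [(x, y, psi)]
    for _ in range(steps):
        # Every update uses the state of the previous time instance.
        x += v * math.cos(psi) * dt
        y += v * math.sin(psi) * dt
        psi += yaw_rate * dt
        v = max(0.0, v + accel * dt)   # no reversing after a full stop
        poses.append((x, y, psi))
    return poses


# Hypothetical hypotheses as (acceleration in m/s^2, yaw rate in rad/s).
hypotheses = [(-4.0, 0.0), (0.0, 0.0), (0.0, 0.1)]

# Independent rollouts: each list element could be computed by its own thread.
trajectories = [rollout(0.0, 0.0, 0.0, 10.0, a, w, 0.01, 200) for a, w in hypotheses]
```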
IV-C Collision Recognition
IV-C1 Object Modelling with Polygons
Once the trajectories are generated, it is possible to check if a collision between the EGO-vehicle and a CO occurs. For this, the complete set of hypotheses of each CO is combined with each and every hypothesis of the EGO-vehicle. Both objects are then modeled as polygons, and it is checked at each time instance whether they overlap or not. If the polygons overlap, a collision occurs at the given time instance. Well-known point-in-polygon strategies exist to test for this [shimrat1962algorithm, haines1994point]. The method requires checking the overlap twice per time instance: EGO over CO, and CO over EGO. The polygons can be as complex as necessary. The more complex they are, the better the objects are described, but the more runtime is required. This is very relevant for the computational resources, since the collision recognition is the module that takes the most runtime and the overlap check is the function that is called most frequently.
Combinations of hypotheses between COs are not taken into account, seeing that it is not the focus of this work to predict collisions that do not involve the EGO-vehicle.
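The overlap check of this subsection can be sketched with a standard ray-casting point-in-polygon test applied in both directions. This is one possible reading of the two-way check, not the authors' implementation; the rectangular shape, the function names, and the dimensions are assumptions. Note that a vertex-based test of this kind can miss configurations in which two polygons cross without any vertex of one lying inside the other.

```python
import math


def rectangle(cx, cy, psi, length, width):
    """Corner points of a rectangle centered at (cx, cy) with heading psi."""
    c, s = math.cos(psi), math.sin(psi)
    hl, hw = length / 2.0, width / 2.0
    corners = ((hl, hw), (-hl, hw), (-hl, -hw), (hl, -hw))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in corners]


def point_in_polygon(px, py, poly):
    """Ray casting: count crossings of a horizontal ray starting at (px, py)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge spans the ray's height, so y2 != y1
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygons_overlap(ego, co):
    """Two-way check: EGO vertices inside CO, then CO vertices inside EGO."""
    return (any(point_in_polygon(x, y, co) for x, y in ego)
            or any(point_in_polygon(x, y, ego) for x, y in co))


ego = rectangle(0.0, 0.0, 0.0, 4.0, 2.0)
co_near = rectangle(3.0, 0.5, 0.3, 4.0, 2.0)   # hypothetical poses at one time instance
co_far = rectangle(10.0, 0.0, 0.0, 4.0, 2.0)
```

In a full run, this check would be evaluated once per time instance for every EGO-CO trajectory combination, which is why its cost dominates the module.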
IV-C2 Parallelization of Collision Recognition
As mentioned in Section IV-B4, all trajectories are completely independent from each other. Thus, any combination resulting from them is independent as well and can be simulated and evaluated by different processing units in parallel. The resulting number of trajectory combinations that could be simulated in parallel is given by
As an example, for a case with , , and , trajectory combinations are executed in parallel in this module in ms.
In this work, a maximum prediction time of s and a discretization time of ms is used for any given scenario. This means that the overlapping check function will be called 200 times for each trajectory combination, yielding a total number of calls per scenario for this routine alone that are executed in ms for this example.
The length and width used for cars in the class with the smallest vehicles are m and m, respectively. Assuming the two specific collision cases of a T-bone and a purely longitudinal one, the vehicles need a relative velocity of and accordingly for them to drive "through" each other and the collision to be undetected. The authors are aware that infinitely many collision configurations exist, and that special cases, such as a small overlap, do happen. However, this shows that the discretization time covers a very comprehensive range of street scenarios.
IV-D Risk Assessment
The risk assessment is based on the anticipation time and criticality. We define “anticipation time” as the time in advance that the criticality of a situation can be recognized. For this, once it is determined that a combination of trajectories leads to a collision, the probabilities of these trajectories are multiplied, obtaining the probability that this combination will occur. Assuming statistical independence (Section IV-D2), the probability that a traffic scenario will lead to a collision is given by
where is the indicator function that yields one if and only if the -th EGO and the -th CO trajectory lead to a collision and zero otherwise, , , is the probability of the -th EGO trajectory, and is the probability of the -th CO trajectory.
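The double sum described above can be sketched for a single CO as follows. The function name, the probability values, and the collision table are hypothetical; the sketch only assumes that trajectory probabilities are multiplied for every colliding EGO-CO combination and then added up.

```python
def collision_probability(p_ego, p_co, collides):
    """Sum p_ego[i] * p_co[j] over all combinations (i, j) flagged as colliding."""
    return sum(
        p_ego[i] * p_co[j]
        for i in range(len(p_ego))
        for j in range(len(p_co))
        if collides[i][j]          # indicator: True if combination (i, j) collides
    )


# Hypothetical example: three EGO trajectories and two CO trajectories.
p_ego = [0.5, 0.3, 0.2]
p_co = [0.6, 0.4]
collides = [
    [True, False],
    [False, False],
    [True, True],
]
criticality = collision_probability(p_ego, p_co, collides)
```

Every term of the sum depends only on its own combination, so the terms can be computed by separate threads and combined in a final reduction.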
IV-D1 Trajectory Probability Calculation
To get the probability that a specific hypothesis can occur, it is scored against the reference trajectory mentioned in Section IV-B2. The lower the score, the lower the occurrence probability. The scoring value is given as follows:
where and are penalizing factors for the complexity of the maneuver and for entering lanes with counter traffic; and are the scoring factors for the acceleration profile and the path, and and are fixed weighting factors for the acceleration and steering, respectively. The scoring and penalizing factors steer the occurrence probability of the hypotheses and thus the criticality of the traffic scenario as well. Should colliding trajectories have a higher occurrence probability, the criticality of the scenario will be higher too. One way of optimizing the parameters in practice is according to the expected passenger injury for each type of collision. The parameters used here are chosen using domain knowledge.
Once all trajectories of an object are scored, the scores are normalized with respect to the -norm. The resulting values are considered as occurrence probabilities of the trajectories. This process is repeated for each object.
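As a minimal sketch of this normalization step, the snippet below assumes that the scores are non-negative and that the norm in question is the 1-norm, so that the resulting values of one object sum to one. Both points are assumptions made for illustration, as are the function name and the score values.

```python
def normalize_scores(scores):
    """Divide each score by the 1-norm of all scores of the same object."""
    total = sum(abs(s) for s in scores)
    if total == 0.0:
        raise ValueError("at least one score must be non-zero")
    return [s / total for s in scores]


# Hypothetical scores of three hypotheses of one object.
probabilities = normalize_scores([2.0, 1.0, 1.0])
```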
IV-D2 Conditional Probability
In Section IV-D1, an occurrence probability for each hypothesis of each object is calculated. Having multiple COs that can collide with EGO at different time instances, the conditional probability of these collisions has to be considered. The implementation of conditional probabilities could give a marginal benefit for representing the behavior of the criticality, but the required mathematics prevent the algorithm from being parallelizable.
To keep the algorithm parallelizable, the conditional probability is approximated. For this, it is assumed that an EGO-CO hypothesis combination can occur if and only if no other collision occurred before along the corresponding EGO-trajectory. Thus, the EGO-CO combinations are considered to be independent, and their probabilities are scaled down in chronological order. That is, the first hypothesis combination to occur in time maintains its estimated probabilities (“collision” and “no collision”). Then, the probability that the next hypothesis combination occurs is equal to the “no collision” probability of the first combination. Both “collision” and “no collision” probabilities of the next combination are then scaled down accordingly. This process continues until the probabilities of all combinations are scaled.
The following equations explain this further, and a graphical representation of this can be seen in Figure 5.
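One way to read the chronological scaling is sketched below for a single EGO trajectory. The event representation as (collision time, collision probability) pairs, the function name, and the numbers are assumptions for illustration; the sketch is an interpretation of the description above and not the authors' equations.

```python
def scale_chronologically(events):
    """Scale (time, collision probability) pairs along one EGO trajectory.

    Returns (time, scaled collision, scaled no-collision) in chronological order.
    """
    remaining = 1.0                      # probability that no earlier collision occurred
    scaled = []
    for t, p in sorted(events):          # earliest combination first
        scaled.append((t, remaining * p, remaining * (1.0 - p)))
        remaining *= (1.0 - p)           # "no collision" mass is passed on to the next one
    return scaled


# Hypothetical combinations: a collision with probability 0.5 at 1.2 s and 0.2 at 0.4 s.
result = scale_chronologically([(1.2, 0.5), (0.4, 0.2)])
total_collision = sum(c for _, c, _ in result)
```

In this example the earlier combination keeps its probabilities (0.2 and 0.8), while the later one is scaled by 0.8, giving 0.4 for both outcomes.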
V Algorithm Mapping to Embedded Systems
The proposed algorithm is designed such that it can be executed in parallel by thousands of threads. While each module in Figure 1 depends on the results of its predecessor, the modules themselves are highly parallelizable. We target embedded systems that offer embedded, on-board GPUs as execution platform. For automotive applications, there is a variety of hardware platforms offered by different manufacturers such as Renesas (R-Car), NXP (i.MX), Texas Instruments (OMAP Jacinto), Qualcomm (Snapdragon), Intel (GO), or NVIDIA (Jetson).
For real-time performance, it is essential to map the algorithm efficiently to the available hardware resources and to make the best use of them. This typically requires low-level programming and ties the implementation to one specific architecture. To avoid this, we use the AnyDSL framework (https://anydsl.github.io), which allows separating low-level hardware-specific aspects from the high-level algorithm description [leissa2018anydsl, leissa2015shallow]. AnyDSL supports code generation for GPUs by generating CUDA and OpenCL. The algorithm description is the same for CPU and GPU; only a hardware-specific mapping is required for each target platform:
Iteration Logic: Defines in which order data is processed in each module. This can be sequential on the CPU vs. data-parallel on the GPU.
Hardware Intrinsics: Many trigonometrical functions such as sine or cosine can be mapped to much faster hardware-accelerated versions on the GPU.
Memory Hierarchy: Modern CPUs and GPUs have a deep memory hierarchy. Some of them require explicit programming.
Memory Management: CPU and GPU share the same physical memory in most embedded systems.
In our implementation, we exploit all of these hardware-specific features and map all modules to the GPU. In particular, the resulting implementation executes all modules in a data-parallel fashion, makes extensive use of hardware intrinsics for the collision recognition (sine, cosine, tangent), and requires no data transfers between CPU and GPU by exploiting unified CPU/GPU memory. For deployment, we use Thrift (https://thrift.apache.org) to retrieve input data from the vehicle and to return the results of our algorithm.
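The separation of the algorithm description from the iteration logic can be illustrated with a small sketch. This is plain Python rather than AnyDSL code, and a thread pool merely stands in for a data-parallel GPU launch; the function names and the placeholder work item are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def iterate_sequential(n, body):
    """CPU-style mapping: process the work items one after another."""
    return [body(i) for i in range(n)]


def iterate_parallel(n, body, workers=4):
    """Data-parallel mapping: the same body, distributed over several workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, range(n)))


def check_combination(i):
    """Placeholder for the work done per trajectory combination."""
    return i % 3 == 0


# The algorithm description (check_combination) is identical for both mappings.
sequential_result = iterate_sequential(10, check_combination)
parallel_result = iterate_parallel(10, check_combination)
```

The design choice illustrated here is that only the iteration function changes between targets, while the per-item computation stays untouched.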
VI Evaluation and Results
VI-A Algorithm Outputs
The first part of the evaluation uses a set of 20 different simulated scenarios that broadly cover possible traffic situations. These scenarios include combinations of one, two, and three lanes, COs as static and moving objects, as well as counter traffic and vehicles approaching from behind. Figure 6 shows an example of a complex traffic scenario that can be handled by the proposed algorithm. The second part of the evaluation includes 547 real-life traffic situations. For this, a vehicle is driven on open roads, and the information obtained from the sensors is collected and used as an input for the algorithm for off-line evaluation.
An evaluation metric of the algorithm stability is the development of the predicted criticality over time. A smooth, progressive development indicates the absence of misjudged collisions (false-positives). It includes, in a compact and understandable manner, the most relevant information about the traffic scenario for passive and active vehicle safety systems. This is because features like false-positives and triggering thresholds are easily recognized.
The obtained results indicate that the algorithm is capable of detecting unavoidable collisions with an anticipation time that depends on the scenario that is being evaluated. When the traffic is purely longitudinal, this anticipation time is long enough to influence the vehicle dynamics. When cross-traffic is present, this anticipation time is long enough to activate passive safety systems. For the designed scenarios where static objects were placed in different arrangements in front of the EGO-vehicle, the average anticipation time was ms; and for the scenarios where the COs were in motion, it ranged from ms to ms. These results contrast with the ms that occur in a simple scenario without road modeling. The large anticipation times could aid a better triggering of multistage airbags, thus preventing passenger injuries [iihs2004evidence]. This is especially interesting for lateral airbags [iihs2003headprotecting]. Slow actuators could benefit from this too, since the extra time helps to compensate for mechanical latencies.
One of the pillars of our algorithm is the combination of road modeling, vehicle motion models, and the lateral dynamics controller. This ensures that the generated trajectories are not only drivable by vehicles, but that they are meaningful as well. Figure 7 illustrates this with a comparison in which the controller module is disabled for evaluation purposes. As a result, the trajectory generation does not take the drivable road into consideration, and some of the trajectories tagged as “collision-free” actually leave the available road infrastructure.
An important result is that there are zero false-positive outputs for the evaluated scenarios. This indicates that the algorithm possesses a high degree of reliability and robustness. This is especially relevant when deciding triggering points of passive and active vehicle safety systems.
An additional benefit of the algorithm is the output of possible escape routes, that is, trajectories that could help to avoid an oncoming collision. This derives from two factors. First, as stated in Section IV-B, all the trajectories are generated by the use of motion models combined with realistic acceleration profiles and a controller for lateral dynamics. This means that all generated and simulated trajectories can be driven by a vehicle. Second, as described in Section IV-D, the complete probability spectrum of the EGO-trajectories is known, so the most adequate one can be chosen. As stated in Section III, it is important to note that the resolution of the algorithm depends highly on the number of acceleration profiles and paths (hypotheses).
VI-B Execution Time
For evaluation, we use the Drive PX 2 development board from NVIDIA. The Drive PX 2 has a CPU with four ARM Cortex A57 cores and two NVIDIA Denver cores as well as a Tegra X2 GPU (GP10B) with 256 cores and a dedicated GPU (GP106) with 1152 cores, both based on the Pascal architecture. While the Tegra X2 GPU shares the main memory of the CPU, the dedicated GPU has its own memory.
We consider two scenarios: the first scenario (S1) considers 3 COs in addition to the EGO-vehicle, while the second scenario (S2) considers 10 COs in addition to the EGO-vehicle. We use acceleration profiles, which are combined with and paths for CO and EGO-vehicle, respectively. For each scenario, we compute the trajectories for the next seconds with a resolution of ms, which corresponds to time steps. This results in 8.64 million pose combinations that need to be evaluated per CO. In total, this results in 25.93 million pose combinations for S1 and 86.44 million pose combinations for S2. Using only acceleration profiles reduces the number of pose combinations to 18.00 million for S1 and 60.01 million for S2, respectively. The execution times for scenarios S1 and S2 are shown in the performance results table. On the dedicated (GP106) / embedded (GP10B) GPU, it takes 10 / 15 ms to evaluate the proposed algorithm for S1 and 21 / 49 ms for S2. Considering only acceleration profiles reduces the execution time to 8 / 11 ms for S1 and 14 / 35 ms for S2. More than two thirds of the total execution time is spent on collision recognition. The GPU execution is more than two orders of magnitude faster than the CPU execution, benefiting from hardware-accelerated trigonometric functions. The communication overhead of the client/server architecture of Thrift adds ms on top of the algorithm execution time.
Table: Performance results for the proposed algorithm. Shown is the number of pose combinations evaluated as well as the median execution time in ms (lower is better) on the CPU, dedicated GPU (GP106), and embedded GPU (GP10B).

| Scenario   | # pose combinations | CPU      | GPU (GP106) | GPU (GP10B) |
|------------|---------------------|----------|-------------|-------------|
| S1 (3 CO)  | 25.93 million       | 1800 ms  | 10 ms       | 15 ms       |
|            | 18.00 million       | 1600 ms  | 8 ms        | 11 ms       |
| S2 (10 CO) | 86.44 million       | 11600 ms | 21 ms       | 49 ms       |
|            | 60.01 million       | 9600 ms  | 14 ms       | 35 ms       |
In this work, a fully model-based, multi-modal, parallelizable algorithm is presented. This algorithm is able to estimate the upcoming criticality of complex traffic scenarios at very early stages. The architecture of the algorithm allows the further inclusion of road infrastructure and mobile objects. This architecture also allows the algorithm to be ported to different GPUs. The implementation on vehicle-compatible hardware demonstrates, in a prototypical manner, its feasibility for use in production vehicles. Short execution times, deterministic results, and the absence of false-positives in the evaluated scenarios support the adequacy of the algorithm for passive and active vehicle safety systems.