Interaction-aware Kalman Neural Networks for Trajectory Prediction

02/28/2019 ∙ by Ce Ju, et al. ∙ University of Michigan KAIST 수리과학과 Nanyang Technological University Baidu, Inc. 0

Forecasting the motion of surrounding dynamic obstacles (vehicles, bicycles, pedestrians and etc.) benefits the on-road motion planning for autonomous vehicles. Complex traffic scenes yield great challenges in modeling the traffic patterns of surrounding dynamic obstacles. In this paper, we propose a multi-layer architecture Interaction-aware Kalman Neural Networks (IaKNN) which involves an interaction layer for resolving high-dimensional traffic environmental observations as interaction-aware accelerations, a motion layer for transforming the accelerations to interaction-aware trajectories, and a filter layer for estimating future trajectories with a Kalman filter. Experiments on the NGSIM dataset demonstrate that IaKNN outperforms the state-of-the-art methods in terms of effectiveness for trajectory prediction.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 6

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

One hardcore technique for autonomous vehicles is that of forecasting the motion of surrounding dynamic obstacles effectively since it benefits the on-road motion planning which is a core component in the control system. In fact, on the motion planning layer of the Apollo open platform [Fan et al.2018], on-road dynamic obstacles would become technically static when their motion has been predicted and the planning with static obstacles has been adequately solved [Gu et al.2013, McNaughton2011].

The problem of forecasting the motion of surrounding dynamic obstacles for autonomous driving has many real challenges, e.g., heavy noise in sensor data, complex traffic scenes and intractable interactive effects among the dynamic obstacles. Existing methods could be categorized into three classes, namely the physics-based motion model [Liu et al.2005], the maneuver-based motion model [Frazzoli et al.2005], and the interaction-aware motion model [Lefèvre et al.2014]. The physics-based motion model is based on the basic kinematic and dynamic models from physics (e.g., Newton’s Laws of Motion). The maneuver-based motion model is designed for a particular maneuver, where the future trajectory of a vehicle is predicted by searching the trajectories which have been clustered a priori. The interaction-aware motion model is one that captures the interactive effects among vehicle drivers in a traffic scene by predicting the trajectories of multiple vehicles that are close to one another collectively. Among those interaction-aware models, many adopt deep models [Alahi et al.2016, Gupta et al.2018, Deo and Trivedi2018a, Kuefler et al.2017, Bhattacharyya et al.2018, Ma et al.2018].

In this paper, we propose a model called Interaction-aware Kalman Neural Networks (IaKNN) for foresting the motion of surrounding dynamic obstacles effectively. Specifically, IaKNN is a multi-layer architecture consisting of three layers, namely an interaction layer, a motion layer and a filter layer. The interaction layer is a deep neural network with multiple convolution layers standing before the LSTM encoder-decoder architecture. Fed with the past trajectories of vehicles that are close to one another, this layer extracts the “accelerations” that capture not only those raw acceleration readings but also the interactive effects among vehicles in the form of social force (which is a measure of internal motivation of an individual in a social activity in sociology and has been used for studying the motion trajectories of pedestrians [Helbing and Molnar1995]). The extracted accelerations are called interaction-aware accelerations. The motion layer is similar to the existing physics-based motion model which transforms accelerations to trajectories by using kinematics models. Here, instead of feeding the motion layer with the accelerations read from sensors directly, we feed with those interaction-aware accelerations that are outputted by the interaction layer and call the transformed trajectories interaction-aware trajectories. The filter layer consists of mainly a Kalman filter for optimally estimating the future trajectories based on the interaction-aware trajectories outputted by the motion layer. The novelty in this layer is that we incorporate two LSTM neural networks [Hochreiter and Schmidhuber1997] for learning the time-varying process and measurement noises that would be used in the update step of the Kalman filter, and this is the first of its kind for trajectory prediction. In summary, our IaKNN model enjoys the merits of both the physics-based model (the motion layer) and the interaction-based model (the interaction layer) and employs neural-network-based probabilistic filtering for accurate estimation (the filter layer).

In experiments, we evaluate IaKNN on the Next Generation Simulation (NGSIM) dataset [Colyar and Halkias2007] and the empirical results demonstrate the effectiveness of IaKNN. In summary, the major contributions of this paper are listed as follows:

  • [leftmargin=10pt]

  • We propose to capture the interactive effects among vehicles with interaction-aware accelerations and then use them in kinematics models for trajectory prediction.

  • We propose to learn the time-varying process and measurement noises with LSTM neural networks in a Kalman filter, which, to the best of our knowledge, is the first neural network-based probabilistic filtering algorithm for real-time trajectory prediction.

  • We perform extensive experiments on the NGSIM dataset, which show that IaKNN consistently outperforms the state-of-the-art methods in terms of effectiveness.

2 Related Work

2.0.1 State Estimation in Robotics

State estimation is one of the common techniques in robotics to estimate the state of a robot from various measurements which involve nosies. Classic approaches of state estimation can be found in [Barfoot2017]

. Nowadays, artificial intelligence approaches have been explored largely for state estimation in robotics. For example, Coskun et al. train a triple-LSTM neural network architecture to learn the kinematic motion model, process noise, and measurement noise in order to estimate human pose in a camera image

[Coskun et al.2017]. Haarnoja et al. propose a discriminative approach for state estimation in which they train a neural network to learn features from highly complex observations and then filter the learned features to output underlying states [Haarnoja et al.2016]. Our IaKNN model has a different goal from those models, i.e., IaKNN aims to predict the trajectories of vehicles.

2.0.2 Data-driven Approach in Trajectory Prediction

Trajectory prediction for a smart vehicle, which is an important task for autonomous driving, has been largely studied [Lefèvre et al.2014]

. Among those methods for this task, the data-driven ones are dominating. For example, Alahi et al. propose a deep learning model to predict the motion dynamics of pedestrians in a crowded scene in which they build a fully connected layer called social pooling to learn the social tensor based on pedestrians

[Alahi et al.2016]. Gupta et al. propose a GAN-based encoder-decoder framework for trajectory prediction with a pooling mechanism to aggregate information across people [Gupta et al.2018]. In [Deo and Trivedi2018a], the authors extract a social tensor with a convolutional social pooling layer and then feed the social tensor to a maneuver-based motion model for trajectory prediction. Kuefler et al. [Kuefler et al.2017] and Bhattacharyya et al. [Bhattacharyya et al.2018]

use imitation learning approach to learn human drivers’ behaviors for trajectory prediction. The learned policies are able to generate the future driving trajectories that match those of human drivers better and can also interact with neighboring vehicles in a more stable manner over long horizons. Ma et al. propose an LSTM-based two-layers model TrafficPredict for heterogeneous traffic-agents in an urban environment

[Ma et al.2018]. Our IaKNN model differs from these models in two aspects. First, IaKNN captures the interactive effects in a form of accelerations which could then be feed to kinematics models and thus it enjoys the merits of both the classic Physics models and the data-driven process (of capturing the interactive effects). Second, IaKNN employs the Kalman filter for optimizing the state estimation, where LSTM neural networks are used for learning the time-varying process and measurement noises that are used in the Kalman model, and this is the first of its kind for trajectory prediction.

3 Traffic Datasets

There are four datasets, namely Cityscapes [Cordts et al.2016], KITTI [Geiger et al.2013], ApolloScape [Huang et al.2018], and NGSIM [Colyar and Halkias2007], which are publicly available and involve traffic scenes. The first three were collected from the first person perspective and have been widely used for single-agent systems in robotics. The fourth one, NGSIM, was collected on the southbound US101 road and the eastbound I-80 road with a software application called NG-VIDEO which transcribes vehicles’ trajectories from an overhead video. In this work, we use NGSIM since among the four datasets, NGSIM is only one that is suitable for the study of a multi-agent system which we target in this paper.

Figure 1: Illustration of the IaKNN Model: In the diagram, at timestamp , the environmental observation flows into the interaction layer which generates the interaction-aware acceleration . Then, we calculate the interaction-aware trajectory of vehicles w.r.t Vehicle Dynamic Model (VDM) in motion layer. In the end, time-varying multi-agent Kalman neural network runs over the predicted time horizon to fuse dynamic trajectory and interaction-aware trajectory . Particularly, the time-varying process and measurement noises in the filter layer are set by zero-mean Gaussian noises with covariance formulated in a gated-structure neural network.

4 Kalman Filter

In this part, we provide some background of the Kalman filter (KF) [Bishop et al.2001] which shall be used as a building block in our model introduced in this paper. KF is an optimal state estimator in the mean square error (MSE) sense with a linear (dynamic) model and Gaussian noise assumptions. Suppose the state, control, and observation of the linear model are , and , respectively. The model could be expressed with a process equation and a measurement equation as follows.

where is a dynamic matrix, is a control matrix, is an observation matrix, which are all known. Moreover, is the process noise and is the measurement noise based on the noise covariance matrices and , respectively.

The process of KF is as follows. It iterates between a prediction phase and an update phase for each of the observations. In the prediction phase, the current state and the error covariance matrix are estimated as follows.

In the update phase, once the current observation is received, the Kalman gain , the prior estimation and the error covariance matrix are calculated as follows.

For a comprehensive review of KF, the readers could refer to standard references [Bishop et al.2001].

5 Problem Statement

We assume there are vehicles in the multi-agent system (traffic scene). For each vehicle, we collect at each timestamp its position , velocity , acceleration , width , length , and relative distances from other agents. We call the observations of all vehicles at timestamp the environmental observation at timestamp and denote them by . Given the -length past environmental observations , the problem is to predict for each vehicle its future -length trajectories.

6 Methodology

In this section, we present our architecture interaction-aware Kalman neural networks (IaKNN). Figure 1 gives an overview of the architecture, where the notations are explained as follows. is the portfolio of interaction-aware accelerations outputted by the interaction layer. is the portfolio of interaction-aware trajectories computed by the motion layer, and and are the state and the control of the Kalman filter in the filter layer, respectively, where and are the noise covariance matrices, both learned by LSTM neural networks. Besides, in this paper, , , and represent the starting time, current time, observation time horizon and prediction time horizon, respectively, where .

In the following, we present three layers of IaKNN, namely the interaction layer, the motion layer and the filter layer, in Section 6.1, Section 6.2, and Section 6.3, respectively.

6.1 Interaction Layer

In the interaction layer, we aim to extract the interaction-aware accelerations from the past traffic environment observations .

6.1.1 Interaction-aware Acceleration

Normally, the motion of a vehicle would be determined by its own accelerations. Nevertheless, in a multi-agent system which we target in this paper, the situation is much more complex since drivers of vehicles would be affected by those of other vehicles that are nearby (or they would interact with one another). For example, a vehicle would be forced to slow down if another vehicle nearby tries to cut the lane in the front. In fact, the motion of vehicles is determined by not only their physical accelerations but also the interactive effects among vehicles. Inspired by the classical social force model [Helbing et al.2000], which models the intention of a driver to avoid colliding with dynamic or static obstacles, we propose to extract those accelerations such they capture both the raw accelerations recorded and the interactions among vehicles nearby. We call them the interaction-aware accelerations and denote them by .

Specifically, at timestamp , traffic environment observations includes a sequence of recorded accelerations , widths , lengths , and relative distances of agents in the system. By following [Helbing and Molnar1995], we compute the so-called repulsive interaction forces , where superscripts and represent two vehicles that are close to each other and include them in . Thus, the interaction operator formula at timestamp is written in details as,

The interaction layer is implemented as a neural network as presented in Figure 2.

6.1.2 Operator Representation

At timestamp , the interaction layer in an operator formula is written as,

where is a portfolio of past environmental observations from to and is the portfolio of the interation-aware accelerations from to .

Figure 2: Illustration of Interaction Layer

: The architecture of interaction layer is an LSTM encoder-decoder. In the encoder, we build convolutional layers (CNN) regarded as a social tensor extractor, fully-connected layers (FCN) regarded as a mixer of the social features, and merge the deep features into the encoder LSTM. In the decoder, the decoder LSTM outputs the predicted accelerations.

6.2 Motion Layer

In the motion layer, we aim to calculate the interaction-aware trajectories based on the interaction-aware accelerations from the interaction layer.

The main intuition of the motion layer comes from the primary kinematic equation which establishes a relationship among position, time and velocity by Newtonian physics. Our strategy is to use higher-order derivatives of a position for better forecasting. Specifically, let be the position of a dynamic obstacle at timestamp . We write with the Taylor expansion as follows.

(1)

where represents the velocity at timestamp , represents the acceleration at timestamp , and the Big-O term captures all remaining terms which would be ignored. Moreover, we replace the acceleration term with an interaction-aware acceleration which is derived from the environment observations.

We specify the velocity term in Equation 1 as follows. Suppose the current timestamp is . For , we take the velocity readings which are currently available and transform them to by using a dynamic model - depending on the agent type, we adopt different dynamic models for this task, which shall be introduced shortly. For , we estimate their values by applying an integral function based on the interaction-aware accelerations as follows.

where .

Next, we introduce the dynamic models for transforming the velocity readings which are based on a vehicle-centric coordinating system to those based on a global coordinating system such that they could be plotted in the Equation 1. Depending on the agent type, we introduce two dynamic models, namely the Vehicle Dynamic Model (VDM) and the Pedestrian Dynamic Model (PDM). As the names imply, the former is for the case where vehicles are agents and the latter is for pedestrians are agents.

6.2.1 Vehicle Dynamic Model (VDM)

By following [Pepy et al.2006], we implement the vehicle dynamic model as a classical bicycle model [Taheri1992]. Specifically, suppose is the current reading involving velocities, where and are the coordinates, is the orientation, and are velocities, and is the yaw rate. The bicycle model is written as follows.

and are the transformed velocities along the and directions, respectively. For more details, the readers are referred to [Pepy et al.2006].

6.2.2 Pedestrian Dynamic Model (PDM)

Models for predicting pedestrian dynamic have been explored largely [Kooij et al.2014, Zhou et al.2012, Scovanner and Tappen2009], and any of these models could be applied here.
To simplify this layer, we regard all agents as mass points and their motion behaviors are described in the basic kinematic motion equations and .

6.2.3 Operator Representation

At timestamp , the motion layer in an operator formula is written as,

where is an interaction-aware trajectory from to .

6.3 Filter Layer

In the filter layer, we establish a model based on the Kalman filter to estimate the dynamic trajectories based on the interaction-aware trajectories used as observations.

6.3.1 Filter Model

To fit the Kalman filter as described in Section 4, we let the dynamic trajectories be the states, the interaction-aware trajectories be the observations and the dynamic accelerations be the controls in a linear model. As a result, the equation of the linear dynamic model could be written as follows.

(2)
(3)

where is the state transition matrix, is the control matrix, are the time-varying process noises and are the measurement noises. The time-varying covariances and will be learned by time-varying noise models which consist of LSTM neural networks and will be introduced later. Note that here we assume the observation matrix

is an identity matrix for simplicity.

6.3.2 Specifications of the Layer

Notice that we can always assume agents (dynamic obstacles) in the multi-agent system. At timestamp , the state and the observation of our Equation 2 and 3 could be written as follows.

where the state includes positions from GPS and velocities from the wheel odometer and the observation includes the predicted positions and the predicted velocities , where . Specifically, we have the following.

where the subscript is the predicted time horizon.

Next, we define the state transition matrix and the control matrix in our model. Firstly, we define two matrix blocks and as follows.

and

where is the time difference between two adjacent traffic environment observations. Then, and are block diagonal matrices that are defined as follows.

6.3.3 Prediction and Updated Steps

The prediction step of the Kalman filter is defined as,

and the update step is as,

where and are the outputs of the time-varying noise models that we introduce next.

6.3.4 Time-varying Noise Model

Since our desired filter model is time-varying, we assume both the process noises and the measurement noises to follow a zero-mean Gaussian noise model with its covariances formulated as and , respectively.

6.3.5 Operator Representation

At timestamp , the filter layer in an operator formula is written as,

where is the predicted trajectory from to .

6.4 Loss Function

The loss function of architecture IaKNN (

Interaction, Motion, and Filter) is defined as the sum of displacement error of prior estimation of dynamic trajectories and ground truth over all time steps and agents, as follows.

where means the ground truth of the future trajectory of -th agent at the start timestamp . Noth that is the observation time horizon as defined in the above.

7 Experiments

7.1 Dataset

We evaluate our approach IaKNN on two public datasets, namely US Highway 101 (US-101) and Interstate 80 (I-80) of the NGSIM program. Each dataset contains -coordinates of vehicle trajectories in a real highway traffic with 10Hz sampling frequency over a 45-min time span. The 45-min dataset consists of three 15-min segments of mild, moderate and congested traffic conditions. We follow the experimental settings that were proposed by existing studies [Deo and Trivedi2018b, Deo and Trivedi2018a] and combine US-101 and I-80 into one dataset. As a result, the dataset involves 100,000 frames of raw data.

We construct the multi-agent training traffic scene in the following construction procedure. Firstly, we align the raw data by its timestamps. Secondly, we form a multi-agent traffic scene by picking one host vehicle and including five closest vehicles on its traffic lane or two adjacent traffic lanes. Finally, we set the window size for extraction as 7 seconds to generate the training scenes.

7.2 Evaluation Metrics

Two metrics, namely the root-mean-square error (RMSE) and negative log-likelihood (NLL), are used to measure the effectiveness of IaKNN. In particular, the first 2-seconds trajectories and the rest 5-seconds trajectories are used as past trajectories and the ground truth in a 7-seconds multi-agent training traffic scene, respectively.

  • [leftmargin=10pt]

  • RMSE: the root mean squared sum accumulated by the displacement errors over the predicted positions and real positions during the prediction time horizon.

  • NLL: the sum of the negative log probabilities of the predicated positions against the ground-truth positions during the prediction time horizon (we consider a predicted position to be correct if its distance from the ground-truth one is bounded by a small threshold and wrong otherwise).

7.3 Baselines

The following baseline models will be compared with our model IaKNN.

  • [leftmargin=10pt]

  • Constant Velocity (CV): Model of the primary kinematic equation with constant velocity.

  • Vanilla-LSTM (V-LSTM): Model of Seq2Seq. It is from a sequence of past trajectories to a sequence of future trajectories [Park et al.2018].

  • Social LSTM (S-LSTM): Model of LSTM-based neural network with a social pooling for pedestrian trajectory prediction. As demonstrated in [Alahi et al.2016], the model performs consistently better than traditional models such as the linear model, the collision avoidance model and the social force model. Therefore, we do not compare these traditional methods in our experiments.

  • Convolutional Social Pooling-LSTM (CS-LSTM): Maneuver based motion model which will generate a multi-modal predictive distribution [Deo and Trivedi2018a].

  • IaKNN-NoFL: The proposed method IaKNN without the filter layer.

7.4 Implementation Details

The default length of the past trajectories is two seconds and the time horizon of the predicted trajectories is one to five seconds. The default number of hidden units in LSTMs in the interaction layer and filter layer is set to 32 and all LSTM weight matrices are initialized using a uniform distribution over

. The weight matrices for other layers are set with the Xavier initialization. The biases are initialized to zeros. Additionally, in the training process, we adopt the Adam stochastic gradient descent with hyper-parameters

, and set the initial learning rate to be 0.001. In order to avoid the gradient vanishing, a maximum gradient norm constraint is set to 5. For the parameters of baselines, we follow the original settings in the open source code. The experiments are conducted on a machine with Intel(R) Xeon(R) CPU E5-1620 and one Nvidia GeForce GTX 1070 GPU.

Figure 3: The RMSE and NLL of CV, V-LSTM, S-LSTM, CS-LSTM, IaKNN-NoFL, and IaKNN

. For each evaluation metric, we plot its average for the prediction time horizon from 1s to 5s.

7.5 Result Analysis

The performance results of baseline methods and our method on the traffic scene are shown in Figure 3. We compute the RMSE and NLL for all traffic scenes and plot the average for the prediction time horizon from 1s to 5s. Clearly, the naive CV produces the highest prediction errors. V-LSTM, S-LSTM and CS-LSTM perform similarly in terms of both metrics which is mainly because they are all LSTM-based neural networks. Additionally, S-LSTM, CS-LSTM, and IaKNN-NoFL perform better than V-LSTM, especially in RMSE, and this is mainly because they take into account the interactive effects for modeling. IaKNN outperforms all other baseline methods in terms of both metrics. Specifically, we observe that it outperforms the best baseline CS-LSTM with about 20% improvement. This may be explained by the fact that the filter layer in our IaKNN model estimates the underlying trajectories from both the interaction-aware trajectories and the dynamic trajectories in a traffic scene and the interaction layer has done a good job in capturing the interactive effects among the surrounding vehicles. The combination of the deep neural network and probabilistic filter makes our model more applicable for the real-time trajectory prediction in the traffic scene.

7.6 Case Studies

We illustrate the results by showing the predicted trajectories generated by IaKNN and the real ones in the two lane-change traffic scenarios in Figure 4. The results demonstrate the effectiveness of IaKNN. Clearly, we observe that in general the predicted trajectories are very close to the real ones in the figure. Moreover, we notice that due to the interactive effects between vehicles in the scenario, some vehicles have a strong intention to increase their safe distances. The predicted trajectories are more prone to confirm their intentions.

Figure 4: Case studies of the prediction result by IaKNN: The predicted trajectories and the real ones are drawn in blue and green color, respectively. For each vehicle, we plot its future 2s trajectory.

8 Conclusion and Future Work

In this study, we propose an architecture, IaKNN, to forecast the motion of surrounding dynamic obstacles, in which we make the first attempt to generate a tractable quantity from complex traffic scene yielding a new interaction-aware motion model. Extensive experiments show that IaKNN outperforms the baseline models in terms of effectiveness on the NGSIM dataset. Further work will be carried out to extend IaKNN to a probabilistic formulation and combine IaKNN with a maneuver-based model in which road topology and more of the traffic information are taken into account a priori.

References

  • [Alahi et al.2016] Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, and Silvio Savarese. Social lstm: Human trajectory prediction in crowded spaces. In

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    , pages 961–971, 2016.
  • [Barfoot2017] Timothy D Barfoot. State Estimation for Robotics. Cambridge University Press, 2017.
  • [Bhattacharyya et al.2018] Raunak P Bhattacharyya, Derek J Phillips, Blake Wulfe, Jeremy Morton, Alex Kuefler, and Mykel J Kochenderfer. Multi-agent imitation learning for driving simulation. arXiv preprint arXiv:1803.01044, 2018.
  • [Bishop et al.2001] Gary Bishop, Greg Welch, et al. An introduction to the kalman filter. Proc of SIGGRAPH, Course, 8(27599-3175):59, 2001.
  • [Colyar and Halkias2007] James Colyar and John Halkias. Us highway 101 dataset. Federal Highway Administration (FHWA), Tech. Rep. FHWA-HRT-07-030, 2007.
  • [Cordts et al.2016] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele.

    The cityscapes dataset for semantic urban scene understanding.

    In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3213–3223, 2016.
  • [Coskun et al.2017] Huseyin Coskun, Felix Achilles, Robert S DiPietro, Nassir Navab, and Federico Tombari. Long short-term memory kalman filters: Recurrent neural estimators for pose regularization. In ICCV, pages 5525–5533, 2017.
  • [Deo and Trivedi2018a] Nachiket Deo and Mohan M Trivedi. Convolutional social pooling for vehicle trajectory prediction. arXiv preprint arXiv:1805.06771, 2018.
  • [Deo and Trivedi2018b] Nachiket Deo and Mohan M Trivedi. Multi-modal trajectory prediction of surrounding vehicles with maneuver based lstms. In 2018 IEEE Intelligent Vehicles Symposium (IV), pages 1179–1184. IEEE, 2018.
  • [Fan et al.2018] Haoyang Fan, Fan Zhu, Changchun Liu, Liangliang Zhang, Li Zhuang, Dong Li, Weicheng Zhu, Jiangtao Hu, Hongye Li, and Qi Kong. Baidu apollo em motion planner. arXiv preprint arXiv:1807.08048, 2018.
  • [Frazzoli et al.2005] Emilio Frazzoli, Munther A Dahleh, and Eric Feron. Maneuver-based motion planning for nonlinear systems with symmetries. IEEE transactions on robotics, 21(6):1077–1091, 2005.
  • [Geiger et al.2013] Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets robotics: The kitti dataset. The International Journal of Robotics Research, 32(11):1231–1237, 2013.
  • [Gu et al.2013] Tianyu Gu, Jarrod Snider, John M Dolan, and Jin-woo Lee. Focused trajectory planning for autonomous on-road driving. In Intelligent Vehicles Symposium (IV), 2013 IEEE, pages 547–552. IEEE, 2013.
  • [Gupta et al.2018] Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. Social gan: Socially acceptable trajectories with generative adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), number CONF, 2018.
  • [Haarnoja et al.2016] Tuomas Haarnoja, Anurag Ajay, Sergey Levine, and Pieter Abbeel. Backprop kf: Learning discriminative deterministic state estimators. In Advances in Neural Information Processing Systems, pages 4376–4384, 2016.
  • [Helbing and Molnar1995] Dirk Helbing and Peter Molnar. Social force model for pedestrian dynamics. Physical review E, 51(5):4282, 1995.
  • [Helbing et al.2000] Dirk Helbing, Illés Farkas, and Tamas Vicsek. Simulating dynamical features of escape panic. Nature, 407(6803):487, 2000.
  • [Hochreiter and Schmidhuber1997] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
  • [Huang et al.2018] Xinyu Huang, Xinjing Cheng, Qichuan Geng, Binbin Cao, Dingfu Zhou, Peng Wang, Yuanqing Lin, and Ruigang Yang. The apolloscape dataset for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 954–960, 2018.
  • [Kooij et al.2014] Julian FP Kooij, Nicolas Schneider, and Dariu M Gavrila. Analysis of pedestrian dynamics from a vehicle perspective. In 2014 IEEE Intelligent Vehicles Symposium Proceedings, pages 1445–1450. IEEE, 2014.
  • [Kuefler et al.2017] Alex Kuefler, Jeremy Morton, Tim Wheeler, and Mykel Kochenderfer. Imitating driver behavior with generative adversarial networks. In Intelligent Vehicles Symposium (IV), 2017 IEEE, pages 204–211. IEEE, 2017.
  • [Lefèvre et al.2014] Stéphanie Lefèvre, Dizan Vasquez, and Christian Laugier. A survey on motion prediction and risk assessment for intelligent vehicles. Robomech Journal, 1(1):1, 2014.
  • [Liu et al.2005] C Karen Liu, Aaron Hertzmann, and Zoran Popović. Learning physics-based motion style with nonlinear inverse optimization. In ACM Transactions on Graphics (TOG), volume 24, pages 1071–1081. ACM, 2005.
  • [Ma et al.2018] Yuexin Ma, Xinge Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, and Dinesh Manocha. Trafficpredict: Trajectory prediction for heterogeneous traffic-agents. arXiv preprint arXiv:1811.02146, 2018.
  • [McNaughton2011] Matthew McNaughton. Parallel algorithms for real-time motion planning. 2011.
  • [Park et al.2018] SeongHyeon Park, ByeongDo Kim, Chang Mook Kang, Chung Choo Chung, and Jun Won Choi. Sequence-to-sequence prediction of vehicle trajectory via lstm encoder-decoder architecture. arXiv preprint arXiv:1802.06338, 2018.
  • [Pepy et al.2006] Romain Pepy, Alain Lambert, and Hugues Mounier. Path planning using a dynamic vehicle model. In Information and Communication Technologies, 2006. ICTTA’06. 2nd, volume 1, pages 781–786. IEEE, 2006.
  • [Scovanner and Tappen2009] Paul Scovanner and Marshall F Tappen. Learning pedestrian dynamics from the real world. In 2009 IEEE 12th International Conference on Computer Vision, pages 381–388. IEEE, 2009.
  • [Taheri1992] Saied Taheri. An investigation and design of slip control braking systems integrated with four-wheel steering. 1992.
  • [Zhou et al.2012] Bolei Zhou, Xiaogang Wang, and Xiaoou Tang. Understanding collective crowd behaviors: Learning a mixture model of dynamic pedestrian-agents. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 2871–2878. IEEE, 2012.