1 Introduction
The recent development of deep learning has led to dramatic progress in multiple research fields, and this technique has naturally found applications in autonomous vehicles. The use of deep learning to perform perceptive tasks such as image segmentation has been widely researched in the last few years, and highly efficient neural network architectures are now available for such tasks. More recently, several teams have proposed taking deep learning a step further, by training so-called “end-to-end” algorithms to directly output vehicle controls from raw sensor data (see, in particular, the seminal work in [1]).
Although end-to-end driving is highly appealing, as it removes the need to design motion planning and control algorithms by hand, handing the safety of the car occupants to software operating as a black box seems problematic. A possible workaround is to use “forensics” techniques that can, to a certain extent, help understand the behavior of deep neural networks [2].
We choose a different approach, which consists in breaking down complexity by training simpler, mono-task neural networks to solve specific problems arising in autonomous driving; we argue that the reduced complexity of individual tasks allows much easier testing and validation.
In this article, we focus on the problem of controlling a car-like vehicle in highly dynamic situations, for instance to perform evasive maneuvers in the face of an obstacle. A particular challenge in such scenarios is the strong coupling between longitudinal and lateral dynamics when nearing the vehicle’s handling limits, which requires highly detailed models to be properly taken into account [3]. However, precisely modeling this coupling involves complex nonlinear relations between state variables, and using the resulting model is usually too costly for real-time applications. For this reason, most references in the field of motion planning focus on simpler models, such as point-mass or kinematic bicycle (single-track) models, which are constrained to avoid highly coupled dynamics [4]. Similarly, research on automotive control usually treats the longitudinal and lateral dynamics separately in order to simplify the problem [5].
Although these simplifications can yield good results in standard on-road driving situations, they may be problematic for vehicle safety when driving near the handling limits, for instance at high speed or on slippery roads. To handle such situations, some authors have proposed using Model Predictive Control (MPC) with a simplified, coupled dynamic model [6], which is limited to extremely short time horizons (a few dozen milliseconds) to allow real-time computation. Other authors have proposed to model the coupling between longitudinal and lateral motions using the concept of “friction circle” [7], which allows precisely stabilizing a vehicle in circular drifts [8]. However, the transition towards the stabilized drifting phase – which is critical in the ability, e.g., to perform evasive maneuvers – remains problematic with this framework.
In this article, we propose to use deep neural networks to implicitly model highly coupled vehicular dynamics, and to perform low-level control in real time. In order to do so, we train a deep neural network to output low-level controls (wheel torques and steering angle) corresponding to a given initial vehicle state and target trajectory. Compared to classical MPC frameworks, which require integrating dynamic equations online, this approach allows this integration to be performed offline and uses only simple mathematical operations online, leading to much faster computations.
Several authors have already proposed a divide-and-conquer approach by using machine learning on specific subtasks instead of performing end-to-end computations, in particular for motion planning and control. For instance, reference [9] used a Convolutional Neural Network (CNN) to generate a cost function from input images, which is then used inside an MPC framework for high-speed autonomous driving; however, this approach has the same limitations as model predictive control. Other approaches, such as [10], used reinforcement learning to output steering controls for a vehicle, but were limited to low-speed applications. Reference [11] used a Rectified Linear Unit (ReLU) network model to identify the dynamics of a helicopter in order to predict its future accelerations, but this model has not been used for control.
Closer to our work, reference [12] trained neural networks integrating a priori knowledge of the bicycle model for decoupled longitudinal and lateral control of a vehicle; in [13], the authors used supervised learning to generate lateral controls for a truck and integrated a control barrier function to ensure the safety of the system. Reference [14] coupled a standard controller and an adaptive neural network to compensate for unknown perturbations in order to perform trajectory tracking for autonomous underwater vehicles. To the best of our knowledge, deep neural networks have not been used in the literature for the coupled control of wheeled vehicles.
The rest of this article is organized as follows: Section 2 presents the vehicle model used to generate the training dataset and to simulate the vehicle dynamics on a test track. Section 3 introduces two artificial neural network architectures used to generate the control signals for a given target trajectory, and describes the training procedure used in this article. Section 4 compares the performance of these two networks, using simulation on a challenging test track. A comparison to conventional decoupled controllers is also provided. Finally, Section 5 concludes this study.
2 The 9 DoF Vehicle Model
In this section, we present the 9 Degrees of Freedom (9 DoF) vehicle model which is used both to generate the training and testing datasets, and as a simulation model to evaluate the performance of the deep-learning-based controllers.
The Degrees of Freedom comprise 3 DoF for the vehicle’s motion in a plane (), 2 DoF for the car body’s rotation () and 4 DoF for the rotational speed of each wheel (). The model takes into account both the coupling of longitudinal and lateral slips and the load transfer between tires. The control inputs of the model are the torques applied at each wheel and the steering angle of the front wheels. The low-level dynamics of the engine and brakes are not considered here. The notations are given in Table 1 and illustrated in Figure 1.
Remark: the subscript refers respectively to the front left (), front right (), rear left () and rear right () wheels.
Several assumptions were made for the model:

Only the front wheels are steerable.

The roll and pitch rotations happen around the center of gravity.

The aerodynamic force is applied at the height of the center of gravity. Therefore, it does not involve any moment on the vehicle.

The slope and road-bank angle of the road are not taken into account.
Table 1: Notations
Position of the vehicle in the ground frame
Roll, pitch and yaw angles of the car body
Longitudinal and lateral speed of the vehicle in its inertial frame
Total mass of the vehicle
Inertia of the vehicle around its roll, pitch and yaw axes
Inertia of the wheel
Total torque applied to the wheel
Longitudinal and lateral tire forces generated by the road on the wheel, expressed in the tire frame
Longitudinal and lateral tire forces generated by the road on the wheel, expressed in the vehicle frame
Normal reaction forces on the wheel
Distance between the front (resp. rear) axle and the center of gravity
Half-track of the vehicle
Height of the center of gravity
Effective radius of the wheel
Angular velocity of the wheel
Longitudinal speed of the center of rotation of the wheel, expressed in the tire frame
2.1 Vehicle dynamics
and denote respectively the longitudinal and the lateral tire forces expressed in the vehicle frame; denote the aerodynamic drag forces with the mass density of air, the aerodynamic drag coefficient and the frontal area of the vehicle; denote the damped mass/spring forces depending on the suspension travel due to the roll and pitch angles according to Equation (1f). The parameters and are respectively the stiffness and the damping coefficients of the suspensions.
(1f) 
2.2 Wheel dynamics
The dynamics of each wheel expressed in the pneumatic frame is given by Equation (2):
(2) 
2.3 Tire dynamics
The longitudinal force and the lateral force applied by the road on each tire and expressed in the pneumatic frame are functions of the longitudinal slip ratio , the sideslip angle , the normal reaction force and the road friction coefficient :
(3a)  
(3b) 
The longitudinal slip ratio of the wheel is defined as follows:
(4) 
The lateral slip angle of tire is the angle between the direction given by the orientation of the wheel and the direction of the velocity of the wheel (see Figure 1):
(5) 
In order to model the functions and , we used the combined slip tire model presented by Pacejka in [15] (cf. Equations (4.E1) to (4.E67)) which takes into account the interaction between the longitudinal and lateral slips on the force generation. Therefore, the friction circle due to the laws of friction (see Equation (6)) is respected. Finally, the impact of load transfer between tires is also taken into account through .
(6) 
Lastly, the relationships between the tire forces expressed in the vehicle frame and and the ones expressed in the pneumatic frame and are given in Equation (7):
(7a)  
3 Deep Learning Models
We propose two different artificial neural network architectures to learn the inverse dynamics of a vehicle, in particular the coupled longitudinal and lateral dynamics. An artificial neural network is a network of simple functions called neurons. Each neuron computes an internal state (activation) depending on the input it receives and a set of trainable parameters, and returns an output depending on the input and the activation. Most neural networks are organized into groups of units called layers and arranged in a tree-like structure, where the output of a layer is used as input for the following one. The training of the neural network consists in finding the set of parameters (weights and biases) minimizing the error (or loss) between predicted and actual values on a training dataset. In this paper, this training dataset is computed using the 9 DoF vehicle model presented in Section 2.
3.1 Dataset
The dataset generated by the 9 DoF vehicle model has a total of 43241 instances: it is divided into a train set of 28539 instances and a test set of 14702 instances. The following procedure was used to generate each instance:
First, a control to apply is generated randomly, as well as an initial state of the vehicle. More precisely, the vehicle is chosen to be either in an acceleration phase or in a deceleration phase with equal probability. In the first case, the torques at the front wheels and are set equal to each other and drawn from a uniform distribution between Nm and Nm, while the torques at the rear wheels and are set equal to zero (the vehicle is assumed to be front-wheel drive). In the second case, the torques at all wheels are set equal to each other and drawn from a uniform distribution between Nm and Nm. In both cases, the steering angle is drawn from a uniform distribution between and rad. The initial state is composed of the initial position of the vehicle in the ground frame, the longitudinal and lateral velocities and , the roll, pitch and yaw angles and their derivatives, and the rotational speed of each wheel. The initial longitudinal speed is drawn from a uniform distribution between and m/s; the initial lateral speed is drawn from a uniform distribution whose parameters depend on ; the rotational speed is chosen such that the longitudinal slip ratio is zero. All the other initial states are set to zero.
Secondly, the 9 DoF vehicle model is run for s, starting from the initial state and keeping the control constant during the whole simulation.
The resulting trajectories are downsampled to 301 timesteps, corresponding to a sampling time of ms.
Consequently, each instance of the dataset consists of an initial state of the vehicle, a control kept constant over time, and the associated trajectory obtained. The dataset generation method is summarized in Algorithm 1.
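The generation procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sampling ranges, the simulated duration and the integration step are hypothetical placeholders, and `toy_simulate` is a crude kinematic stand-in for the 9 DoF model, used only so that the example runs. Only the structure (random constant control, random initial speed, simulation, downsampling to 301 timesteps) follows the text.

```python
import numpy as np

# Hypothetical placeholder ranges, NOT the values used in the paper.
ACCEL_TORQUE_RANGE = (0.0, 500.0)     # Nm, applied to both front wheels
BRAKE_TORQUE_RANGE = (-1000.0, 0.0)   # Nm, applied to all four wheels
STEERING_RANGE = (-0.5, 0.5)          # rad
SPEED_RANGE = (5.0, 40.0)             # m/s
N_TIMESTEPS = 301                     # number of samples kept (from the text)


def sample_control(rng):
    """Draw a constant control [T_fl, T_fr, T_rl, T_rr, steering]."""
    steering = rng.uniform(*STEERING_RANGE)
    if rng.random() < 0.5:
        # Acceleration phase: equal front torques, zero rear torques.
        t_front = rng.uniform(*ACCEL_TORQUE_RANGE)
        torques = [t_front, t_front, 0.0, 0.0]
    else:
        # Deceleration phase: the same torque on every wheel.
        t_all = rng.uniform(*BRAKE_TORQUE_RANGE)
        torques = [t_all] * 4
    return np.array(torques + [steering])


def toy_simulate(v0, control, duration, dt, wheelbase=2.7, gain=1e-3):
    """Toy kinematic stand-in for the 9 DoF model (assumption, for illustration)."""
    n_steps = int(round(duration / dt))
    x, y, yaw, v = 0.0, 0.0, 0.0, v0
    accel = gain * control[:4].sum()
    steering = control[4]
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (x, y)
    for k in range(1, n_steps + 1):
        v = max(v + accel * dt, 0.0)
        yaw += v * np.tan(steering) / wheelbase * dt
        x += v * np.cos(yaw) * dt
        y += v * np.sin(yaw) * dt
        traj[k] = (x, y)
    return traj


def generate_instance(rng, duration=3.0, dt=1e-3):
    """One dataset instance: initial speed, constant control, (301, 2) trajectory."""
    control = sample_control(rng)
    v0 = rng.uniform(*SPEED_RANGE)
    full = toy_simulate(v0, control, duration, dt)
    keep = np.linspace(0, len(full) - 1, N_TIMESTEPS).round().astype(int)
    return {"initial_speed": v0, "control": control, "trajectory": full[keep]}


instance = generate_instance(np.random.default_rng(0))
```

Calling `generate_instance` repeatedly and splitting the results would give train and test sets with the same layout as the one described above.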
3.2 Model 1: Multi-Layer Perceptron
A Multi-Layer Perceptron (MLP), or multilayer feedforward neural network, is a neural network whose equations are:
(8a)  
(8b) 
where denotes the input vector, the output of layer , the number of layers of the MLP and denotes the th activation function. denotes the output vector of the neural network.
The MLP, presented in Figure 2, is used to predict the constant control to apply given an initial state and a desired trajectory . It is trained on the dataset presented in subsection 3.1. Its layers respectively contain 32, 32, 128, 32 and 128 neurons. All the activation functions of the network are rectified linear units (ReLU), i.e. $\mathrm{ReLU}(x) = \max(0, x)$. The loss function, weight initialization and regularization are discussed in Section 3.4, as they are common to the two proposed neural networks. We performed a grid search to choose the layer sizes among the $3^5 = 243$ possibilities obtained by allowing each of the five layers to have a size of either 32, 64, or 128 neurons, training the corresponding MLP for 200 epochs and evaluating its performance on the test dataset.
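As an illustration, the forward pass of such an MLP and the enumeration of the grid-search candidates can be sketched in NumPy. The input dimension (610), the output dimension (5 controls), the linear output layer and the Gaussian initialization are assumptions made for this example only; the layer sizes and the ReLU activations come from the text.

```python
import itertools

import numpy as np

HIDDEN_SIZES = (32, 32, 128, 32, 128)   # layer sizes given in the text


def relu(x):
    return np.maximum(x, 0.0)


def init_mlp(n_in, n_out, hidden=HIDDEN_SIZES, seed=0):
    """Create (weights, bias) pairs; the small Gaussian init is a placeholder."""
    rng = np.random.default_rng(seed)
    sizes = (n_in, *hidden, n_out)
    return [
        (rng.normal(0.0, 0.05, size=(fan_out, fan_in)), np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def mlp_forward(params, x):
    """Apply h_i = ReLU(W_i h_{i-1} + b_i), then a linear output layer (assumed)."""
    h = x
    for weights, bias in params[:-1]:
        h = relu(weights @ h + bias)
    weights, bias = params[-1]
    return weights @ h + bias


# Grid-search candidates: every layer can have 32, 64 or 128 neurons.
GRID = list(itertools.product((32, 64, 128), repeat=len(HIDDEN_SIZES)))

params = init_mlp(n_in=610, n_out=5)        # hypothetical input/output sizes
controls = mlp_forward(params, np.zeros(610))
```

Each tuple in `GRID` would be passed as `hidden` to build one candidate network; the configuration retained in the text, `(32, 32, 128, 32, 128)`, is one of them.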
3.3 Model 2: Convolutional Neural Network
Convolutional Neural Networks (CNN) are neural networks that use convolution in place of general matrix multiplication in at least one of their layers. A traditional CNN model almost always involves a sequence of convolution and pooling layers. CNNs have a proven history of being successful for processing data that has a known grid-like topology. For instance, numerous authors make use of CNNs for classification [17] or semantic segmentation [18] purposes.
We propose to use convolutions to preprocess the vehicle trajectory before feeding it to the MLP, as illustrated in Figure 3. Trajectories are time-series data, which can be thought of as a 1D grid of samples taken at regular time intervals, and thus are very good inputs to process with a CNN. We decided to process the X and Y coordinates separately. For each channel (either X or Y), we construct the following CNN module, which is depicted in Figure 4:
(9a)  
(9b) 
where is the output of the CNN module, the number of layers, the th activation function and the th pooling function.
The parameters of the CNN module are , with a convolution kernel size of 3 for all convolutions. The activation functions are all ReLU and the pooling functions are all average-pooling of size 2. The first two convolutions have 4 feature maps while the last convolution has only 1 feature map.
As the longitudinal and lateral dynamics are quite different, distinct sets of weights are used for the X and Y convolutions. After the X and Y 1D trajectories have been processed by their dedicated CNN modules, their outputs are concatenated. This new output is then fed to the former MLP, whose characteristics remain the same except for the dimension of its input. The whole model shown in Figure 3 is designated as the “CNN model” in the rest of this work.
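A minimal NumPy sketch of one such CNN module is given below. It assumes three blocks of convolution, ReLU and average pooling applied in that order, with "valid" (unpadded) convolutions and random placeholder weights; the padding scheme and the ordering inside a block are assumptions, while the kernel size (3), the pooling size (2) and the feature maps (4, 4, 1) follow the text.

```python
import numpy as np


def conv1d(x, w, b):
    """Valid 1D convolution (cross-correlation, as in most deep learning libraries).

    x: (c_in, length), w: (c_out, c_in, k), b: (c_out,) -> (c_out, length - k + 1)
    """
    k = w.shape[2]
    # windows has shape (c_in, length - k + 1, k)
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    return np.einsum("oik,ilk->ol", w, windows) + b[:, None]


def avg_pool(x, size=2):
    """Non-overlapping average pooling; a trailing incomplete window is dropped."""
    n = x.shape[1] // size
    return x[:, : n * size].reshape(x.shape[0], n, size).mean(axis=2)


def init_cnn_module(rng, feature_maps=(4, 4, 1), kernel=3):
    """Random placeholder weights for the three convolutions."""
    params, c_in = [], 1
    for c_out in feature_maps:
        params.append((rng.normal(0.0, 0.1, size=(c_out, c_in, kernel)), np.zeros(c_out)))
        c_in = c_out
    return params


def cnn_module(x, params):
    """Apply convolution -> ReLU -> average pooling for each block."""
    for w, b in params:
        x = avg_pool(np.maximum(conv1d(x, w, b), 0.0))
    return x


rng = np.random.default_rng(0)
traj_x = rng.normal(size=(1, 301))   # X coordinates of a trajectory (1 channel)
traj_y = rng.normal(size=(1, 301))   # Y coordinates of a trajectory (1 channel)
# Distinct weights for X and Y, outputs concatenated before the MLP.
features = np.concatenate([
    cnn_module(traj_x, init_cnn_module(rng)).ravel(),
    cnn_module(traj_y, init_cnn_module(rng)).ravel(),
])
```

Under these assumptions a 301-sample trajectory shrinks to 35 values per coordinate (301, 299, 149, 147, 73, 71, 35), so the trajectory part of the MLP input would have 70 entries; with padded convolutions the sizes would differ.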
3.4 Training procedure
The training procedure is the same for the two neural networks:
3.4.1 Weights Initialization & Batching
Each training batch is composed of 32 instances of the dataset. The Xavier initialization [19] (also known as Glorot uniform initialization) is used to set the initial random values of all the weights of our model.
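For reference, Glorot uniform initialization draws each weight from a uniform distribution on [-a, a] with a = sqrt(6 / (fan_in + fan_out)). The sketch below shows this rule together with a simple mini-batch iterator; shuffling the instances at every epoch and the helper names are assumptions of this example, not details stated in the text.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    """Glorot (Xavier) uniform initialization of a (fan_out, fan_in) weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))


def iterate_batches(n_instances, batch_size, rng):
    """Yield index arrays covering a shuffled dataset once (one epoch)."""
    order = rng.permutation(n_instances)
    for start in range(0, n_instances, batch_size):
        yield order[start:start + batch_size]


rng = np.random.default_rng(0)
weights = glorot_uniform(32, 128, rng)
batches = list(iterate_batches(28539, 32, rng))   # train set size from Section 3.1
```

With 28539 training instances and batches of 32, one epoch contains 892 batches, the last one holding the remaining 27 instances.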
3.4.2 Loss function, Regularization & Optimizer
The objective of the training is to reduce the mean square error (MSE) between the controls predicted by the neural network and the ones that were really applied to obtain the given trajectory. The neural network is trained in order to minimize the loss function defined by Equation (10) on the train dataset, before evaluation on the test dataset.
(10) 
where
(11a)  
(11b)  
(11c) 
The scaling factors and were chosen in order to normalize the steering and the torques. The parameter was chosen in order to prioritize the lateral dynamics over the longitudinal one. Equation (11c) is an L2 regularization of our model, where is the vector containing all the weights of the network. We set .
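The structure of such a loss can be sketched as follows. This is one plausible form only, written from the description above: the exact expression of Equations (10) and (11), the scaling factors, the weighting parameter and the regularization coefficient are not reproduced here, so every default value and the parameter names (`steer_scale`, `torque_scale`, `lateral_weight`, `l2_coeff`) are hypothetical.

```python
import numpy as np


def control_loss(pred, target, weights, steer_scale=0.5, torque_scale=1000.0,
                 lateral_weight=10.0, l2_coeff=1e-4):
    """Weighted MSE on normalized controls plus an L2 penalty on the weights.

    pred, target: (batch, 5) arrays laid out as [T_fl, T_fr, T_rl, T_rr, steering].
    weights: iterable of weight arrays of the network.
    All default values are hypothetical placeholders.
    """
    torque_err = (pred[:, :4] - target[:, :4]) / torque_scale
    steer_err = (pred[:, 4] - target[:, 4]) / steer_scale
    loss_torque = np.mean(torque_err ** 2)
    loss_steer = np.mean(steer_err ** 2)
    loss_reg = l2_coeff * sum(np.sum(w ** 2) for w in weights)
    # The steering term is weighted more heavily to prioritize lateral dynamics.
    return lateral_weight * loss_steer + loss_torque + loss_reg


target = np.zeros((4, 5))
pred = target.copy()
pred[:, 4] = 0.5          # steering off by exactly one scaling unit
loss = control_loss(pred, target, weights=[])
```

Normalizing before squaring keeps torque errors (hundreds of Nm) and steering errors (fractions of a radian) on comparable scales, which is what makes the relative weighting meaningful.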
To train our model, we used the Adam optimization algorithm [20]. It calculates an exponential moving average of the gradient and the squared gradient. For the decay rates of the moving averages, we used the parameters , . The values of other parameters were for the learning rate, and to avoid singular values.
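A single Adam update can be written in a few lines, which makes the role of the two moving averages explicit. The default hyperparameters below are those suggested in [20] and are used here only as an assumption; they are not necessarily the values used for the training described above.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter.

    m and v are the exponential moving averages of the gradient and of the
    squared gradient; eps avoids a division by zero.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** t)   # bias correction of the first moment
    v_hat = v / (1.0 - beta2 ** t)   # bias correction of the second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 51):
    x, m, v = adam_step(x, 2.0 * x, m, v, step, lr=0.01)
```

Because of the bias correction, the very first update has a magnitude close to the learning rate regardless of the gradient scale.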
4 Results
In order to compare their ability to learn the vehicle dynamics, the two different artificial neural networks are used as “controllers” (properly speaking, they are not real controllers as they do not learn how to reject disturbances and modeling errors). The reference track, presented in Figure 5, comprises both long straight lines and narrow curves. The reference speed is set to m/s on the whole track.
4.1 Generating the control commands
In order to compute the control commands to be applied to the vehicle, the artificial neural network needs to know the trajectory the vehicle has to follow in the next s, as in the train dataset. One problem that arises is that it has only learned to follow trajectories starting from its current position, such as in Figure 11. However, in practice, the vehicle is almost never exactly on the reference path. Therefore, a path section starting from the current position of the vehicle and bringing it back to the reference path is generated: for that purpose, cubic Bezier curves were chosen, as illustrated in Figure 12. Thus, at each iteration, (i) a Bezier curve with length s is computed to link the current position of the vehicle to the reference trajectory; (ii) a query comprising the previously computed Bezier curve is sent to the artificial neural network; (iii) the artificial neural network returns the torques at each wheel and the front steering angle to apply until the next control commands are obtained. The computation sequence is run every ms, even though the query takes less than ms.
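Step (i) can be sketched with the standard cubic Bezier formula. The placement of the two inner control points (along the vehicle heading and along the reference tangent, at one third of the straight-line distance) is an assumption of this example; the text does not specify how the control points are chosen.

```python
import numpy as np


def cubic_bezier(p0, p1, p2, p3, n_points=50):
    """Sample B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3."""
    t = np.linspace(0.0, 1.0, n_points)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)


def rejoin_path(position, heading, target, target_heading, n_points=50):
    """Bezier section from the vehicle pose to a point on the reference path.

    The curve leaves the vehicle along its heading and reaches the target
    tangent to the reference path (hypothetical control-point placement).
    """
    position = np.asarray(position, dtype=float)
    target = np.asarray(target, dtype=float)
    d = np.linalg.norm(target - position) / 3.0
    p1 = position + d * np.array([np.cos(heading), np.sin(heading)])
    p2 = target - d * np.array([np.cos(target_heading), np.sin(target_heading)])
    return cubic_bezier(position, p1, p2, target, n_points)


# Vehicle 0.5 m to the left of a straight reference path along the x axis.
path = rejoin_path(position=(0.0, 0.5), heading=0.0,
                   target=(10.0, 0.0), target_heading=0.0)
```

The sampled `path` would then be resampled to the network's input format before being sent as a query in step (ii).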
4.2 Comparison of the models
The results obtained for the MLP and the CNN models are displayed respectively in blue and in red in Figures 6 to 10. The resulting videos, obtained using the software PreScan [21], are available online (https://www.youtube.com/watch?v=yyWy1uavlXs). The results obtained with the CNN are clearly better than those obtained with the MLP. First, we observe that the control commands are smoother in curves with the CNN. There are steep steering (see Figure 6) and front torque (see Figure 7) variations for the MLP around m in road sections n and around m in road sections n. In the latter case, the steering angle reaches its saturation value rad and the wheel torques change suddenly from Nm to Nm and vice versa, which is impossible in practice. On the contrary, the control signals of the CNN model always remain smooth and within a reasonable range of values. Secondly, both the longitudinal and lateral errors are smaller for the CNN than for the MLP, as shown respectively in Tables 2 and 3.
Table 2: Longitudinal error
model  RMS  average  std. dev.  max
MLP  0.76  0.29  0.70  4.94
CNN  0.60  0.39  0.46  2.33

Table 3: Lateral error
model  RMS  average  std. dev.  max
MLP  0.61  0.003  0.61  3.26
CNN  0.43  0.014  0.43  1.7
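The four statistics reported in these tables can be computed from a logged error signal as sketched below. Taking "max" as the largest absolute error and using the population standard deviation are assumptions of this example. The identity RMS² = average² + std. dev.² gives a quick consistency check of the tabulated values.

```python
import numpy as np


def error_stats(errors):
    """RMS, average, standard deviation and maximum absolute value of an error signal."""
    e = np.asarray(errors, dtype=float)
    return {
        "rms": float(np.sqrt(np.mean(e ** 2))),
        "average": float(np.mean(e)),
        "std": float(np.std(e)),            # population standard deviation
        "max": float(np.max(np.abs(e))),    # assumed to be the largest |error|
    }


stats = error_stats([1.0, -1.0, 2.0, 0.0])

# Consistency check on the longitudinal rows: RMS ~ sqrt(average^2 + std^2).
rms_mlp = round(float(np.hypot(0.29, 0.70)), 2)
rms_cnn = round(float(np.hypot(0.39, 0.46)), 2)
```

For the longitudinal rows this check gives values that round to the reported RMS of 0.76 (MLP) and 0.60 (CNN).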
However, unlike classic controllers, stability cannot be ensured for these “controllers” as they are black boxes. In particular, for the CNN, we observe a lateral static error in straight lines. This static error is caused in fact by the Bezier curves which do not converge fast enough to the reference track on straight lines as only the first ms are really followed by the CNN model (see Figure 13). Moreover, Figure 6 shows that the steering angle applied during straight lines is the same for MLP and CNN.
4.3 Coupling between longitudinal and lateral dynamics
The speed limit a kinematic bicycle model can reach in a curve of radius is given by Equation (12) where is the road friction coefficient and the gravity constant [4]. This corresponds to m/s (m) in road section n2 and m/s (m) in road section n6. As the reference speed is set to m/s throughout the track, conventional decoupled longitudinal and lateral controllers (based on a kinematic bicycle model) will not perform well in road section n6.
(12) 
On the contrary, both models (especially the CNN) are able to pass this road section, showing the ability of artificial neural networks to handle coupled longitudinal and lateral dynamics. More precisely, we observe in Figure 9 that the speed is reduced in section n because the artificial neural networks deliberately brake (see Figures 7 and 8), even though the speed of the vehicle is below the reference speed. This is due to the loss function used during training and given by Equation (10), which penalizes steering angle errors more than torque errors. Hence, the models prioritize the lateral over the longitudinal dynamics.
Therefore, such “controllers” are particularly interesting for highly dynamic maneuvers such as emergency situations or aggressive driving, where the longitudinal and lateral dynamics are strongly coupled. However, they should be used sparingly as they are only black boxes, or should at least be supervised by model-based systems. Moreover, for normal driving situations, conventional decoupled longitudinal and lateral controllers should be preferred.
4.4 Comparison with decoupled controllers
Finally, the “controllers” obtained with the MLP and CNN models are compared with commonly used decoupled controllers: the lateral controller is either a pure-pursuit (PP) [22] or a Stanley [23] controller, while in both cases the longitudinal control is ensured by a Proportional-Integral (PI) controller with gains and . The gain for the front lateral error is for the Stanley controller. The preview distance of the pure-pursuit controller is defined as a function of the total speed at the center of gravity: where s is the anticipation time. The results of the PP and the Stanley controllers are shown respectively in green and grey in Figures 6 to 10. A clear decrease in performance can be observed when using these decoupled controllers in the challenging part of the track. In particular, the lateral error becomes very large in both cases during the sharp turn of road section n, while the CNN was able to perform reasonably well.
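For readers unfamiliar with these baselines, the textbook forms of the three control laws are sketched below. They are not the authors' implementations: sign conventions differ between references, the preview distance is assumed to be the anticipation time multiplied by the speed, and `min_preview` and `eps` are hypothetical safeguards added for this example.

```python
import math


def pure_pursuit_steering(alpha, speed, wheelbase, anticipation_time, min_preview=1.0):
    """Pure-pursuit law: delta = atan(2 L sin(alpha) / l_d).

    alpha is the angle between the vehicle heading and the look-ahead point;
    the preview distance l_d is assumed proportional to the speed.
    """
    preview = max(anticipation_time * speed, min_preview)
    return math.atan2(2.0 * wheelbase * math.sin(alpha), preview)


def stanley_steering(heading_error, front_lateral_error, speed, gain, eps=1e-3):
    """Stanley law: delta = heading error + atan(k e / v), e measured at the front axle."""
    return heading_error + math.atan2(gain * front_lateral_error, speed + eps)


class PIController:
    """Proportional-Integral controller for longitudinal speed tracking."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def step(self, error, dt):
        """Return the command for the current speed error (reference minus actual)."""
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


pi = PIController(kp=2.0, ki=1.0)       # hypothetical gains
command = pi.step(error=1.0, dt=0.5)    # speed error of 1 m/s held for 0.5 s
```

In such a decoupled scheme the steering law and the PI controller run independently, which is precisely what limits them when longitudinal and lateral dynamics interact strongly.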
5 Conclusions
This work presented some preliminary results on deep learning applied to trajectory tracking for autonomous vehicles. Two different approaches, namely an MLP and a CNN, were trained on a high-fidelity vehicle dynamics model in order to compute simultaneously the torque to apply on each wheel and the front steering angle from a given reference trajectory. It turns out that the CNN model provides better results, both in terms of accuracy and smoothness of the control commands. Moreover, compared to most of the existing controllers, it is able to handle situations with strongly coupled longitudinal and lateral dynamics in a very short computation time. However, the controller obtained is a black box and should not be used as a standalone system.
The results show the ability of deep learning algorithms to learn the characteristics of the vehicle dynamics. This opens a wide range of new possible applications of such techniques, for example for generating dynamically feasible trajectories. Future work will focus on (i) replacing the complex dynamics models by a model learned offline in Model Predictive Control for motion planning, (ii) using Generative Adversarial Networks (GAN) to generate safe trajectories where the learned dynamics are used as constraints, and (iii) performing real-world experiments with our approach on a real car.
References
 [1] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba, “End to End Learning for Self-Driving Cars,” arXiv:1604, pp. 1–9, apr 2016. [Online]. Available: http://arxiv.org/abs/1604.07316
 [2] D. Castelvecchi, “Can we open the black box of AI?” Nature News, vol. 538, no. 7623, p. 20, 2016.
 [3] T. D. Gillespie, “Vehicle dynamics,” Warrendale, 1997.
 [4] P. Polack, F. Altché, B. d’Andréa-Novel, and A. de La Fortelle, “Guaranteeing Consistency in a Motion Planning and Control Architecture Using a Kinematic Bicycle Model,” in American Control Conference, Milwaukee, WI, United States, 2018 (accepted). [Online]. Available: https://arxiv.org/pdf/1804.08290.pdf
 [5] A. Khodayari, A. Ghaffari, S. Ameli, and J. Flahatgar, “A historical review on lateral and longitudinal control of autonomous vehicle motions,” ICMET 2010 - 2010 International Conference on Mechanical and Electrical Technology, Proceedings, no. Icmet, pp. 421–429, 2010.
 [6] P. Falcone, F. Borrelli, J. Asgari, H. E. Tseng, and D. Hrovat, “Predictive active steering control for autonomous vehicle systems,” IEEE Transactions on Control Systems Technology, vol. 15, no. 3, pp. 566–580, 2007.
 [7] K. Kritayakirana and J. C. Gerdes, “Autonomous vehicle control at the limits of handling,” International Journal of Vehicle Autonomous Systems, vol. 10, no. 4, p. 271, 2012.
 [8] J. Y. Goh and J. C. Gerdes, “Simultaneous stabilization and tracking of basic automobile drifting trajectories,” in 2016 IEEE Intelligent Vehicles Symposium (IV), no. Iv. IEEE, jun 2016, pp. 597–602. [Online]. Available: http://ieeexplore.ieee.org/document/7535448/
 [9] P. Drews, G. Williams, B. Goldfain, E. A. Theodorou, and J. M. Rehg, “Aggressive Deep Driving: Model Predictive Control with a CNN Cost Model,” Proceedings of the 1st Annual Conference on Robot Learning, vol. 78, no. CoRL, pp. 133–142, jul 2017.
 [10] Se-Young Oh, Jeong-Hoon Lee, and Doo-Hyun Choi, “A new reinforcement learning vehicle control architecture for vision-based road following,” IEEE Transactions on Vehicular Technology, vol. 49, no. 3, pp. 997–1005, may 2000. [Online]. Available: http://ieeexplore.ieee.org/document/845116/
 [11] A. Punjani and P. Abbeel, “Deep learning helicopter dynamics models,” in 2015 IEEE International Conference on Robotics and Automation (ICRA), vol. 2015-June, no. June. IEEE, may 2015, pp. 3223–3230. [Online]. Available: http://ieeexplore.ieee.org/document/7139643/
 [12] I. Rivals, D. Canas, L. Personnaz, and G. Dreyfus, “Modeling and control of mobile robots and intelligent vehicles by neural networks,” in Proceedings of the Intelligent Vehicles ’94 Symposium. IEEE, 1994, pp. 137–142. [Online]. Available: http://ieeexplore.ieee.org/document/639489/
 [13] Y. Chen, A. Hereid, H. Peng, and J. Grizzle, “Synthesis of safe controller via supervised learning for truck lateral control,” arXiv, no. December, pp. 1–13, dec 2017. [Online]. Available: http://arxiv.org/abs/1712.05506
 [14] R. Cui, C. Yang, Y. Li, and S. Sharma, “Adaptive Neural Network Control of AUVs With Control Input Nonlinearities Using Reinforcement Learning,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 47, no. 6, pp. 1019–1029, jun 2017. [Online]. Available: http://ieeexplore.ieee.org/document/7812772/
 [15] H. B. Pacejka, Tyre and Vehicle Dynamics. Butterworth-Heinemann, 2002.
 [16] R. Rajamani, Vehicle Dynamics and Control. Springer, 2012.
 [17] Z. Dong, Y. Wu, M. Pei, and Y. Jia, “Vehicle type classification using a semi-supervised convolutional neural network,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 4, pp. 2247–2256, 2015.
 [18] V. Badrinarayanan, A. Kendall, and R. Cipolla, “Segnet: A deep convolutional encoder-decoder architecture for image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.
 [19] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249–256.
 [20] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [21] “Tass international,” http://www.tassinternational.com/prescan.
 [22] R. C. Coulter, Implementation of the Pure Pursuit Path Tracking Algorithm. Carnegie Mellon University, 1992.
 [23] S. Thrun, M. Montemerlo, H. Dahlkamp, D. Stavens, A. Aron, J. Diebel, P. Fong, J. Gale, M. Halpenny, G. Hoffmann, K. Lau, C. Oakley, M. Palatucci, V. Pratt, P. Stang, S. Strohband, C. Dupont, L.-E. Jendrossek, C. Koelen, C. Markey, C. Rummel, J. van Niekerk, E. Jensen, P. Alessandrini, G. Bradski, B. Davies, S. Ettinger, A. Kaehler, A. Nefian, and P. Mahoney, “Stanley: The robot that won the DARPA Grand Challenge: Research articles,” J. Robot. Syst., vol. 23, no. 9, pp. 661–692, Sept. 2006.