1 Introduction
In recent years, the number of learning-based applications has increased immensely. Particularly, in the autonomous driving field, we have witnessed new approaches such as end-to-end driving, where the goal consists in guiding the vehicle by means of learning algorithms and input sensor data. In Sallab et al. (2017), a deep reinforcement learning framework is proposed that takes raw sensor measurements as inputs and outputs driving actions. In Bojarski et al. (2016), the authors use a Convolutional Neural Network (CNN) to obtain the appropriate steering signal from the images of a single front camera.
Nowadays, from a control point of view, several strategies are starting to use learning tools to improve their capabilities while guaranteeing overall system stability. Advances in this field range from learning techniques for adjusting controllers and identifying model parameters to the control of nonlinear systems. In Lefèvre et al. (2015a, b), solutions for controlling the longitudinal velocity of a car based on learning human behaviour are presented. Also, a Model Predictive Control (MPC) technique using a deep CNN to predict cost functions from camera input images is developed in Drews et al. (2017). In Rosolia and Borrelli (2017); Rosolia et al. (2017); Rosolia and Borrelli (2019), the authors propose a reference-free iterative MPC strategy able to learn from previous-lap information.
Most of these approaches are based on learning policies that drive the vehicle independently of a physical model. In this work, we are instead interested in learning a realistic and accurate representation of the system dynamics to improve the control performance. In Kabzan et al. (2019), the authors use a simple initial vehicle model which is enhanced online by learning the model error using Gaussian process regression and measured vehicle data.
In this paper, we propose the use of the ANFIS approach, an adaptive neuro-fuzzy inference system, to learn the vehicle model. In the same manner as artificial neural networks, it works as a universal approximator (Jang, 1993). The main purpose of using ANFIS is to learn an input-output mapping from input data. This tool has been widely used in other engineering fields (Ndiaye et al., 2018; Jaleel and Aparna, 2019).
The main contribution of this work is to accurately model a nonlinear system as a structured Takagi-Sugeno (TS) representation of the vehicle by means of machine learning tools and input data. In particular, this paper takes advantage of the properties of the ANFIS tool to learn a data-driven TS system which is later used by a predictive optimal controller to solve the driving problem.
The paper is structured as follows. Section 2 presents the testing vehicle used in simulation. Section 3 details the proposed learning-based method and its main components. Section 4 formulates the control and estimation problems. Section 5 applies the methodology to a case study and presents various performance results. Finally, Section 6 draws several conclusions about the suitability of the method.
2 Testing vehicle
The Berkeley Autonomous Race Car (BARC, http://www.barcproject.com/) (Gonzales et al., 2016) is a development platform for autonomous driving used to achieve complex maneuvers. It is a 1/10-scale RWD electric remote control (RC) vehicle (see Figure 1) that has been modified to operate autonomously. Mechanically, it has been fitted with decks that protect the onboard electronics and sensors.
The nonlinear model used in this work for simulating the BARC vehicle is presented as

(1)
$$
\begin{aligned}
\dot{v}_x &= a - \frac{F_{yf}\,\sin\delta}{m} + \omega\, v_y - \mu\, g\\
\dot{v}_y &= \frac{F_{yf}\,\cos\delta + F_{yr}}{m} - \omega\, v_x\\
\dot{\omega} &= \frac{l_f\, F_{yf}\,\cos\delta - l_r\, F_{yr}}{I}
\end{aligned}
$$

where the dynamic vehicle variables $v_x$, $v_y$ and $\omega$ represent the body-frame velocities, i.e. the longitudinal, lateral and angular velocities, respectively. The control variables $\delta$ and $a$ are the steering angle at the front wheels and the longitudinal acceleration applied at the rear wheels, respectively.
$F_{yf}$ and $F_{yr}$ are the lateral forces produced by the front and rear tires, respectively. The simplified "Magic Formula" model is considered for simulating the lateral tire forces,

$$F_{yf} = D\,\sin\!\left(C\,\arctan\!\left(B\,\alpha_f\right)\right), \qquad F_{yr} = D\,\sin\!\left(C\,\arctan\!\left(B\,\alpha_r\right)\right)$$

where the parameters $B$, $C$ and $D$ define the shape of the curve. The front and rear slip angles are represented as $\alpha_f$ and $\alpha_r$, respectively. $m$ and $I$ represent the vehicle mass and yaw inertia, and $l_f$ and $l_r$ are the distances from the vehicle center of mass to the front and rear wheel axles, respectively. $\mu$ and $g$ are the static friction coefficient and the gravity constant, respectively. All the dynamic vehicle parameters are defined in Table 1.

Parameter      Value    Parameter      Value
$l_f$ [m]      0.125    $l_r$ [m]      0.125
$m$ [kg]       1.98     $I$ [kg m^2]   0.03
$B$            7.76     $C$            1.6
$D$            6.0      $\mu$          0.1
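As an illustration, model (1) can be simulated with a simple forward-Euler step. This is a sketch, not the authors' simulator: the function names and integration scheme are assumptions, the parameter-to-symbol pairing follows Table 1, and the slip-angle sign convention is one common choice.

```python
import numpy as np

# Parameters from Table 1 (symbol-to-value pairing assumed)
LF, LR = 0.125, 0.125     # CoG-to-axle distances [m]
M, I = 1.98, 0.03         # mass [kg] and yaw inertia [kg m^2]
B, C, D = 7.76, 1.6, 6.0  # simplified "Magic Formula" shape parameters
MU, G = 0.1, 9.81         # static friction coefficient, gravity [m/s^2]

def vehicle_dynamics(state, delta, a):
    """Right-hand side of model (1): returns d/dt [vx, vy, w]."""
    vx, vy, w = state
    # Front and rear slip angles
    alpha_f = delta - np.arctan2(vy + LF * w, vx)
    alpha_r = -np.arctan2(vy - LR * w, vx)
    # Simplified "Magic Formula" lateral tire forces
    Fyf = D * np.sin(C * np.arctan(B * alpha_f))
    Fyr = D * np.sin(C * np.arctan(B * alpha_r))
    dvx = a - (Fyf * np.sin(delta)) / M + w * vy - MU * G
    dvy = (Fyf * np.cos(delta) + Fyr) / M - w * vx
    dw = (LF * Fyf * np.cos(delta) - LR * Fyr) / I
    return np.array([dvx, dvy, dw])

def step(state, delta, a, dt=1.0 / 30.0):
    """One forward-Euler step at the 30 Hz rate used in Section 5."""
    return state + dt * vehicle_dynamics(state, delta, a)
```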
In this work, with the aim of making the simulation more realistic, Gaussian noise has been introduced in the measured variables as

(2) $\tilde{x} = x + w, \qquad w \sim \mathcal{N}(0, \sigma)$

where $\sigma$ is the signal covariance.
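A minimal sketch of the measurement model (2), assuming per-channel standard deviations (the actual covariance values used in Section 5 are those of (15)):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def measure(x, sigma):
    """Measurement model (2): true state plus zero-mean Gaussian noise
    with per-channel standard deviation sigma (illustrative values)."""
    return x + rng.normal(0.0, sigma, size=np.shape(x))
```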
3 Learning the TS model
In this section, we present the modeling methodology used to obtain the TS representation of the vehicle dynamic model. ANFIS (Jang, 1993) is an adaptive neuro-fuzzy inference system used for learning a particular structure from input-output data. In more detail, this modeling tool configures a neural network that learns the dynamic behaviour of the vehicle from I/O data using the back-propagation technique, while employing the Recursive Least Squares (RLS) method for adjusting additional parameters.
The methodology consists in providing a dataset to the modeling algorithm (ANFIS). This dataset is composed of vehicle states and inputs representing a set of driving maneuvers chosen to guarantee sufficiently rich data. After the learning procedure, ANFIS provides a set of linear parameters, also known as consequent parameters, and a set of premise (nonlinear) parameters that define the membership functions (MF) providing the nonlinear relationships between the different linear polynomials. One typical membership function is the generalized bell function.
However, obtaining the TS representation of a system from these resulting parameters is not trivial: the procedure requires inverting some of the steps that ANFIS performs internally. First, since the ANFIS algorithm can only be applied to Multi-Input Single-Output (MISO) systems, where just one output variable is considered, we split the system into as many MISO subsystems as the system has state variables. Our dynamic vehicle model is a third-order system, so three subsystems are obtained and three learning procedures are carried out. Once the algorithm has computed the consequent and premise parameters for each of the MISO subsystems, we build the polytopic TS state-space representation of each subsystem. To do so, the polynomial representation of each subsystem is first formulated as
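Since ANFIS handles one output variable at a time, the logged dataset must be arranged into one regression problem per state variable for the per-output learning runs. A hypothetical helper (names are illustrative) could look like:

```python
import numpy as np

def build_miso_datasets(X, U):
    """Arrange logged data into one MISO regression problem per state.

    X: (T, 3) array of states [vx, vy, w] sampled at a fixed rate.
    U: (T, 2) array of inputs [delta, a].
    Returns one (features, target) pair per subsystem: the features are
    the scheduling variables z(k) = [x(k), u(k)] and the target is the
    j-th state one step ahead, x_j(k+1).
    """
    Z = np.hstack([X[:-1], U[:-1]])  # z(k) for k = 0, ..., T-2
    return [(Z, X[1:, j]) for j in range(X.shape[1])]
```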
(3) $f_i(z) = p_{i,1}\, z_1 + p_{i,2}\, z_2 + \dots + p_{i,n}\, z_n + r_i, \qquad i = 1, \dots, \ell$

where $f_i$ stands for a linear polynomial representation of the subsystem dynamics at a particular state configuration, $p_{i,j}$ and $r_i$ are the consequent parameters obtained from ANFIS, $n$ stands for the number of scheduling variables (see Figure 2) and $\ell$ represents the number of polytopic vertexes.
Then, simply by reorganising the terms in (3), grouping those multiplying states and those multiplying inputs, as

(4) $f_i = \begin{bmatrix} p_{i,1} & \dots & p_{i,n_x} \end{bmatrix} x + \begin{bmatrix} p_{i,n_x+1} & \dots & p_{i,n} \end{bmatrix} u + r_i$

the polynomial structure is transformed into the discrete-time state-space representation given by

(5) $x_j(k+1) = A_i\, x(k) + B_i\, u(k) + C_i, \qquad i = 1, \dots, \ell$

where, in order to ease the comprehension from a control point of view, $x_j(k+1)$ represents the subsystem variable at the next discrete step, with $k$ denoting the discrete step. $A_i$, $B_i$ and $C_i$ define the so-called vertex systems, with $A_i \in \mathbb{R}^{1 \times n_x}$ and $B_i \in \mathbb{R}^{1 \times n_u}$.
At this point, we use the obtained premise parameters to formulate the membership functions. One of the most used is the generalized bell function. This is defined by three parameters ($a$, $b$ and $c$) as follows

(6) $\mu_j^m(z_j) = \dfrac{1}{1 + \left|\dfrac{z_j - c_j^m}{a_j^m}\right|^{2 b_j^m}}, \qquad m = 1, \dots, M, \quad j = 1, \dots, n$

where $z$ represents the ANFIS input vector of variables (from now on referred to as scheduling variables) and $M$ and $n$ represent the number of MFs per scheduling variable and the number of scheduling variables, respectively. For the common case where $M$ is two, the rule weighting functions are computed following
(7) $w_i(z) = \prod_{j=1}^{n} \mu_j^{i_j}(z_j), \qquad i = 1, \dots, \ell$

where $w_i$ corresponds to the weighting function that depends on each rule $i$, with each rule selecting one MF index $i_j$ per scheduling variable. Then, using

(8) $\bar{w}_i(z) = \dfrac{w_i(z)}{\sum_{l=1}^{\ell} w_l(z)}$

the normalized weights are obtained. Note that each scheduling variable is known and varies within a defined interval. Finally, the polytopic TS model for each subsystem is represented as
(9) $x_j(k+1) = \sum_{i=1}^{\ell} \bar{w}_i(z(k)) \left( A_i\, x(k) + B_i\, u(k) + C_i \right), \qquad j = 1, \dots, s$

where $s$ is the number of subsystems.
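Putting (6)-(9) together, the online evaluation of one learned MISO subsystem reduces to: evaluate one bell MF per scheduling variable and rule, multiply the MF values into rule weights, normalize, and blend the vertex consequents. The sketch below uses made-up parameter values and assumes the rules enumerate the Cartesian product of MF indices; it is an illustration, not the ANFIS implementation itself.

```python
import numpy as np
from itertools import product

def gbell(z, a, b, c):
    """Generalized bell MF (6): 1 at the centre c, half-width a, slope b."""
    return 1.0 / (1.0 + np.abs((z - c) / a) ** (2.0 * b))

def ts_predict(z, mf_params, vertices):
    """One-step TS prediction (9) for a single MISO subsystem.

    z:         scheduling vector (the n ANFIS inputs).
    mf_params: per scheduling variable, a list of (a, b, c) bell parameters.
    vertices:  per rule i, consequent parameters (p_i, r_i) of (3), with
               f_i(z) = p_i @ z + r_i.
    """
    # Rule firing strengths (7): product of one MF per scheduling variable,
    # one rule per combination of MF indices.
    combos = list(product(*[range(len(m)) for m in mf_params]))
    w = np.array([
        np.prod([gbell(z[j], *mf_params[j][m]) for j, m in enumerate(combo)])
        for combo in combos
    ])
    w_bar = w / w.sum()                             # normalization (8)
    f = np.array([p @ z + r for p, r in vertices])  # consequents (3)
    return w_bar, float(w_bar @ f)                  # blended output (9)
```

With two MFs per scheduling variable ($M = 2$) and $n = 2$, this enumerates $\ell = 4$ rules; the normalized weights always sum to one.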
4 TS Control and Estimation
In this section, we present the designs of the MPC and Moving Horizon Estimation (MHE) techniques based on the TS formulation.
4.1 TSMPC Design
Computing the predicted state behaviour over a given horizon when using a system dependent on scheduling variables (a TS system) can be a challenging task, sometimes leading to instantiation errors since the real future behaviour is unknown.
In this work, we propose the use of data coming from two different sources to better approximate the predictive instantiation and avoid a lack of convergence in the optimization procedure. On the one hand, data coming from a planner is used, which represents the desired state behaviour for tracking the desired trajectory. On the other hand, predicted states from the previous optimal realisation are also used to improve the TS model instantiation.
The model used in this section is the one presented in (11), where the vector of scheduling variables comprises the state and input variables. The use of this model allows formulating the MPC problem as a quadratic optimization problem that is solved at each time instant to determine the control actions, assuming the scheduling-variable values along the prediction horizon are known
(12)
$$\begin{aligned}
\min_{u} \quad & \sum_{k=0}^{H_p - 1} \left( x_k - x_k^{r} \right)^{T} Q \left( x_k - x_k^{r} \right) + \Delta u_k^{T} R\, \Delta u_k\\
\text{s.t.} \quad & x_{k+1} = A(z_k)\, x_k + B(z_k)\, u_k + C(z_k)\\
& x_0 = \hat{x}(t)\\
& u_k \in \mathcal{U}, \quad \Delta u_k \in \Delta\mathcal{U}\\
& x_{H_p} \in \Omega
\end{aligned}$$

where $\mathcal{U}$ and $\Delta\mathcal{U}$ constrain the system inputs and their variations, respectively.
The state vector is $x = [v_x, v_y, \omega]^T$, $\hat{x}$ is the estimated state vector, $x^{r}$ is the reference vector provided by the trajectory planner, $u = [\delta, a]^T$ is the control input vector and $H_p$ is the prediction horizon. The tuning matrices $Q$ and $R$ are positive definite in order to obtain a convex cost function. Closed-loop stability is guaranteed by introducing the terminal set $\Omega$ and the terminal constraint $x_{H_p} \in \Omega$. Both are computed following the design presented in Alcala et al. (2019). Note that the time discretization is embedded in the identification procedure, so the learned TS system is already in discrete time.
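To make the batch structure of (12) concrete, the sketch below solves the unconstrained special case with numpy only, freezing the TS model to a single (A, B, C) instantiation along the horizon and penalizing the inputs directly instead of their increments. It illustrates the QP construction under these simplifying assumptions; the actual controller additionally enforces the input, input-rate and terminal constraints and is solved with YALMIP/GUROBI.

```python
import numpy as np

def mpc_unconstrained(A, B, c, x0, x_ref, Q, R, Hp):
    """Batch solution of a simplified, unconstrained version of (12).

    Stacking x_{k+1} = A x_k + B u_k + c over Hp steps gives
    X = F x0 + G U + d, so minimizing
    sum ||x_k - x_ref||_Q^2 + ||u_k||_R^2 is linear least squares in U.
    """
    nx, nu = A.shape[0], B.shape[1]
    F = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(Hp)])
    G = np.zeros((Hp * nx, Hp * nu))
    d = np.zeros(Hp * nx)
    for k in range(Hp):
        acc = np.zeros(nx)
        for j in range(k + 1):
            Aj = np.linalg.matrix_power(A, k - j)
            G[k * nx:(k + 1) * nx, j * nu:(j + 1) * nu] = Aj @ B
            acc += Aj @ c
        d[k * nx:(k + 1) * nx] = acc
    Qb = np.kron(np.eye(Hp), Q)  # block-diagonal state weights
    Rb = np.kron(np.eye(Hp), R)  # block-diagonal input weights
    Xref = np.tile(x_ref, Hp)
    # Stationarity of U^T Rb U + (F x0 + G U + d - Xref)^T Qb (...)
    H = G.T @ Qb @ G + Rb
    g = G.T @ Qb @ (F @ x0 + d - Xref)
    return np.linalg.solve(H, -g).reshape(Hp, nu)
```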
4.2 TSMHE Design
For the vehicle presented in Section 2, the vehicle lateral velocity ($v_y$) is an unmeasurable variable and a necessary state for performing closed-loop control of the vehicle. In this paper, we solve the estimation problem using the MHE approach. The aim of the MHE is to compute the current dynamic states by running a constrained optimization that uses a set of past data and employs a system model. Using the TS model presented in (11), we can solve a quadratic optimization problem similar to the one in the TS-MPC algorithm to estimate the current dynamic states as follows
(13)
$$\begin{aligned}
\min_{\hat{x}, w} \quad & \sum_{k = t - N_e}^{t} v_k^{T} P\, v_k + w_k^{T} W\, w_k\\
\text{s.t.} \quad & \hat{x}_{k+1} = A(z_k)\,\hat{x}_k + B(z_k)\,u_k + C(z_k) + w_k\\
& v_k = y_k - C_y\,\hat{x}_k\\
& \hat{x}_k \in \mathcal{X}
\end{aligned}$$

that is solved online for

(14) $k = t - N_e, \dots, t$

where $\mathcal{X}$ is the constraint region for the dynamic states, $C_y$ selects the measured states, and $N_e$ and $n_x$ stand for the past-data horizon and the number of states, respectively. The matrices $P$ and $W$ are positive definite to generate a convex cost function, and $v$ and $w$ represent the estimation error and the process noise, respectively. The state and input vectors are $x = [v_x, v_y, \omega]^T$ and $u = [\delta, a]^T$. Note that, unlike the MPC technique, the MHE strategy performs an optimization taking into account a window of past vehicle data.
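As with the controller, the structure of (13) can be illustrated with an unconstrained, frozen-model special case that numpy solves in closed form: the decision variables are the states over the past window, the residuals $v_k$ and $w_k$ are weighted through Cholesky factors of $P$ and $W$, and everything is stacked into one linear least-squares problem. The output matrix `Cy` selecting the measured states is an assumption of this sketch; the actual TS-MHE additionally enforces the state constraints.

```python
import numpy as np

def mhe_unconstrained(A, B, Cy, y, u, P, W, Ne):
    """Batch least-squares version of the MHE problem (13), unconstrained.

    Minimizes sum_k v_k' P v_k + w_k' W w_k with
    v_k = y_k - Cy x_k (measurement residual) and
    w_k = x_{k+1} - A x_k - B u_k (model residual),
    over the window k = t-Ne, ..., t. Returns the newest state estimate.
    """
    nx = A.shape[0]
    n = (Ne + 1) * nx
    sqP, sqW = np.linalg.cholesky(P), np.linalg.cholesky(W)
    rows, rhs = [], []
    for k in range(Ne + 1):              # measurement residuals
        row = np.zeros((Cy.shape[0], n))
        row[:, k * nx:(k + 1) * nx] = sqP.T @ Cy
        rows.append(row)
        rhs.append(sqP.T @ y[k])
    for k in range(Ne):                  # model residuals
        row = np.zeros((nx, n))
        row[:, k * nx:(k + 1) * nx] = -sqW.T @ A
        row[:, (k + 1) * nx:(k + 2) * nx] = sqW.T
        rows.append(row)
        rhs.append(sqW.T @ (B @ u[k]))
    sol, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return sol.reshape(Ne + 1, nx)[-1]   # estimate of the newest state
```

On noise-free data from an observable model the stacked system admits a zero-residual solution, so the estimate recovers the unmeasured state exactly.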
5 Results
The data-driven model identification carried out by the proposed approach is used to learn a state-space TS formulation of the real vehicle dynamics. In Figure 4, the membership functions learned for each input after the offline identification procedure are shown on the left-hand side. These represent the fuzzy rules that will be used online for obtaining the current state-space representation.
Note that, since the discretization time is embedded in the input data of the learning procedure, a different sampling time cannot be selected afterwards.
The MPC and MHE strategies based on the presented data-driven approach are evaluated through a simulation scenario: a racing situation in which the autonomous scheme presented in Figure 3 is simulated.
First, at every sampling period, i.e. at 30 Hz, the racing planner provides the references for the control strategy such that the vehicle has to behave in a racing driving mode, which directly implies a more challenging control problem. Then, the TS-MHE optimal problem presented in (13) is solved to estimate the current vehicle state vector using past vehicle measurements. The next step is to instantiate the TS model matrices for the prediction stage using the approach presented in Section 3. Note that both the planned evolution information and the previous optimal prediction are used to achieve a good guess of the future values of the scheduling vector. At this point, the quadratic optimal problem (12) is solved using the estimated state variables and the references coming from the trajectory planner. Once the optimal control actions ($\delta$ and $a$) are computed, they are applied to the simulation vehicle presented in (1). As a consequence, the vehicle changes its state, which is measured by the sensor suite. Besides, with the aim of adding more realistic conditions to the problem, white Gaussian noise is added to the measured states with zero mean and covariances
(15) 
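The per-iteration ordering described above can be summarized in code. The sketch below only fixes the call order; the three callables are hypothetical stand-ins for the TS-MHE (13), the TS-model instantiation of Section 3 and the TS-MPC (12).

```python
import numpy as np

def control_iteration(meas_hist, input_hist, prev_pred, planner_refs,
                      estimate, instantiate, solve_mpc):
    """One 30 Hz closed-loop iteration of the scheme in Figure 3.

    estimate, instantiate and solve_mpc are placeholders for the TS-MHE,
    the TS-model instantiation and the TS-MPC, respectively.
    """
    # 1) Estimate the current state (v_y is not measured) from past data.
    x_hat = estimate(meas_hist, input_hist)
    # 2) Instantiate the TS matrices along the horizon, mixing the planner
    #    references with the previous optimal prediction.
    models = instantiate(planner_refs, prev_pred)
    # 3) Solve the quadratic problem (12) for the control actions and keep
    #    the predicted states for the next instantiation.
    u, x_pred = solve_mpc(x_hat, planner_refs, models)
    return u, x_pred
```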
Both TS-MPC and TS-MHE algorithms are coded in MATLAB. YALMIP and GUROBI (Gurobi Optimization, 2015) are used for solving the quadratic optimization problems, running on a DELL Inspiron 15 (Intel Core i7-8550U CPU @ 1.80 GHz x 8). In the controller, the tuning aims to minimize the longitudinal and angular velocity errors while computing smooth control actions. The diagonal terms of the weighting matrices in the cost function and the prediction horizon of (12), found by iterative tuning until the desired performance is achieved, are
(16)  
The TSMPC input constraints are defined as
(17a)  
(17b) 
In the estimator, the tuning aims to minimize the process noise while estimating the right value of $v_y$ by using the TS model. The diagonal terms of the weighting matrices in the cost function and the past horizon of (13), found by iterative tuning until the desired performance is achieved, are
(18)  
The TSMHE state region is defined in the polytope
(19a) 
Figure 5 shows both the reference and the response for each of the velocity variables. Note that the vehicle lateral velocity ($v_y$) cannot be measured and hence the signal presented is the estimated one. It can be seen that the controller is able to track the proposed references almost perfectly, although with some difficulty when driving in racing mode, i.e. after 85 seconds. Horizontal red lines represent the polytope boundaries for each of the scheduling variables, which in this approach coincide with the state and input vehicle variables. Note that these limits are imposed in the learning stage by the maximum and minimum values of the input signals, i.e. the scheduling variables.
In Figure 6, the optimal control actions are shown as well as their discrete time variations which are the ones minimized in the cost function of (12). Note that the steering angle reaches the upper and lower limits at some points while the rear wheel acceleration moves in a wider range.
Finally, after observing a good tracking performance in the previous figures, we present the elapsed time per iteration of the TS-MPC in Figure 7. It can be seen that, using a prediction horizon of 6 steps, the quadratic solver achieves an average of 4.8 ms per iteration. This is one of the most remarkable results of this approach.
6 Conclusions
In this paper, a learning-based approach for identifying the vehicle dynamics and formulating them as a TS representation has been presented. A TS-MPC strategy has then been proposed to solve the autonomous driving control problem under realistic conditions in real time. In addition, using racing-based references provided by an external planner, the controller makes the vehicle perform in racing mode. The control strategy has been tested in simulation, showing high performance potential in both reference tracking and computational time. However, this approach shares the limitation of learning-based procedures: the learned model is only valid in the operating region covered by the training data.
This work has been funded by the Spanish Ministry of Economy and Competitiveness (MINECO) and FEDER through the projects SCAV (ref. DPI201788403R) and HARCRICS (ref. DPI201458104R). The author is supported by a FI AGAUR grant (ref 2017 FI B00433).
References
Alcala et al. (2019). TS-MPC for autonomous vehicles including a TS-MHE-UIO estimator. IEEE Transactions on Vehicular Technology.
Bojarski et al. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
Drews et al. (2017). Aggressive deep driving: model predictive control with a CNN cost model. arXiv preprint arXiv:1707.05303.
Gonzales et al. (2016). Autonomous drifting with onboard sensors. In Advanced Vehicle Control: Proceedings of the 13th International Symposium on Advanced Vehicle Control (AVEC'16), Munich, Germany, pp. 133.
Gurobi Optimization (2015). Gurobi optimizer reference manual. URL http://www.gurobi.com.
Jaleel and Aparna (2019). Identification of realistic distillation column using hybrid particle swarm optimization and NARX based artificial neural network. Evolving Systems 10(2), pp. 149-166.
Jang (1993). ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics 23(3), pp. 665-685.
Kabzan et al. (2019). Learning-based model predictive control for autonomous racing. IEEE Robotics and Automation Letters 4(4), pp. 3363-3370.
Lefèvre et al. (2015a). A learning-based framework for velocity control in autonomous driving. IEEE Transactions on Automation Science and Engineering 13(1), pp. 32-42.
Lefèvre et al. (2015b). Autonomous car following: a learning-based approach. In 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 920-926.
Ndiaye et al. (2018). Adaptive neuro-fuzzy inference system application for the identification of a photovoltaic system and the forecasting of its maximum power point. In 2018 7th International Conference on Renewable Energy Research and Applications (ICRERA), pp. 1061-1067.
Rosolia and Borrelli (2017). Learning model predictive control for iterative tasks. A data-driven control framework. IEEE Transactions on Automatic Control 63(7), pp. 1883-1896.
Rosolia and Borrelli (2019). Learning how to autonomously race a car: a predictive control approach. arXiv preprint arXiv:1901.08184.
Rosolia et al. (2017). Autonomous racing using learning model predictive control. In 2017 American Control Conference (ACC), pp. 5115-5120.
Sallab et al. (2017). Deep reinforcement learning framework for autonomous driving. Electronic Imaging 2017(19), pp. 70-76.