TS-MPC for Autonomous Vehicle using a Learning Approach

In this paper, Model Predictive Control (MPC) and Moving Horizon Estimation (MHE) strategies using a data-driven approach to learn a Takagi-Sugeno (TS) representation of the vehicle dynamics are proposed to solve autonomous driving control problems in real time. To address the TS modeling, we use the Adaptive Neuro-Fuzzy Inference System (ANFIS) approach to obtain a set of polytopic-based linear representations as well as a set of membership functions relating the different linear subsystems in a non-linear way. The proposed control approach is fed with racing-based references from an external planner and with estimations from the MHE, achieving high driving performance in racing mode. The control-estimation scheme is tested in a simulated racing environment to show the potential of the presented approaches.


1 Introduction

In recent years, the number of learning-based applications has increased immensely. In the autonomous driving field in particular, we have witnessed new approaches such as end-to-end driving, where the goal is to guide the vehicle by means of learning algorithms and input sensor data. In Sallab et al. (2017), a deep reinforcement learning framework is proposed that takes raw sensor measurements as inputs and outputs driving actions. In Bojarski et al. (2016), the authors use a Convolutional Neural Network (CNN) to obtain the appropriate steering signal from the images of a single front camera.

Nowadays, from a control point of view, several strategies are starting to use learning tools to improve their capabilities while guaranteeing overall system stability. Advances in this field range from learning techniques that adjust controllers or identify model parameters to techniques that control non-linear systems directly. In Lefèvre et al. (2015a, b), solutions for controlling the longitudinal velocity of a car based on learning human behaviour are presented. Also, a Model Predictive Control (MPC) technique using a deep CNN to predict cost functions from camera input images is developed in Drews et al. (2017). In Rosolia and Borrelli (2017); Rosolia et al. (2017); Rosolia and Borrelli (2019), the authors propose a reference-free iterative MPC strategy able to learn from the information of previous laps.

Most of the above approaches are based on learning policies that drive the vehicle independently of a physical model. In this work, we are interested in learning a realistic and accurate representation of the system dynamics to improve the control performance. In Kabzan et al. (2019), the authors use a simple initial vehicle model which is enhanced on-line by learning the model error using Gaussian process regression and measured vehicle data.

In this paper, we propose the use of the Adaptive Neuro-Fuzzy Inference System (ANFIS) approach to learn the vehicle model. In the same manner as artificial neural networks, it works as a universal approximator (Jang, 1993). The main purpose of using ANFIS is to learn an input-output mapping from input data. This tool has been widely used in other engineering fields (Ndiaye et al., 2018; Jaleel and Aparna, 2019).

The main contribution of this work is to accurately model a non-linear system as a structured Takagi-Sugeno (TS) representation of the vehicle by means of machine learning tools and input data. In particular, this paper takes advantage of the properties of the ANFIS tool to learn a data-driven TS system which is later used by a predictive optimal controller to solve the driving problem.

The paper is structured as follows. Section 2 presents the testing vehicle used in simulation. Section 3 details the proposed learning-based method and its main components. Section 4 formulates the control and estimation problems. Section 5 introduces the application to a case study to assess the methodology, as well as various performance results. Finally, Section 6 presents several conclusions about the suitability of the method.

2 Testing vehicle

The Berkeley Autonomous Race Car (Gonzales et al., 2016) (BARC, http://www.barc-project.com/) is a development platform for autonomous driving aimed at achieving complex maneuvers. It is a 1/10-scale rear-wheel-drive (RWD) electric remote control (RC) vehicle (see Figure 1) that has been modified to operate autonomously. Mechanically, it has been fitted with decks that protect the on-board electronics and sensors.

Figure 1: Real picture of the vehicle used for simulation

The non-linear model used in this work for simulating the BARC vehicle is presented as

(1)
\dot{v}_x = a - \frac{F_f \sin\delta}{m} + \omega v_y - \mu g
\dot{v}_y = \frac{F_f \cos\delta + F_r}{m} - \omega v_x
\dot{\omega} = \frac{l_f F_f \cos\delta - l_r F_r}{I}

with the simplified "Magic Formula" lateral tire forces

F_f = D \sin\left(C \arctan(B \alpha_f)\right), \qquad F_r = D \sin\left(C \arctan(B \alpha_r)\right)
\alpha_f = \delta - \arctan\left(\frac{v_y + l_f \omega}{v_x}\right), \qquad \alpha_r = \arctan\left(\frac{l_r \omega - v_y}{v_x}\right)

where the dynamic vehicle variables v_x, v_y and \omega represent the body-frame velocities, i.e. the longitudinal, lateral and angular velocities, respectively. The control variables \delta and a are the steering angle at the front wheels and the longitudinal acceleration on the rear wheels, respectively. F_f and F_r are the lateral forces produced at the front and rear tires, respectively, following the simplified "Magic Formula" model, where the parameters B, C and D define the shape of the curve. Front and rear slip angles are represented by \alpha_f and \alpha_r, respectively. m and I represent the vehicle mass and inertia, and l_f and l_r are the distances from the vehicle center of mass to the front and rear wheel axles, respectively. \mu and g are the static friction coefficient and the gravity constant, respectively. All the dynamic vehicle parameters are defined in Table 1.

Parameter   Value        Parameter   Value
l_f         0.125 m      l_r         0.125 m
m           1.98 kg      I           0.03 kg·m²
B           7.76         C           1.6
D           6.0          μ           0.1
Table 1: Dynamic model parameters

In this work, with the aim of making the simulation more realistic, Gaussian noise has been introduced in the measured variables as

(2)   \tilde{x} = x + v, \qquad v \sim \mathcal{N}(0, \sigma)

where \sigma is the signal covariance.
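
To make the simulated plant concrete, the following minimal Python sketch implements the dynamics (1) and the noisy measurement (2) with the parameters of Table 1. It is an illustrative reconstruction, not the authors' code (the original experiments run in MATLAB), and the function names are assumptions of this sketch.

```python
import numpy as np

# Table 1 parameters (as mapped in this sketch)
lf, lr = 0.125, 0.125            # CoG -> front/rear axle distance [m]
m, I   = 1.98, 0.03              # mass [kg], yaw inertia [kg m^2]
B, C, D = 7.76, 1.6, 6.0         # simplified "Magic Formula" shape parameters
mu, g  = 0.1, 9.81               # static friction coefficient, gravity [m/s^2]

def f(x, u):
    """Continuous-time dynamics (1): x = [vx, vy, w], u = [delta, a]."""
    vx, vy, w = x
    delta, a = u
    alpha_f = delta - np.arctan2(vy + lf * w, vx)    # front slip angle
    alpha_r = np.arctan2(lr * w - vy, vx)            # rear slip angle
    Ff = D * np.sin(C * np.arctan(B * alpha_f))      # front lateral force
    Fr = D * np.sin(C * np.arctan(B * alpha_r))      # rear lateral force
    return np.array([
        a - Ff * np.sin(delta) / m + w * vy - mu * g,
        (Ff * np.cos(delta) + Fr) / m - w * vx,
        (lf * Ff * np.cos(delta) - lr * Fr) / I,
    ])

def measure(x, sigma, rng=np.random.default_rng(0)):
    """Noisy measurement of the state, as in (2)."""
    return x + rng.normal(0.0, sigma, size=x.shape)
```

A simple Euler step, x_next = x + Ts * f(x, u) with Ts = 1/30 s, is enough to advance this simulation at the 30 Hz rate used later in Section 5.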

3 Learning the TS model

In this section, we present the modeling methodology used for obtaining the TS representation of the vehicle dynamic model. ANFIS (Jang, 1993) is an adaptive neuro-fuzzy inference machine used for learning a particular structure from input-output (IO) data. In more detail, this modeling tool configures a neural network that learns the dynamic behaviour of the vehicle from IO data using the backpropagation technique, while also employing the Recursive Least Squares (RLS) method for adjusting additional parameters.

The methodology consists of providing a dataset to the modeling algorithm (ANFIS). The dataset is composed of vehicle states and inputs that represent a set of particular driving maneuvers, guaranteeing sufficiently rich data. Then, after the learning procedure, the algorithm provides a set of linear parameters, also known as consequent parameters, and a set of premise (non-linear) parameters that define the membership functions (MF) providing the non-linear relationships between the different linear polynomials. One typical membership function is the generalized Gaussian bell function.

However, obtaining the TS representation of a system from these resulting parameters is not trivial. The procedure is based on inverting some of the steps that ANFIS performs internally, which requires a set of reformulation steps. First, since the ANFIS algorithm can only be used for Multi-Input Single-Output (MISO) systems, where just one output variable can be considered, we split the system into MISO sub-systems, obtaining as many sub-systems as the system has state variables. Our dynamic vehicle model is a third-order system, so three sub-systems are obtained and three learning procedures are carried out. Once the algorithm has computed the consequent and premise parameters for each one of the MISO sub-systems, we build the polytopic TS state-space representation for each of these sub-systems. To do this, first, the polynomial representation of each sub-system is formulated as

(3)   x_i(k+1) = p^j_{i,1} z_1 + p^j_{i,2} z_2 + \cdots + p^j_{i,n} z_n, \qquad j = 1, \ldots, \ell

where x_i(k+1) stands for a linear polynomial representation of the dynamics of sub-system i at a particular state configuration, p^j_{i,1}, \ldots, p^j_{i,n} are the consequent parameters obtained from ANFIS, n stands for the number of scheduling variables (see Figure 2) and \ell represents the number of polytopic vertexes.

Figure 2: Polytopic TS learning scheme for the sub-system case

Then, simply by reorganising the terms in (3) as

(4)   x_i(k+1) = [p^j_{i,1} \;\; p^j_{i,2} \;\; p^j_{i,3}]\, x(k) + [p^j_{i,4} \;\; p^j_{i,5}]\, u(k)

the polynomial structure is transformed into the discrete-time state-space representation given by

(5)   x_i(k+1) = A^j_i\, x(k) + B^j_i\, u(k)

where, to ease comprehension from a control point of view, x_i(k+1) is represented as the sub-system variable at the next discrete step, with k representing the discrete step. A^j_i and B^j_i define the so-called vertex systems, and x(k) = [v_x \;\; v_y \;\; \omega]^\top and u(k) = [\delta \;\; a]^\top are the state and input vectors.
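
As a toy illustration of this reorganisation, the following fragment splits one rule's consequent vector into the state and input rows of (4); the ordering z = [v_x, v_y, \omega, \delta, a] and the numerical values are assumptions of this sketch.

```python
import numpy as np

# Consequent parameters of one rule j for one sub-system i (made-up values),
# ordered as the scheduling vector z = [vx, vy, w, delta, a].
p = np.array([0.97, 0.02, -0.01, 0.05, 0.03])

A_row = p[:3]   # row of vertex matrix A_i^j, multiplies x(k) = [vx, vy, w]
B_row = p[3:]   # row of vertex matrix B_i^j, multiplies u(k) = [delta, a]
```

Stacking the three sub-system rows yields the full 3x3 and 3x2 vertex matrices used in the sequel.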

At this point, we use the obtained premise parameters to formulate the membership functions. One of the most used is the generalized Gaussian bell function (gbell), defined by three parameters (a, b and c) as follows

(6)   \mu(z_i) = \frac{1}{1 + \left| \frac{z_i - c}{a} \right|^{2b}}

where z represents the ANFIS input vector of variables (from now on referred to as scheduling variables) and M and n represent the number of MFs per scheduling variable and the number of scheduling variables, respectively. For the common case where M is two, the rule weights are computed following

(7)   w_j(z) = \prod_{i=1}^{n} \mu^j_i(z_i), \qquad j = 1, \ldots, M^n

where w_j corresponds to the weighting function associated with each rule j. Then, using

(8)   \tilde{w}_j(z) = \frac{w_j(z)}{\sum_{l=1}^{M^n} w_l(z)}

the normalized weights \tilde{w}_j are obtained. Note that each scheduling variable is known and varies in a defined interval. Finally, the polytopic TS model for each sub-system is represented as

(9)   x_i(k+1) = \sum_{j=1}^{\ell} \tilde{w}_j(z) \left( A^j_i\, x(k) + B^j_i\, u(k) \right), \qquad i = 1, \ldots, s

where s is the number of sub-systems.
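
The on-line evaluation of (6)-(9) is straightforward; the sketch below shows one possible Python implementation, in which the premise-parameter layout and helper names are assumptions rather than the authors' code.

```python
import itertools
import numpy as np

def gbell(z, a, b, c):
    """Generalized bell membership function (6)."""
    return 1.0 / (1.0 + np.abs((z - c) / a) ** (2 * b))

def normalized_weights(z, premise):
    """Rule weights (7) and their normalization (8).

    premise[i] holds the M parameter triples (a, b, c) learned for
    scheduling variable z[i]; with M MFs per variable there are M**n rules.
    """
    mf = [[gbell(z[i], *p) for p in premise[i]] for i in range(len(z))]
    w = np.array([np.prod(combo) for combo in itertools.product(*mf)])
    return w / w.sum()

def instantiate_ts(z, premise, A_vertex, B_vertex):
    """Blend the vertex systems into A(z), B(z), as in (9)-(11)."""
    wt = normalized_weights(z, premise)
    A = sum(wt[j] * A_vertex[j] for j in range(len(wt)))
    B = sum(wt[j] * B_vertex[j] for j in range(len(wt)))
    return A, B
```

Given the current scheduling vector, instantiate_ts returns the discrete-time pair (A, B) that the controller and estimator use at that step.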

Finally, for this work we consider the third-order dynamic system presented in (1), which implies s = 3, and therefore the overall TS system is represented as

(10)   x(k+1) = \sum_{j=1}^{\ell} \tilde{w}_j(z) \left( A^j x(k) + B^j u(k) \right)

where each A^j and B^j stacks the three sub-system rows. From now on, for ease of reading, the system representation in (10) will be expressed as

(11)   x(k+1) = A(z)\, x(k) + B(z)\, u(k)

4 TS Control and Estimation

In this section, we present the formulations of the MPC and MHE techniques using the learned TS model.

4.1 TS-MPC Design

Predicting the state behaviour over a certain horizon when using a system that depends on scheduling variables (a TS system) can be a challenging task, sometimes leading to instantiation errors since the real future behaviour is unknown.

In this work, we propose the use of data coming from two different sources to better approximate the predictive instantiation and avoid convergence problems in the optimization procedure. On the one hand, data coming from a planner is used, representing the desired state behaviour for tracking the desired trajectory. On the other hand, predicted states from the previous optimal realisation are also used to improve the TS model instantiation.

Figure 3: Schematic view of the simulation set-up

The model used in this section is the one presented in (11), where the vector of scheduling variables is defined as z = [v_x, v_y, \omega, \delta, a]. The use of this model allows formulating the MPC problem as a quadratic optimization problem that is solved at each time instant to determine the control actions, considering that the values of \hat{x} and z are known:

(12)   \min_{u_0, \ldots, u_{N-1}} \; \sum_{k=0}^{N-1} \left[ (x_k - x^r_k)^\top Q\, (x_k - x^r_k) + \Delta u_k^\top R\, \Delta u_k \right] + (x_N - x^r_N)^\top P\, (x_N - x^r_N)

s.t.
x_{k+1} = A(z_k)\, x_k + B(z_k)\, u_k
x_0 = \hat{x}
u_k \in U, \qquad \Delta u_k \in \Delta U
x_N \in \Omega

where U and \Delta U constrain the system inputs and their variations, respectively.

The state vector is x = [v_x \;\; v_y \;\; \omega]^\top, \hat{x} is the estimated state vector, x^r is the reference vector provided by the trajectory planner, u = [\delta \;\; a]^\top is the control input vector and N is the prediction horizon. The tuning matrices Q and R are positive definite in order to obtain a convex cost function. Closed-loop stability is guaranteed by introducing the terminal cost P and the terminal constraint x_N \in \Omega, where \Omega represents the terminal set. Both are computed following the design presented in Alcala et al. (2019). Note that the time discretization is embedded inside the identification procedure, such that the learned TS system is already in discrete time.
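
As an illustration, a minimal CVXPY sketch of problem (12) follows. Because the TS matrices are instantiated (fixed) along the horizon before solving, the problem is a standard QP. The function signature and tuning arguments are assumptions of this sketch, and the terminal set \Omega is omitted (only the terminal cost is kept).

```python
import numpy as np
import cvxpy as cp

def ts_mpc_step(A_seq, B_seq, x_hat, x_ref, u_prev, N, Q, R, P,
                u_min, u_max, du_max):
    """One TS-MPC iteration: returns the first optimal input of (12)."""
    nx, nu = 3, 2                        # x = [vx, vy, w], u = [delta, a]
    x = cp.Variable((nx, N + 1))
    u = cp.Variable((nu, N))

    cost, constr = 0, [x[:, 0] == x_hat]
    for k in range(N):
        cost += cp.quad_form(x[:, k] - x_ref[:, k], Q)
        du = u[:, k] - (u_prev if k == 0 else u[:, k - 1])
        cost += cp.quad_form(du, R)      # penalize input variations
        constr += [x[:, k + 1] == A_seq[k] @ x[:, k] + B_seq[k] @ u[:, k],
                   u_min <= u[:, k], u[:, k] <= u_max,
                   cp.abs(du) <= du_max]
    cost += cp.quad_form(x[:, N] - x_ref[:, N], P)   # terminal cost

    # Any QP solver works; GUROBI matches the paper's setup if installed.
    cp.Problem(cp.Minimize(cost), constr).solve(solver=cp.GUROBI)
    return u[:, 0].value
```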

4.2 TS-MHE Design

For the vehicle presented in Section 2, the lateral velocity (v_y) is an unmeasurable variable and a necessary state for performing the closed-loop control of the vehicle. In this paper, we solve the estimation problem using the MHE approach. The aim of the MHE is to compute the current dynamic states by running a constrained optimization that uses a set of past data and employs a system model. Using the presented TS model in (11), we can run a quadratic optimization, similar to that in the TS-MPC algorithm, for estimating the current dynamic states as follows

(13)   \min_{\hat{x}, \hat{w}} \; \sum_{k=t-N_e}^{t} \hat{v}_k^\top P_v\, \hat{v}_k + \hat{w}_k^\top P_w\, \hat{w}_k

s.t.
\hat{x}_{k+1} = A(z_k)\, \hat{x}_k + B(z_k)\, u_k + \hat{w}_k
\hat{v}_k = y_k - C \hat{x}_k
\hat{x}_k \in X

that is solved online for

(14)   k \in [t - N_e, \; t]

where X \subseteq \mathbb{R}^{n_x} is the constraint region for the dynamic states, defined as X = \{\hat{x} : \underline{x} \le \hat{x} \le \overline{x}\}. N_e stands for the past-data horizon and n_x for the number of states. The matrices P_v and P_w are positive definite to generate a convex cost function, and \hat{v} and \hat{w} represent the estimation error and the process noise, respectively. The output y collects the measured variables (selected by C), and the state and input vectors are \hat{x} = [\hat{v}_x \;\; \hat{v}_y \;\; \hat{\omega}]^\top and u = [\delta \;\; a]^\top. Note that, unlike the MPC technique, the MHE strategy performs an optimization taking into account a window of past vehicle data.
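
A matching CVXPY sketch of (13) is given below, under the same caveats as the MPC sketch; here C is assumed to select v_x and \omega as the measured outputs, since v_y is not measurable.

```python
import numpy as np
import cvxpy as cp

def ts_mhe_step(A_seq, B_seq, y_meas, u_past, Pv, Pw, x_lb, x_ub):
    """One TS-MHE iteration: returns the current state estimate of (13)."""
    Ne, nx = len(u_past), 3
    x = cp.Variable((nx, Ne + 1))        # estimated state trajectory
    w = cp.Variable((nx, Ne))            # process-noise estimates
    C = np.array([[1.0, 0.0, 0.0],       # vx is measured
                  [0.0, 0.0, 1.0]])      # w (yaw rate) is measured, vy is not

    cost, constr = 0, []
    for k in range(Ne):
        v = y_meas[k] - C @ x[:, k]      # output estimation error
        cost += cp.quad_form(v, Pv) + cp.quad_form(w[:, k], Pw)
        constr += [x[:, k + 1] == A_seq[k] @ x[:, k]
                               + B_seq[k] @ u_past[k] + w[:, k],
                   x_lb <= x[:, k + 1], x[:, k + 1] <= x_ub]

    cp.Problem(cp.Minimize(cost), constr).solve(solver=cp.GUROBI)
    return x[:, Ne].value                # estimate at the current instant
```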

5 Results

The data-driven model identification carried out by the proposed approach is used to learn a state-space TS formulation of the real vehicle dynamics. In Figure 4, the membership functions learned for each input after the offline identification procedure are shown on the left side. These represent the fuzzy rules that are used online for obtaining the current state-space representation.

Figure 4: Input-Output scheme for the sub-system case

Note that, since the discretization time is embedded in the input data of the learning procedure, the selection of a different sampling time is not allowed.

The MPC and MHE strategies using the presented data-driven approach are evaluated in a simulation scenario, in which a racing situation is proposed and the autonomous scheme presented in Figure 3 is simulated.

First, at every sampling period (i.e. at 30 Hz), the racing planner provides the references for the control strategy such that the vehicle has to behave in a racing driving mode, which directly implies a more challenging control problem. Then, the TS-MHE optimal problem presented in (13) is solved to estimate the current vehicle state vector using past vehicle measurements. The next step is to instantiate the TS model matrices for the prediction stage using the approach presented in Section 3. Note that both the planning evolution information and the previous optimal prediction are used to achieve a good guess of the future values of the scheduling vector z. At this point, the quadratic optimal problem (12) is solved using the estimated state variables and the references coming from the trajectory planner. Once the optimal control actions (\delta and a) are computed, they are applied to the simulation vehicle presented in (1). As a consequence, the vehicle changes its state, which is measured by the set of sensors. Besides, with the aim of adding more realistic conditions to the problem, white Gaussian noise is added to the measured states with zero mean and covariances

(15)

Both the TS-MPC and TS-MHE algorithms are coded in MATLAB. YALMIP and GUROBI (Gurobi Optimization, 2015) are used for solving the quadratic optimization problems, running on a DELL Inspiron 15 (Intel Core i7-8550U CPU @ 1.80GHz x8). In the controller, the tuning aims to minimize the longitudinal and angular velocity errors while computing smooth control actions. The diagonal terms of the weighting matrices in the cost function and the prediction horizon of (12), found by iterative tuning until the desired performance is achieved, are

(16)

The TS-MPC input constraints are defined as

(17a)
(17b)

In the estimator, the tuning aims to minimize the process noise while estimating the right value of v_y by using the TS model. The diagonal terms of the weighting matrices in the cost function and the past horizon of (13), found by iterative tuning until the desired performance is achieved, are

(18)

The TS-MHE state region is defined by the polytope

(19a)
Figure 5: Vehicle states throughout the simulation. Horizontal red lines represent the upper and lower limits

Figure 5 shows both the reference and the response for each one of the velocity variables. Note that the vehicle lateral velocity (v_y) cannot be measured and hence the signal presented is the estimated one. It can be seen that the controller is able to track the proposed references accurately, with only slight difficulty when driving in racing mode, i.e. after 85 seconds. Horizontal red lines represent the polytope boundaries for each one of the scheduling variables, which in this approach coincide with the state and input vehicle variables. Note that these limits are imposed at the learning stage by the maximum and minimum values of the input signals, i.e. the scheduling variables.

In Figure 6, the optimal control actions are shown as well as their discrete-time variations, which are the ones minimized in the cost function of (12). Note that the steering angle reaches the upper and lower limits at some points, while the rear-wheel acceleration moves in a wider range.

Figure 6: Control actions and their time derivative variables throughout the simulation. Horizontal red lines represent the upper and lower limits

Finally, after observing a good tracking performance in the previous figures, we present the elapsed time per iteration of the TS-MPC in Figure 7. It can be seen that, using a prediction horizon of 6 steps, the quadratic solver achieves an average solution time of 4.8 ms per iteration. This is one of the most remarkable results of this approach.

Figure 7: Computational time required by the TS-MPC throughout the simulation

6 Conclusions

In this paper, a learning-based approach for identifying the dynamics of the vehicle and formulating them as a TS representation has been presented. Then, a TS-MPC strategy has been proposed to solve autonomous driving control problems under realistic conditions in real time. In addition, using racing-based references provided by an external planner, the controller makes the vehicle perform in racing mode. The control strategy has been tested in simulation, showing high performance potential in both reference tracking and computational time. However, this approach shares the main limitation of learning-based procedures: the model is only valid for the operating conditions covered by the training data.

This work has been funded by the Spanish Ministry of Economy and Competitiveness (MINECO) and FEDER through the projects SCAV (ref. DPI2017-88403-R) and HARCRICS (ref. DPI2014-58104-R). The author is supported by a FI AGAUR grant (ref 2017 FI B00433).

References

  • E. Alcala, V. P. Cayuela, and J. Q. Casin (2019) TS-MPC for autonomous vehicles including a TS-MHE-UIO estimator. IEEE Transactions on Vehicular Technology.
  • M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al. (2016) End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
  • P. Drews, G. Williams, B. Goldfain, E. A. Theodorou, and J. M. Rehg (2017) Aggressive deep driving: model predictive control with a CNN cost model. arXiv preprint arXiv:1707.05303.
  • J. Gonzales, F. Zhang, K. Li, and F. Borrelli (2016) Autonomous drifting with onboard sensors. In Advanced Vehicle Control: Proceedings of the 13th International Symposium on Advanced Vehicle Control (AVEC'16), September 13-16, 2016, Munich, Germany, pp. 133.
  • Gurobi Optimization, Inc. (2015) Gurobi optimizer reference manual. URL http://www.gurobi.com.
  • E. A. Jaleel and K. Aparna (2019) Identification of realistic distillation column using hybrid particle swarm optimization and NARX based artificial neural network. Evolving Systems 10 (2), pp. 149–166.
  • J. Jang (1993) ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics 23 (3), pp. 665–685.
  • J. Kabzan, L. Hewing, A. Liniger, and M. N. Zeilinger (2019) Learning-based model predictive control for autonomous racing. IEEE Robotics and Automation Letters 4 (4), pp. 3363–3370.
  • S. Lefèvre, A. Carvalho, and F. Borrelli (2015a) A learning-based framework for velocity control in autonomous driving. IEEE Transactions on Automation Science and Engineering 13 (1), pp. 32–42.
  • S. Lefèvre, A. Carvalho, and F. Borrelli (2015b) Autonomous car following: a learning-based approach. In 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 920–926.
  • A. Ndiaye, M. A. Tankari, G. Lefebvre, et al. (2018) Adaptive neuro-fuzzy inference system application for the identification of a photovoltaic system and the forecasting of its maximum power point. In 2018 7th International Conference on Renewable Energy Research and Applications (ICRERA), pp. 1061–1067.
  • U. Rosolia and F. Borrelli (2017) Learning model predictive control for iterative tasks. A data-driven control framework. IEEE Transactions on Automatic Control 63 (7), pp. 1883–1896.
  • U. Rosolia and F. Borrelli (2019) Learning how to autonomously race a car: a predictive control approach. arXiv preprint arXiv:1901.08184.
  • U. Rosolia, A. Carvalho, and F. Borrelli (2017) Autonomous racing using learning model predictive control. In 2017 American Control Conference (ACC), pp. 5115–5120.
  • A. E. Sallab, M. Abdou, E. Perot, and S. Yogamani (2017) Deep reinforcement learning framework for autonomous driving. Electronic Imaging 2017 (19), pp. 70–76.