Recurrent Model Predictive Control

02/23/2021 ∙ by Zhengyu Liu, et al.

This paper proposes an offline algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The number of prediction steps is equal to the number of recurrent cycles of the learned policy function. With an arbitrary initial policy function, the proposed RMPC algorithm can converge to the optimal policy by directly minimizing the designed loss function. We further prove the convergence and optimality of the RMPC algorithm through the Bellman optimality principle, and demonstrate its generality and efficiency using two numerical examples.


I Introduction

Model Predictive Control (MPC) is a well-known method to solve finite-horizon optimal control problems online, which has been extensively investigated in various fields [1, 2, 3]. However, existing MPC algorithms still suffer from a major challenge: relatively low computation efficiency [4].

One well-known approach to tackle this issue is the move blocking technique, which assumes a constant control input over a fixed portion of the prediction horizon, increasing computational efficiency by reducing the number of variables to be optimized [5]. However, this solution cannot guarantee system stability and constraint satisfaction. In addition, Wang and Boyd (2009) proposed an early-termination interior-point method to reduce the calculation time by limiting the maximum number of iterations per time step [6].

However, these methods are still unable to meet the online computing requirements for nonlinear and large-scale systems. Some control algorithms instead calculate a near-optimal explicit policy offline and then implement it online. Bemporad et al. (2002) first proposed the explicit MPC method to increase computational efficiency, which partitions the constrained state space into several regions and calculates an explicit feedback control law for each region [7]. During online implementation, the on-board computer only needs to choose the corresponding state feedback control law according to the current system state, thereby reducing the burden of online calculation to some extent. Such algorithms are only suitable for small-scale systems, since the required storage capacity grows exponentially with the state dimension [8].

Furthermore, significant efforts have been devoted to approximate MPC algorithms, which reduce the number of polyhedral state regions and simplify the explicit control laws. Geyer et al. (2008) provided an optimal merging approach that reduces the number of partitions by merging regions with the same control law [9]. Jones et al. (2010) proposed a polytopic approximation method using double description and barycentric functions to estimate the optimal policy, which greatly reduced the partitions and could be applied to any convex problem [10]. Wen et al. (2009) proposed a piecewise continuous grid function to represent the explicit MPC solution, which reduced the storage requirements and improved online computational efficiency [11]. Borrelli et al. (2010) proposed an explicit MPC algorithm which can be executed partially online and partially offline [12]. In addition, some MPC studies employed a parameterized function to approximate the MPC controller, updating the function parameters by minimizing the MPC cost function with a fixed prediction horizon through supervised learning or reinforcement learning [13, 14, 15, 16].

Note that the policy performance and the computation time per step usually increase with the number of prediction steps. The above algorithms therefore have to trade off control performance against computation time constraints and select a conservative, fixed prediction horizon, while the available on-board computing resources often change dynamically. As a result, these algorithms tend to cause calculation timeouts or waste resources. In other words, they cannot adapt to the dynamic allocation of computing resources and make full use of the available computing time to select the longest possible prediction horizon.

In this paper, we propose an offline MPC algorithm, called Recurrent MPC (RMPC), for finite-horizon optimal control problems with large-scale nonlinearities and non-affine inputs. Our main contributions are summarized as follows:

  1. A recurrent function is employed to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. Compared to previous algorithms employing non-recurrent functions (such as multi-layer NNs), which must select a fixed prediction horizon in advance [13, 14, 15, 16], the recurrent structure enables the algorithm to select an appropriate prediction horizon according to the current computing resources. In particular, the output of the learned policy function after N recurrent cycles corresponds to the nearly optimal solution of N-step MPC.

  2. A policy optimization objective is designed by decomposing the MPC cost function according to Bellman's principle of optimality. The optimal recurrent policy can be obtained by directly minimizing the designed objective function. Therefore, unlike traditional explicit MPC algorithms [7, 8, 9, 10, 11, 12] that can only handle linear systems, the proposed algorithm is applicable to general nonlinear and non-input-affine systems. Meanwhile, the proposed RMPC algorithm utilizes the recursiveness of Bellman's principle: when the cost function of the longest prediction horizon is optimized, the cost functions of shorter horizons automatically become optimal. Thus the proposed algorithm can deal with problems of different, shorter prediction horizons while training only with an objective function defined for the longest prediction horizon. Other MPC algorithms [13, 14, 15, 16, 7, 8, 9, 10, 11, 12] do not consider the recursiveness of Bellman's principle: when the prediction horizon changes, the optimization problem must be reconstructed and the training or computing process re-executed for the new problem.

  3. The proposed RMPC algorithm computes the optimal control policy offline, so only a forward pass of the policy function needs to be evaluated online. Experiments show that it is over 5 times faster than traditional MPC algorithms [17, 18] under the same problem scale.

The paper is organized as follows. In Section II, we provide the formulation of the MPC problem. Section III presents the RMPC algorithm and proves its convergence. In Section IV, we present experiments that demonstrate the generality and effectiveness of the RMPC algorithm. Section V concludes the paper.

II Preliminaries

Consider the general time-invariant discrete-time dynamic system

x_{k+1} = f(x_k, u_k)    (1)

with state x_k ∈ X, control input u_k ∈ U, and system dynamics function f. We assume that f is Lipschitz continuous on a compact set X, and that the system is stabilizable on X.

Define the cost function of the N-step Model Predictive Control (MPC) problem as

V(x_0, N) = Σ_{i=1}^{N} l(x_i, r_i),   with   x_i = f(x_{i-1}, u_{i|N}),    (2)

where x_0 is the initial state, N is the length of the prediction horizon, r = {r_1, ..., r_N} is the reference trajectory, V(x_0, N) is the N-step cost function of state x_0 with reference r, u_{i|N} is the control input of the i-th step in the N-step prediction, and l is the utility function. The purpose of MPC is to find the optimal control sequence minimizing the objective V(x_0, N), which can be denoted as

[u*_{1|N}, u*_{2|N}, ..., u*_{N|N}] = arg min_{u_{1|N}, ..., u_{N|N}} V(x_0, N),    (3)

where the superscript * denotes optimality.
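As a concrete illustration of (2), the following sketch evaluates the N-step cost for a candidate control sequence by rolling the model forward; the dynamics f and utility l are illustrative placeholders standing in for the problem-specific functions.

import numpy as np

def mpc_cost(f, l, x0, u_seq, r_seq):
    """Evaluate the N-step MPC cost V(x0, N) in (2) by rolling out the model.

    f      -- dynamics: x_next = f(x, u)       (illustrative placeholder)
    l      -- utility:  stage cost l(x, r)     (illustrative placeholder)
    x0     -- initial state
    u_seq  -- candidate control sequence [u_1, ..., u_N]
    r_seq  -- reference trajectory       [r_1, ..., r_N]
    """
    x, cost = np.asarray(x0, dtype=float), 0.0
    for u, r in zip(u_seq, r_seq):
        x = f(x, u)          # propagate the model one step
        cost += l(x, r)      # accumulate the per-step utility
    return cost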

III Recurrent Model Predictive Control

III-A Recurrent Policy Function

In practical applications, we only need to execute the first control input u*_{1|N} of the optimal sequence in (3) at each step. Given a control problem, assume that N_max is the maximum feasible prediction horizon. Our aim is to make full use of the computation resources and adaptively select the longest prediction horizon, which means that we need to calculate and store the optimal control inputs u*_{1|N} for N = 1, 2, ..., N_max in advance. This requires us to find an efficient way to represent the policy and solve it offline.

We first introduce a recurrent function, denoted as π(x_0, r; θ, N), to approximate the control input u*_{1|N}, where θ is the vector of function parameters and N is the number of recurrent cycles of the policy function. The goal of the proposed Recurrent MPC (RMPC) algorithm is to find the optimal parameters θ*, such that

π(x_0, r; θ*, N) ≅ u*_{1|N},   ∀N ∈ {1, 2, ..., N_max}.    (4)

The structure of the recurrent policy function is illustrated in Fig. 1. All recurrent cycles share the same parameters θ, and h_i denotes the vector of hidden states.

Each recurrent cycle is mathematically described as

h_i = σ_h(W_h h_{i-1} + W_x [x_0; r] + b_h),   u_i = σ_u(W_u h_i + b_u),    (5)

where θ = {W_h, W_x, b_h, W_u, b_u} collects the weights and biases, and σ_h and σ_u are the activation functions of the hidden layer and the output layer, respectively.

Fig. 1: The structure of the recurrent policy function.
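The following minimal sketch implements one recurrent cycle in the spirit of (5); the parameter layout, array shapes, and the tanh activations are illustrative assumptions rather than the paper's exact choices.

import numpy as np

def policy_cycle(h_prev, x, r, theta):
    """One recurrent cycle of the policy function, following the form of (5).

    All cycles reuse the same parameters theta = (W_h, W_x, b_h, W_u, b_u);
    the concrete shapes and activations are illustrative assumptions.
    """
    W_h, W_x, b_h, W_u, b_u = theta
    z = np.concatenate([x, r])                    # policy input: state and reference
    h = np.tanh(W_h @ h_prev + W_x @ z + b_h)     # hidden-layer activation sigma_h
    u = np.tanh(W_u @ h + b_u)                    # output-layer activation sigma_u
    return h, u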

As shown in Fig. 1, the recurrent policy function calculates and outputs a control input at each recurrent cycle. Assuming that we have found the optimal parameters θ*, the output of the i-th cycle satisfies π(x_0, r; θ*, i) ≅ u*_{1|i} for i = 1, ..., N_max. This indicates that the more cycles, the longer the prediction horizon. In practical applications, the calculation time of each cycle varies due to the dynamic change of computing resource allocation (see Fig. 2). At each time step, the total time assigned to the control input calculation is assumed to be t_total. Denoting the total number of recurrent cycles completed within t_total as N, the applied control input is π(x_0, r; θ*, N), where N ≤ N_max.

Therefore, the recurrent policy is able to make full use of the computing resources and adaptively select the longest prediction horizon N. In other words, the more computing resources are allocated, the longer the prediction horizon that is selected, which usually leads to better control performance.
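This adaptive-horizon idea can be sketched as an anytime inference loop: keep running recurrent cycles until the time budget is exhausted and apply the control from the last completed cycle. The function names below (e.g., policy_cycle) are assumptions carried over from the sketch above.

import time

def rmpc_control(policy_cycle, h0, x, r, theta, t_total, n_max):
    """Run as many recurrent cycles as the time budget t_total allows
    (up to n_max) and return the control from the last completed cycle."""
    deadline = time.perf_counter() + t_total
    h, u = h0, None
    for _ in range(n_max):
        h, u = policy_cycle(h, x, r, theta)   # one more cycle = one more prediction step
        if time.perf_counter() >= deadline:
            break                             # computing budget exhausted
    return u                                  # corresponds to the longest affordable horizon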

Remark 1.

Previous MPC algorithms employ neural networks in a non-recurrent form [13, 14, 15, 16], which must fix the prediction horizon in advance. RMPC employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The recurrent structure enables the algorithm to select an appropriate prediction horizon according to the current computing resources: the output of the learned policy network after N recurrent cycles corresponds to the nearly optimal solution of N-step MPC.

Fig. 2: Maximum recurrent cycles in different cases.

III-B Objective Function

To find the optimal parameters θ* offline, we first need to represent the MPC cost function in (2) in terms of π, denoted by V(x_0, N; θ). From (2) and Bellman's principle of optimality, the global minimum of the N-step problem can be decomposed into the first-step utility plus the optimal cost of the remaining (N−1)-step problem:

V*(x_0, N) = min_{u_{1|N}} { l(x_1, r_1) + V*(x_1, N−1) }.

It follows that

Σ_{j=i}^{N} l(x*_j, r_j) = V*(x*_{i-1}, N−i+1),   i = 1, ..., N,    (6)

where x*_j denotes the state trajectory under the optimal control sequence in (3). Therefore, for the same state and reference, it is clear that

u*_{i|N}(x_0, r) = u*_{1|N−i+1}(x*_{i-1}, r).    (7)

This indicates that the i-th optimal control input in (3) can be regarded as the first optimal control input of the (N−i+1)-step MPC problem with initial state x*_{i-1}. Hence, by replacing each u_{i|N} in (2) with π(x_{i-1}, r; θ, N−i+1), the N-step MPC problem can also be solved by minimizing V(x_0, N; θ). Then, we obtain the N-step cost function in terms of θ:

V(x_0, N; θ) = Σ_{i=1}^{N} l(x_i, r_i),   with   x_i = f(x_{i-1}, π(x_{i-1}, r; θ, N−i+1)).    (8)

To find the optimal parameters θ* that make (4) hold, we can construct the following objective function:

J(θ) = E_{x_0 ∈ X} [ V(x_0, N_max; θ) ].    (9)
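A rough Monte-Carlo sketch of this objective, under the reading of (8) that the i-th control is produced by the policy after N_max − i + 1 recurrent cycles, is given below; policy, f, l and the batching scheme are illustrative assumptions.

import numpy as np

def rmpc_objective(policy, f, l, x0_batch, r, n_max):
    """Estimate J(theta) in (9) over a batch of initial states.
    policy(x, r, n) returns the control after n recurrent cycles; all names
    are illustrative placeholders, not the paper's exact implementation."""
    total = 0.0
    for x0 in x0_batch:
        x, cost = np.asarray(x0, dtype=float), 0.0
        for i in range(1, n_max + 1):
            u = policy(x, r, n_max - i + 1)   # control of the remaining-horizon problem
            x = f(x, u)                       # model rollout, see (8)
            cost += l(x, r[i - 1])            # per-step utility against the reference
        total += cost
    return total / len(x0_batch)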

Therefore, we can update θ by directly minimizing J(θ). The policy update gradient can be derived as

∇_θ J(θ) = E_{x_0 ∈ X} [ ∇_θ V(x_0, N_max; θ) ],    (10)

where ∇_θ V(x_0, N_max; θ) is obtained by applying the chain rule to the rollout in (8): the gradient of each per-step utility flows back through the model dynamics and through the shared parameters of every recurrent cycle. By defining two intermediate variables that accumulate these chain-rule terms, we obtain their recursive formulas, and the gradient expression can be simplified accordingly; the details of gradient backpropagation are shown in Fig. 3.

Fig. 3: Gradient backpropagation for the recurrent policy function.
Fig. 4: The training flowchart of the RMPC algorithm.

Taking the Gradient Descent (GD) method as an example, the updating rule of the policy function is

θ_{k+1} = θ_k − α ∇_θ J(θ_k),    (11)

where α denotes the learning rate and k indicates the k-th iteration.

The framework and pseudo-code of the proposed RMPC algorithm are shown in Fig. 4 and Algorithm 1, respectively.

  Given an appropriate learning rate α and an arbitrarily small positive number ε.
  Initialize θ_0 with arbitrary values
  repeat
     Randomly select initial states x_0 ∈ X
     Calculate ∇_θ J(θ_k) using (10)
     Update the policy function using (11)
  until |J(θ_{k+1}) − J(θ_k)| ≤ ε
Algorithm 1 RMPC algorithm
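A minimal training-loop sketch of Algorithm 1 is shown below, assuming a differentiable objective() that estimates J(θ) by rolling out the model with the policy network (e.g., in PyTorch); the stopping criterion mirrors the ε threshold above, and all names are illustrative.

import torch

def train_rmpc(policy_net, objective, x0_sampler, lr=1e-3, eps=1e-6, max_iters=100_000):
    """Sketch of Algorithm 1: plain gradient descent on J(theta) until the
    improvement falls below eps.  objective(policy_net, x0_batch) is assumed
    to return a differentiable estimate of J(theta)."""
    opt = torch.optim.SGD(policy_net.parameters(), lr=lr)   # GD step as in (11)
    prev_loss = float("inf")
    for _ in range(max_iters):
        x0_batch = x0_sampler()                  # randomly select initial states
        loss = objective(policy_net, x0_batch)   # J(theta), see (9)
        opt.zero_grad()
        loss.backward()                          # gradients via backpropagation, see (10)
        opt.step()                               # update theta, see (11)
        if abs(prev_loss - loss.item()) <= eps:  # stopping criterion of Algorithm 1
            break
        prev_loss = loss.item()
    return policy_net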
Remark 2.

Traditional explicit MPC algorithms [7, 8, 9, 10, 11, 12] can only handle linear systems. The proposed RMPC algorithm uses an optimization objective designed by decomposing the MPC cost function according to Bellman's principle of optimality. The optimal recurrent policy can be obtained by directly minimizing the designed objective function, without restrictions on the form of the system. Meanwhile, the proposed algorithm utilizes the recursiveness of Bellman's principle: when the cost function of the longest prediction horizon is optimized, the cost functions of shorter horizons automatically become optimal. Thus the proposed algorithm can deal with problems of different, shorter prediction horizons while training only with an objective function defined for the longest prediction horizon. Other MPC algorithms [13, 14, 15, 16, 7, 8, 9, 10, 11, 12] do not consider the recursiveness of Bellman's principle: when the prediction horizon changes, the optimization problem must be reconstructed and the training or computing process re-executed for the new problem.

III-C Convergence and Optimality

There are many types of recurrent functions with the structure defined in (5), and recurrent neural networks (RNNs) are the most commonly used. In recent years, deep RNNs have been successfully applied in many fields, such as natural language processing and system control, owing to their ability to process sequential data [19, 20]. Next, we will show that as the iteration index k → ∞, the optimal policy that makes (4) hold can be achieved using Algorithm 1, as long as π is an over-parameterized RNN. Over-parameterization means that the number of hidden neurons is sufficiently large. Before the main theorem, the following lemma and assumption need to be introduced.

Lemma 1.

(Universal Approximation Theorem [21, 22, 23]). Consider a sequence of finite functions {g_N(z)}, N = 1, ..., N_max, where z is the input with fixed input dimension, each g_N is a continuous function on a compact set, and the outputs share a common output dimension. Describe the RNN as in (5), where N is the number of recurrent cycles, the weights and biases are the parameters, and σ_h and σ_u are activation functions. Supposing the RNN is over-parameterized, there always exist parameters θ̄ such that the output of the RNN after N recurrent cycles approximates g_N(z) within an arbitrarily small error ε_1, for every N ≤ N_max.

Reported experimental results and theoretical proofs have shown that straightforward optimization methods, such as GD and stochastic GD (SGD), can find global minima of most training objectives in polynomial time if the approximating function is an over-parameterized neural network or RNN [24, 25]. Based on this fact, we make the following assumption.

Assumption 1.

If the approximating function π is an over-parameterized RNN, the global minimum of the objective function J(θ) in (9) can be found using an appropriate optimization algorithm such as SGD [26].

We now present our main result.

Theorem 1.

(Recurrent Model Predictive Control). Suppose π is an over-parameterized RNN. Through Algorithm 1, any initial parameters θ_0 will converge to θ*, such that (4) holds.

Proof. From Assumption 1, by repeatedly minimizing J(θ) using (11) we can always find parameters θ° that attain the global minimum of J(θ). According to the definition of J(θ) in (9), θ° minimizes the cost V(x_0, N_max; θ) over the state space. By Lemma 1, there always exist parameters θ̄ of the over-parameterized RNN whose outputs approximate the optimal control inputs u*_{1|N} for every N ≤ N_max within an arbitrarily small error. Since θ° is the global minimum of J(θ), its cost can be no greater than that of θ̄, so V(x_0, N_max; θ°) is arbitrarily close to the optimal cost. Then, according to (6), (7) and Bellman's principle of optimality, θ° also makes (4) hold, i.e., θ° = θ*.

Thus, we have proven that the RMPC algorithm converges to θ*. In other words, it can find the nearly optimal policy of MPC with different prediction horizons, whose output after the N-th recurrent cycle corresponds to the nearly optimal solution of N-step MPC.

IV Algorithm Verification

To evaluate the performance of the proposed RMPC algorithm, we choose the vehicle lateral control problem in a path-tracking task as an example [27].

IV-A Overall Settings

The policy network is trained offline on a PC and then deployed to an industrial personal computer (IPC). The vehicle dynamics used for policy training are different from the controlled plant. For online application, the IPC controller sends the control signal to the plant according to the state information and the reference trajectory, and the plant feeds the state information back to the IPC controller, thereby realizing the closed-loop control process. The feedback scheme of the hardware-in-the-loop (HIL) experiment is depicted in Fig. 5. The IPC controller is an ADLINK MXC-6401, equipped with an Intel i7-6820EQ CPU and 8 GB RAM, which is used as a vehicle on-board controller [28]. The plant is a real-time system, simulated by the vehicle dynamics model of CarSim [29]. The longitudinal speed is assumed to be constant, and the expected trajectory is shown in Fig. 16. The system states and control inputs of this problem are listed in Table I, and the vehicle parameters are listed in Table II.

Fig. 5: Schematic view of the experimental setup.
Mode    Name                                  Symbol   Unit
state   Lateral velocity                      v_y      [m/s]
        Yaw rate at center of gravity (CG)    ω_r      [rad/s]
        Longitudinal velocity                 v_x      [m/s]
        Yaw angle                             φ        [rad]
        Trajectory (lateral position)         y        [m]
input   Front wheel angle                     δ        [rad]
TABLE I: State and control input
Name                               Symbol   Value    Unit
Front wheel cornering stiffness    k_f      -88000   [N/rad]
Rear wheel cornering stiffness     k_r      -94000   [N/rad]
Mass                               m        1500     [kg]
Distance from CG to front axle     a        1.14     [m]
Distance from CG to rear axle      b        1.40     [m]
Polar moment of inertia at CG      I_z      2420     [kg·m²]
Tire-road friction coefficient     μ        1.0      [-]
Sampling frequency                 f_s      20       [Hz]
System frequency                   f_sys    20       [Hz]
TABLE II: Vehicle Parameters

IV-B Problem Description

The offline policy is trained based on nonlinear and non-input-affine vehicle dynamics, in which F_{y,f} and F_{y,r} are the lateral tire forces of the front and rear tires, respectively [30]. The lateral tire forces are approximated according to the Fiala tire model, where α is the tire slip angle, F_z is the tire load, μ is the friction coefficient, and the subscripts f and r denote the front and rear tires. The slip angles are calculated from the geometric relationship between the front/rear axle and the center of gravity (CG), and the loads on the front and rear tires are approximated from the static distribution of the vehicle mass between the two axles. The utility function of this problem penalizes the deviation of the vehicle state from the reference trajectory. Therefore, the policy optimization problem of this example can be formulated as the objective (9) with the above vehicle dynamics as the prediction model.
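For reference, the sketch below shows one common discretized bicycle model with Fiala tire forces, using the parameters of Table II; it is an illustrative formulation, not necessarily the exact dynamics or discretization used in the paper.

import numpy as np

# Parameters from Table II (cornering stiffnesses as positive magnitudes).
m, a, b, Iz, mu, g = 1500.0, 1.14, 1.40, 2420.0, 1.0, 9.81
Cf, Cr = 88000.0, 94000.0        # cornering stiffness magnitudes [N/rad]
Ts = 1.0 / 20.0                  # sampling period (20 Hz)

def fiala(alpha, C, Fz):
    """One common form of the Fiala brush tire model (an assumption)."""
    alpha_sl = np.arctan(3.0 * mu * Fz / C)
    if abs(alpha) < alpha_sl:
        ta = np.tan(alpha)
        return (-C * ta + C**2 / (3.0 * mu * Fz) * abs(ta) * ta
                - C**3 / (27.0 * mu**2 * Fz**2) * ta**3)
    return -mu * Fz * np.sign(alpha)

def step(x, delta):
    """Euler-discretized lateral dynamics; state x = [v_y, w_r, v_x, phi, y]."""
    vy, wr, vx, phi, y = x
    Fzf, Fzr = m * g * b / (a + b), m * g * a / (a + b)    # static axle loads
    alpha_f = np.arctan((vy + a * wr) / vx) - delta        # front slip angle
    alpha_r = np.arctan((vy - b * wr) / vx)                # rear slip angle
    Fyf, Fyr = fiala(alpha_f, Cf, Fzf), fiala(alpha_r, Cr, Fzr)
    vy_dot = (Fyf * np.cos(delta) + Fyr) / m - vx * wr
    wr_dot = (a * Fyf * np.cos(delta) - b * Fyr) / Iz
    return np.array([vy + Ts * vy_dot, wr + Ts * wr_dot, vx,   # v_x held constant
                     phi + Ts * wr,
                     y + Ts * (vx * np.sin(phi) + vy * np.cos(phi))])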

IV-C Algorithm Details

The policy function is represented by a variant of RNN called the Gated Recurrent Unit (GRU). The input layer is composed of the states, followed by 4 hidden layers using rectified linear units (ReLUs) as activation functions, and the output layer is a bounded activation layer scaled by the control bound to enforce the input constraint. We use the Adam method to update the network.
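A minimal PyTorch sketch of such a GRU-based recurrent policy is given below; the hidden size, the number of output layers, and the output scaling u_max are placeholders, since the exact hyperparameters are not reproduced here.

import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Sketch of a GRU-based recurrent policy as described above; layer sizes,
    the tanh output head, and u_max are illustrative assumptions."""
    def __init__(self, state_dim, ref_dim, hidden_dim=256, u_max=0.3):
        super().__init__()
        self.cell = nn.GRUCell(state_dim + ref_dim, hidden_dim)   # shared recurrent core
        self.head = nn.Sequential(                                # ReLU hidden layers
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1), nn.Tanh())                  # bounded output
        self.u_max = u_max

    def forward(self, x, r, n_cycles):
        """Run n_cycles recurrent cycles and return the control of each cycle."""
        z = torch.cat([x, r], dim=-1)
        h = torch.zeros(z.shape[0], self.cell.hidden_size, device=z.device)
        outputs = []
        for _ in range(n_cycles):
            h = self.cell(z, h)                        # same parameters every cycle
            outputs.append(self.u_max * self.head(h))  # scaled front-wheel angle
        return outputs                                 # outputs[k] approximates u*_1|(k+1)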

IV-D Result Analysis

Nonlinear MPC problems can be solved with optimization solvers such as ipopt [17] and bonmin [18], whose solutions can be approximately regarded as the numerically optimal solution.
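For comparison, a baseline N-step MPC can be set up with CasADi's Opti stack and solved by ipopt roughly as follows; f, l and the dimensions are illustrative placeholders and must be built from CasADi expressions.

import casadi as ca

def solve_nstep_mpc(f, l, x0, r_seq, n_x, n_u, N):
    """Baseline: solve the N-step MPC problem (2)-(3) numerically with CasADi/ipopt."""
    opti = ca.Opti()
    X = opti.variable(n_x, N + 1)          # predicted state trajectory
    U = opti.variable(n_u, N)              # control sequence to optimize
    opti.subject_to(X[:, 0] == x0)
    cost = 0
    for i in range(N):
        opti.subject_to(X[:, i + 1] == f(X[:, i], U[:, i]))   # dynamics constraints
        cost += l(X[:, i + 1], r_seq[i])                       # per-step utility
    opti.minimize(cost)
    opti.solver("ipopt")
    sol = opti.solve()
    return sol.value(U[:, 0])              # apply only the first optimal control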

Fig. 6 compares the computational efficiency of RMPC and the optimization solvers, implemented with the symbolic framework CasADi [31], under different prediction horizons for online application. The calculation time of the optimization solvers is much longer than that of RMPC, and the gap increases with the number of prediction steps. Specifically, at the longest prediction horizon, the fastest optimization solver, ipopt, is over 5 times slower than RMPC. This demonstrates the efficiency of the RMPC method.

Fig. 6: Computation time comparison between RMPC and the optimization solvers.

We run Algorithm 1 ten times and, at each iteration, calculate the policy error between the ipopt solution and RMPC for each prediction horizon N, defined as the deviation of the control output of the N-cycle policy network from the optimal control of the N-step MPC problem, normalized by the range (maximum minus minimum) of the optimal control over the sampled states. It therefore indicates the relative error of the control quantity of the N-cycle network with respect to the optimum of the N-step prediction control problem.

Fig. 7: Policy error during training. Solid lines are average values over 10 runs. Shaded regions correspond to the 95% confidence interval.

In Fig. 7, we plot the policy error curves during training for different prediction horizons N. It is clear that all the policy errors decrease rapidly to a small value during the training process; in particular, after sufficient iterations, the policy errors for all N reduce to less than 2%. This indicates that Algorithm 1 is able to find the near-optimal policy of MPC problems with different prediction horizons N.

Fig. 8 shows the policy performance of the ipopt solver solution and the learned policy with different prediction horizons. The policy performance is measured by the accumulated loss over 200 steps (10 s) of the simulation period, starting from a randomly initialized state, i.e.,

J_acc = Σ_{t=1}^{200} l(x_t, r_t).    (12)

For all prediction horizons, the learned policy performs as well as the solution of the ipopt solver. More recurrent cycles (i.e., longer prediction horizons) help reduce the accumulated cost J_acc.

Fig. 8: Performance comparison between the learned policy and the solution of ipopt solver. Solid lines are average values over 50 initialized states. Shaded regions correspond to 95% confidence interval.
Fig. 9: Control output comparison between the learned policy and the solution of the ipopt solver, recurrent cycles N = 15.
Fig. 16: Tracking results for policies with different numbers of recurrent cycles N; panels (a)-(f) show the results for different values of N.

In detail, Fig. 16 presents the control results of the learned policy with different recurrent cycles, and Fig. 9 compares the control output of the learned policy (after 15 recurrent cycles) with that of the ipopt controller. The trajectory produced by the RMPC controller almost overlaps with that of the ipopt controller, and the more recurrent cycles the learned policy uses, the smaller the trajectory tracking error. This is why we want to adaptively select the control law with the longest prediction horizon in real applications.

To summarize, the example demonstrates the optimality, efficiency and generality of the RMPC algorithm.

V Conclusion

This paper proposes the Recurrent Model Predictive Control (RMPC) algorithm to solve general nonlinear finite-horizon optimal control problems. Unlike traditional MPC algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs an RNN to approximate the optimal policy, which maps the system states and reference values directly to the control inputs; the output of the learned policy network after N recurrent cycles corresponds to the nearly optimal solution of N-step MPC. A policy optimization objective is designed by decomposing the MPC cost function according to Bellman's principle of optimality. The optimal recurrent policy can be obtained by directly minimizing the designed objective function, which is applicable to general nonlinear and non-input-affine systems. The convergence and optimality of RMPC are further proved. We demonstrate its optimality, generality and efficiency using a HIL experiment. Results show that RMPC is over 5 times faster than the traditional MPC algorithms, and that the control performance of the learned policy can be further improved as the number of recurrent cycles increases.

References

  • [1] S. J. Qin and T. A. Badgwell, “A survey of industrial model predictive control technology,” Control engineering practice, vol. 11, no. 7, pp. 733–764, 2003.
  • [2] S. Vazquez, J. Leon, L. Franquelo, J. Rodriguez, H. A. Young, A. Marquez, and P. Zanchetta, “Model predictive control: A review of its applications in power electronics,” IEEE Industrial Electronics Magazine, vol. 8, no. 1, pp. 16–31, 2014.
  • [3] S. E. Li, Z. Jia, K. Li, and B. Cheng, “Fast online computation of a model predictive controller and its application to fuel economy–oriented adaptive cruise control,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 3, pp. 1199–1209, 2014.
  • [4] J. H. Lee, “Model predictive control: Review of the three decades of development,” International Journal of Control, Automation and Systems, vol. 9, no. 3, p. 415, 2011.
  • [5] R. Cagienard, P. Grieder, E. C. Kerrigan, and M. Morari, “Move blocking strategies in receding horizon control,” Journal of Process Control, vol. 17, no. 6, pp. 563–570, 2007.
  • [6] Y. Wang and S. Boyd, “Fast model predictive control using online optimization,” IEEE Transactions on control systems technology, vol. 18, no. 2, pp. 267–278, 2009.
  • [7] A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos, “The explicit linear quadratic regulator for constrained systems,” Automatica, vol. 38, no. 1, pp. 3–20, 2002.
  • [8] B. Kouvaritakis, M. Cannon, and J. A. Rossiter, “Who needs qp for linear mpc anyway?” Automatica, vol. 38, no. 5, pp. 879–884, 2002.
  • [9] T. Geyer, F. D. Torrisi, and M. Morari, “Optimal complexity reduction of polyhedral piecewise affine systems,” Automatica, vol. 44, no. 7, pp. 1728–1740, 2008.
  • [10] C. N. Jones and M. Morari, “Polytopic approximation of explicit model predictive controllers,” IEEE Transactions on Automatic Control, vol. 55, no. 11, pp. 2542–2553, 2010.
  • [11] C. Wen, X. Ma, and B. E. Ydstie, “Analytical expression of explicit mpc solution via lattice piecewise-affine function,” Automatica, vol. 45, no. 4, pp. 910–917, 2009.
  • [12] F. Borrelli, M. Baotić, J. Pekar, and G. Stewart, “On the computation of linear model predictive control laws,” Automatica, vol. 46, no. 6, pp. 1035–1041, 2010.
  • [13] B. M. Åkesson, H. T. Toivonen, J. B. Waller, and R. H. Nyström, “Neural network approximation of a nonlinear model predictive controller applied to a ph neutralization process,” Computers & chemical engineering, vol. 29, no. 2, pp. 323–335, 2005.
  • [14] B. M. Åkesson and H. T. Toivonen, “A neural network model predictive controller,” Journal of Process Control, vol. 16, no. 9, pp. 937–946, 2006.
  • [15] L. Cheng, W. Liu, Z.-G. Hou, J. Yu, and M. Tan, “Neural-network-based nonlinear model predictive control for piezoelectric actuators,” IEEE Transactions on Industrial Electronics, vol. 62, no. 12, pp. 7717–7727, 2015.
  • [16] J. Duan, Z. Liu, S. E. Li, Q. Sun, Z. Jia, and B. Cheng, “Deep adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints,” arXiv preprint arXiv:1911.11397, 2019.
  • [17] A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006.
  • [18] P. Bonami, L. T. Biegler, A. R. Conn, G. Cornuéjols, I. E. Grossmann, C. D. Laird, J. Lee, A. Lodi, F. Margot, N. Sawaya et al., “An algorithmic framework for convex mixed integer nonlinear programs,” Discrete Optimization, vol. 5, no. 2, pp. 186–204, 2008.
  • [19] T. Mikolov, M. Karafiát, L. Burget, J. Černockỳ, and S. Khudanpur, “Recurrent neural network based language model,” in Eleventh annual conference of the international speech communication association, 2010.
  • [20] S. Li, H. Wang, and M. U. Rafique, “A novel recurrent neural network for manipulator control with improved noise tolerance,” IEEE transactions on neural networks and learning systems, vol. 29, no. 5, pp. 1908–1918, 2017.
  • [21] L. K. Li, “Approximation theory and recurrent networks,” in Proc. of IJCNN, vol. 2, pp. 266–271.   IEEE, 1992.
  • [22] A. M. Schäfer and H.-G. Zimmermann, “Recurrent neural networks are universal approximators,” International journal of neural systems, vol. 17, no. 04, pp. 253–263, 2007.
  • [23] B. Hammer, “On the approximation capability of recurrent neural networks,” Neurocomputing, vol. 31, no. 1-4, pp. 107–123, 2000.
  • [24] Z. Allen-Zhu, Y. Li, and Z. Song, “A convergence theory for deep learning via over-parameterization,” in International Conference on Machine Learning, pp. 242–252.   Long Beach, California, USA: ICML, 2019.
  • [25] S. Du, J. Lee, H. Li, L. Wang, and X. Zhai, “Gradient descent finds global minima of deep neural networks,” in International Conference on Machine Learning, pp. 1675–1685.   Long Beach, California, USA: ICML, 2019.
  • [26] Z. Allen-Zhu, Y. Li, and Z. Song, “On the convergence rate of training recurrent neural networks,” in Advances in Neural Information Processing Systems, pp. 6673–6685, 2019.
  • [27] R. Li, Y. Li, S. E. Li, E. Burdet, and B. Cheng, “Driver-automation indirect shared control of highly automated vehicles with intention-aware authority transition,” in 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 26–32.   Redondo Beach, California, USA: IEEE, 2017.
  • [28] C. Chen, J. Pan, X. Chang, Q. Xu, and J. Wang, “System design and function verification of an extensible autonomous driving platform,” in 2019 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5.   Sapporo, Japan: IEEE, 2019.
  • [29] R. Benekohal and J. Treiterer, “Carsim: Car-following model for simulation of traffic in normal and stop-and-go conditions,” Transportation Research Record, vol. 1194, pp. 99–111, 1988.
  • [30] J. Kong, M. Pfeiffer, G. Schildbach, and F. Borrelli, “Kinematic and dynamic vehicle models for autonomous driving control design,” in 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 1094–1099.   Seoul, South Korea: IEEE, 2015.
  • [31] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi: a software framework for nonlinear optimization and optimal control,” Mathematical Programming Computation, vol. 11, pp. 1–36, 2018.