Delay-aware Robust Control for Safe Autonomous Driving

09/15/2021 ∙ by Dvij Kalaria, et al. ∙ IIT Kharagpur, Carnegie Mellon University

With the advancement of affordable self-driving vehicles using complicated nonlinear optimization but limited computation resources, computation time becomes a matter of concern. Other factors such as actuator dynamics and actuator command processing cost also unavoidably cause delays. In high-speed scenarios, these delays are critical to the safety of a vehicle. Recent works consider these delays individually, but none unifies them all in the context of autonomous driving. Moreover, recent works inappropriately consider computation time as a constant or a large upper bound, which makes the control either less responsive or over-conservative. To deal with all these delays, we present a unified framework by 1) modeling actuation dynamics, 2) using robust tube model predictive control, and 3) using a novel adaptive Kalman filter without assuming a known process model and noise covariance, which makes the controller safe while minimizing conservativeness. On one hand, our approach can serve as a standalone controller; on the other hand, our approach provides a safety guard for a high-level controller that assumes no delay. This can be used to compensate for the sim-to-real gap when deploying a black-box learning-enabled controller, trained in a simplistic environment without considering delays, on practical vehicle systems.


I Introduction

The recent surge in autonomous driving has increased the demand for making autonomous vehicles more affordable and accessible. The stringent requirements of onboard computing and high-resolution sensors pose a major challenge to meeting this demand. While there has been much work on making algorithms more efficient, the required computation time can still be dangerous in fast-changing environments such as high-speed scenarios.

Most commonly used control algorithms assume the planned command to be applied instantaneously. This assumption can generate significant tracking errors and jeopardize stability. This work presents a unified approach to dealing with three types of delay: 1) computation time delay; 2) actuator command processing delay; and 3) actuator dynamics delay. First, we model the actuator’s steering dynamics delay using a first-order ordinary differential equation (ODE). Second, we propose a delay-aware robust tube Model Predictive Control (MPC). It is coupled with our proposed filter, called INFLUENCE - adaptIve kalmaN FiLter with Unknown process modEl and Noise CovariancE (including process noise and measurement noise). INFLUENCE is a novel adaptive Kalman filter variant combining online identification of an unknown dynamic model and estimation of noise covariances. INFLUENCE probabilistically safely estimates the computation time delay. We propose two controller plans: the first plan serves as a standalone controller for a delay-aware robust control; in the second controller plan, we compensate a learning-enabled (LE) primary controller to boost its safety performance. This controller plan has application merit, since LE controllers are being used in autonomous systems. However, simplistic assumptions are usually made in the training procedure of the LE controllers. For safety-critical systems such as autonomous driving, it is crucial to close the sim-to-real gap by providing a safety-assured guard. We treat the LE controller as a high-level controller. In the low-level control part, we track the reference generated by the LE controller, but actively regulate its unsafe control by delay compensation.

In summary, we make the following novel contributions:

1. A unified delay-aware robust control approach dealing with three major delays: computation time delay, actuator command processing delay, and actuator dynamics delay.

2. INFLUENCE is a probabilistic framework for real-time estimation of computation time instead of taking an upper bound, which makes the algorithm safe while minimizing conservativeness. INFLUENCE has application merit for general safe prediction problems, since it does not assume known process model and noise covariance.

3. A control plan to safely compensate for these delays for an LE controller that does not consider delays.

The rest of this paper is organized as follows. Section II provides a review of some important related work. Section III describes the methodology. Section IV presents the experimental results. The conclusions are in Section V.

II Related Works

In the low-level control layer of a vehicle system, many algorithms ignore the delay arising from various factors such as computation, actuator command processing, sensor delay, and actuator dynamics. For the compensation of computation delay for discrete MPC, [2] proposes the simple solution of shifting the initial state by one control step, approximating it from the prediction model. However, this is not suitable if the computation time is more than one control step. [15] further proposes to use a buffer to store control commands from the previous batch. It also proposes the use of a pre-compensating unit and robust tube MPC to prove the safety of the system under bounded perturbation. This plan is well suited for static scenarios, where the objective is nearly constant throughout. However, for highly dynamic scenarios where the planned trajectory changes rapidly, taking the upper bound as the horizon length might lead to the algorithm’s being less responsive. We instead propose to use an adaptive Kalman filter to approximate a local upper bound on computation time and adapt to its changing values instead of taking an upper bound as the horizon length.

For compensation of delay due to actuator dynamics, [12] proposes approximating the actuator dynamics by a first-order ODE, after which the actuator state can be augmented in the state space model. The approach has been tested on differential braking stability control. We adopt a similar design, but use it to compensate for steering delay.

For the delay caused by the processing of actuator commands at the actuator, [8] proposes an initial state transition method similar to the compensation for computation delay, but with a continuous preview controller. It proposes a closed-loop solution to compensate for the dead time between when the command is planned and when it reaches the actuator. [6] further extends the idea by including compensation for actuator saturation, which makes the solution deployable on real systems with control limits. However, with the use of the preview controller, it becomes difficult to include state constraints in the system. [10] proposes a simple way to compensate for the sensor delay by transforming the frame of sensor values to reflect values at the current time and not at the time when they were recorded. Our work, however, considers computation delay, actuator command processing delay, and actuator dynamics as well as control and state constraints under one optimization framework in the context of autonomous vehicle control.

III Methodology

III-A Notation

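A polytope is defined as the convex hull of a finite set of points in $n$-dimensional space, $P = \mathrm{conv}\{v_1, \dots, v_m\}$ with $v_i \in \mathbb{R}^n$. The Minkowski sum of two polytopes is defined as $A \oplus B = \{a + b \mid a \in A,\ b \in B\}$. The Pontryagin difference of two polytopes is defined as $A \ominus B = \{a \mid a + b \in A,\ \forall b \in B\}$.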
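As a small illustration of the two set operations, the sketch below restricts them to axis-aligned boxes, where both reduce to interval arithmetic. The box representation and the function names are assumptions made for this example only; general polytopes need a computational-geometry library.

```python
from typing import List, Tuple

# A box is one (lower, upper) pair per dimension (illustrative representation).
Box = List[Tuple[float, float]]


def minkowski_sum(a: Box, b: Box) -> Box:
    """A (+) B = {x + y | x in A, y in B}, computed per dimension."""
    return [(al + bl, au + bu) for (al, au), (bl, bu) in zip(a, b)]


def pontryagin_difference(a: Box, b: Box) -> Box:
    """A (-) B = {x | x + y in A for all y in B}, computed per dimension."""
    result = []
    for (al, au), (bl, bu) in zip(a, b):
        lower, upper = al - bl, au - bu
        if lower > upper:
            raise ValueError("Pontryagin difference is empty")
        result.append((lower, upper))
    return result


# Tightening a constraint box by a disturbance box, then inflating it again.
state_box = [(-5.0, 5.0), (-2.0, 2.0)]
tube_box = [(-1.0, 1.0), (-0.5, 0.5)]
tightened = pontryagin_difference(state_box, tube_box)
```

Tightening a constraint set by the tube and then adding the tube back recovers the original box here, which is the property tube MPC relies on when it constrains the nominal trajectory.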

III-B System dynamics

A kinematic bicycle model describes the dynamics of the vehicle with the state variables being position ($x$, $y$), heading angle ($\theta$), and velocity ($v$), and the control variables acceleration ($a$) and steering angle ($\delta$). It is commonly assumed in the literature that steering angle and acceleration are applied instantaneously by the actuators. In reality, there is a certain lag between the command and the moment the actuator physically modifies the steering angle state, which is called the actuator dynamic delay. We modify the system dynamics to also include the actual steering angle as a state, denoted as $\delta_a$. The control command is now the desired steering angle $\delta$. We approximate the change in the steering angle state by a first-order ODE similar to [12], i.e., $\dot{\delta}_a = K_\delta(\delta - \delta_a)$, where $K_\delta$ is the inverse of the time constant for the steering actuator. For acceleration, pedal dynamics are assumed to be instantaneous for the experiments in this paper. However, they can also be approximated in the same way. After discretization, the vehicle state is now modified as $[x, y, \theta, v, \delta_a]^T$. The discrete dynamics are given in Eq. 1.

(1a)
(1b)
(1c)
(1d)
(1e)
where the curvature $\kappa = \tan(\delta_a)/L$, $L$ is the vehicle length, and the travel distance $d = v\,\Delta t + \tfrac{1}{2} a\,\Delta t^2$.
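The following sketch shows one way such a discrete update could look. It is not a transcription of Eq. 1: it assumes constant curvature and constant acceleration over a step, an exact discretization of the first-order steering lag, and hypothetical names (`bicycle_step`, `wheelbase`, `k_steer`).

```python
import math


def bicycle_step(state, accel, steer_cmd, dt, wheelbase, k_steer):
    """One discrete step of a kinematic bicycle with first-order steering lag.

    state = (x, y, heading, speed, actual_steer). All modelling choices here
    are assumptions for illustration, not the paper's exact Eq. 1.
    """
    x, y, theta, v, steer = state
    dist = v * dt + 0.5 * accel * dt ** 2      # distance travelled in the step
    kappa = math.tan(steer) / wheelbase        # path curvature from actual steer
    if abs(kappa) < 1e-9:                      # straight-line limit
        x_next = x + dist * math.cos(theta)
        y_next = y + dist * math.sin(theta)
    else:                                      # motion along a circular arc
        x_next = x + (math.sin(theta + kappa * dist) - math.sin(theta)) / kappa
        y_next = y + (math.cos(theta) - math.cos(theta + kappa * dist)) / kappa
    theta_next = theta + kappa * dist
    v_next = v + accel * dt
    # Exact solution of d(steer)/dt = k_steer * (steer_cmd - steer) over dt.
    steer_next = steer_cmd + (steer - steer_cmd) * math.exp(-k_steer * dt)
    return (x_next, y_next, theta_next, v_next, steer_next)


straight = bicycle_step((0.0, 0.0, 0.0, 10.0, 0.0), 2.0, 0.0, 0.1, 2.7, 5.0)
lagged = bicycle_step((0.0, 0.0, 0.0, 10.0, 0.0), 0.0, 0.3, 0.1, 2.7, 5.0)
```

In `lagged`, the actual steering angle moves only part of the way toward the commanded 0.3 rad within one step, which is the actuator dynamic delay the augmented state captures.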

III-C Robust Tube MPC

For a discrete linear system with system matrices $A$ and $B$, let the control gain $K$ be such that the feedback system $A + BK$ is stable. Let $\mathcal{Z}$ be the disturbance-invariant set for the controlled uncertain system $x^+ = (A + BK)x + w$, satisfying $(A + BK)\mathcal{Z} \oplus \mathcal{W} \subseteq \mathcal{Z}$, where the disturbance $w$ is assumed to be bounded ($w \in \mathcal{W}$) by a polyhedron that contains the origin in its interior. The following finite optimization problem is solved at each step for the current state and the reference state sequence obtained from the path planner, where $N$ is the horizon length, and $\bar{x}$ and $\bar{u}$ are the state and control sequences of a nominal system ignoring $w$.

(2)

where $Q$, $R$ and $Q_f$ are the state, control and terminal state cost matrices, respectively. The control command given would be $u = \bar{u}_0 + K(x - \bar{x}_0)$, where $x$ is the current state. This guarantees $x_k \in \bar{x}_k \oplus \mathcal{Z}$ for any $w \in \mathcal{W}$, i.e., all states will be inside the constraint set $\mathcal{X}$. However, for a nonlinear system as in our case, we use the equivalent LTV system, where the system matrices $A$ and $B$ are replaced with the Jacobian matrices of the system dynamics in Eq. 1 evaluated at the current state. For more details as well as a detailed proof of the feasibility and the stability of the above controller, see [14]. Also, a mismatch between the linearized and the actual model can be compensated by adding an additional disturbance, assuming the nonlinear model function to be Lipschitz [4]. For the experiments in this work, we assume the disturbance margin is large enough to cover this extra disturbance.
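To make the tube property concrete, the sketch below applies the ancillary feedback law to a scalar system and checks that the deviation from the nominal trajectory stays inside an invariant interval. The scalar plant, the gain, the constant nominal input, and the function names are all assumptions chosen for illustration; no optimization is solved here.

```python
import random


def tube_control(u_nominal, x_nominal, x_actual, gain):
    """Nominal command plus feedback on the deviation from the nominal state."""
    return u_nominal + gain * (x_actual - x_nominal)


def invariant_radius(a, b, gain, w_max):
    """Radius of an invariant interval for e+ = (a + b*gain) e + w, |w| <= w_max."""
    rho = abs(a + b * gain)
    if rho >= 1.0:
        raise ValueError("closed loop must be stable")
    return w_max / (1.0 - rho)


def worst_deviation(a, b, gain, w_max, steps, seed=0):
    """Simulate nominal and disturbed systems; return the largest deviation."""
    rng = random.Random(seed)
    x_nom = x_act = 0.0
    worst = 0.0
    for _ in range(steps):
        u_nom = 0.1  # arbitrary nominal input standing in for the MPC solution
        u = tube_control(u_nom, x_nom, x_act, gain)
        w = rng.uniform(-w_max, w_max)
        x_act = a * x_act + b * u + w
        x_nom = a * x_nom + b * u_nom
        worst = max(worst, abs(x_act - x_nom))
    return worst


radius = invariant_radius(1.0, 0.5, -1.0, 0.1)
observed = worst_deviation(1.0, 0.5, -1.0, 0.1, steps=500)
```

Because the deviation obeys its own stable dynamics driven only by the bounded disturbance, `observed` never exceeds `radius`, regardless of what the nominal trajectory does.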

III-D Delay-aware robust tube MPC

The above formulation assumes zero delay, meaning that the computed command is delivered to the system at the same instant the current state is observed. In practice, there is a computation time and an actuator command processing delay after the calculated command is delivered, which together make up the total delay time. Hence, if the current state is observed at a given time, the computed command influences the actuator state only after the total delay has elapsed. If the total delay is large, the robust tube assumptions no longer hold and the system may become unstable. To tackle this problem, [15] proposes a bi-level control scheme to deal with time delay and proves robustness using tube MPC. A buffer block of commands is used for communication between the higher- and lower-level units, as depicted in Figure 1. At the current time, the set of possible states one horizon length ahead is predicted. The high-level tube MPC updates the buffer with nominal states and control commands for the horizon that follows. If the higher-level MPC requires a delay time that is less than the horizon length, the system waits for the remaining time for time alignment. However, we believe this is suited only to systems in which the objective is nearly constant. For dynamically changing objectives and state constraints, the reference path must be updated more frequently, since waiting for the full horizon path to be followed may lead to inconsistencies. In autonomous driving, for dynamic scenarios in which the reference path has to be updated frequently, it is not feasible to wait a full horizon length for a new path. We propose instead to obtain a local probabilistic upper-bound estimate of the computation time and to update the buffer starting from this estimated delay rather than from the full horizon length, as shown in Fig. 1. This increases the plan update rate of the higher-level MPC and also makes the controller robust to changing computation times.
For estimation of the local upper bound, we use an adaptive Kalman filter, as further described in Section III-E. Adding the extra delay due to actuator command processing, we obtain a new local upper-bound estimate of the total delay time, which we use to find the initial state estimate, assuming no disturbance, after that delay given the current state. It can be calculated by piece-wise integration of the system dynamics using the control commands from the buffer. Hence, a command issued at the current time will be executed after the estimated delay, and we take the predicted state at that instant as the initial state for optimization, as depicted in Fig. 1. Mathematically, the updated objective function is described in Eq. 3, which is stated in terms of the actual state after the estimated delay. The calculated nominal discrete states and controls are used to fill the buffer B over one horizon starting from that instant, as shown in Figure 1. The pre-compensator unit is a low-level process which executes commands in the buffer at a higher frequency than the high-level MPC.
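The initial-state shift by piece-wise integration can be sketched as follows. The buffer layout (one command per fixed slot), the double-integrator stand-in for the vehicle model, and the function names are assumptions for illustration only.

```python
def shift_initial_state(state, buffer, delay, buffer_dt, step_fn):
    """Predict the state `delay` seconds ahead using buffered commands.

    Each buffered command is held for `buffer_dt` seconds; the last slot used
    is integrated only for the remaining fraction of the delay.
    """
    remaining = delay
    for command in buffer:
        if remaining <= 1e-12:
            break
        h = min(buffer_dt, remaining)
        state = step_fn(state, command, h)
        remaining -= h
    if remaining > 1e-12:
        raise ValueError("buffer is shorter than the estimated delay")
    return state


def double_integrator(state, accel, h):
    """Stand-in dynamics: position and velocity under constant acceleration."""
    pos, vel = state
    return (pos + vel * h + 0.5 * accel * h * h, vel + accel * h)


# Estimated total delay of 0.15 s with commands buffered every 0.1 s.
predicted = shift_initial_state((0.0, 1.0), [1.0, 0.0, 0.0], 0.15, 0.1,
                                double_integrator)
```

The optimizer would then start from `predicted` instead of the measured state, so the plan it returns is valid at the moment its first command actually reaches the actuator.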

Fig. 1: Dual cycle control scheme for tube MPC with delay
(3)

III-D1 Control Constraints

Limits on acceleration and steering are formulated as control constraints for the optimization problem.

III-D2 State Constraints

For the state variables with physical limits, such as speed and steering angle, we set upper and lower bounds on their range. For the position state variables, however, free space is non-convex, so it is computationally expensive to pose them in non-convex form in the optimization problem. Hence we use the IRIS algorithm, which we have used in previous work [7], to derive a set of convex constraints that can be used for efficient optimization of the path tracking problem while also ensuring safety through collision avoidance. IRIS optimizes the objective of finding linear constraints for each obstacle such that the resultant convex space fits the largest possible ellipsoid. We set the seed for IRIS as the predicted position (without uncertainty) of the vehicle after the estimated delay to get the resultant convex space.

III-D3 Disturbance-invariant set

The disturbance-invariant set can be over-approximated using [13]. However, in the presence of delay, would be different for the next control cycle, hence for robustness, must be sufficient to be covered by .

Theorem 1.

Given the optimization in Eq. 3, the disturbance-invariant set is calculated as , i.e., a union set with all possible values of initial heading angle, speed and steering angle within the admissible range, which determines the Jacobian matrix . The invariant set guarantees robust initialization of the optimization problem.

Proof.

is the expected state assuming no disturbance after for the current time , and the application of with feedback. is the actual state at . We have , where , From Eq. 3, we have . Hence, in order to guarantee robustness, , we need to ensure has a valid solution, i.e., . As and , we can establish

(4)

Thus, . is the Jacobian matrix, which depends on , , and of . Let’s define set , which consists of all possible matrices.

(5)

Thus, . is chosen as , which concludes the proof. ∎

III-E Estimating computation time

We propose INFLUENCE for estimating a local upper bound on computation time. The conventional Kalman filter faces a crucial challenge when the dynamic model and noise covariance are unknown. Existing adaptive filters that address this challenge assume either an unknown dynamic model with known covariance [9] or unknown covariance with a known dynamic model [5, 11]. INFLUENCE assumes that the parameters of the process model, the process noise variance, and the measurement noise variance are all unknown. Since measuring time is direct but noisy, it is reasonable to assume the measurement matrix to be the identity matrix. If this assumption does not hold in other applications, the measurement matrix can be identified with an approach similar to the process model identification in the INFLUENCE algorithm. We assume both noise distributions to be Gaussian, independent and mutually uncorrelated throughout. To make the optimization tractable, INFLUENCE alternates between two steps: it first fixes the process model parameters to identify the two noise variances (see Eq. 6c to Eq. 6k), and then fixes the noise variances to update the process model parameters (see Eq. 6l to Eq. 6m). To identify the process and measurement noise variances, an exponential moving average is maintained to estimate the prediction error and the measurement error, respectively. Two averaging parameters determine the influence of older values on the moving averages: the greater their values, the greater the weighting of older values. This yields the incremental variance updates in Eq. 6d and Eq. 6j. These update rules have been adapted from [5, 11], while the update rules for the state transition parameters have been adapted from [9]. The process model update additionally uses a learning gain and a forgetting factor, together with a buffer of recent values whose size is a design parameter of the process model identification. The initialization of INFLUENCE is in Eq. 6a.

(6a)
(6b)
(6c)
(6d)
(6e)
(6f)
(6g)
(6h)
(6i)
(6j)
(6k)
(6l)
(6m)
(6n)

where is the observed computation time at step , and

is the first element of the vector at step

.

For the local upper bound estimate, we use the predicted value and variance to get an upper-bound estimate on computation time, i.e., the predicted value plus a multiple of the predicted standard deviation. We choose this multiplier to get sufficiently high confidence in the upper bound, assuming a Gaussian distribution.
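The sketch below illustrates only the last step, turning a predicted mean and variance into a one-sided Gaussian upper bound. The estimator is a plain exponentially weighted mean and variance used as a simplified stand-in; it is not the INFLUENCE update of Eq. 6, and the class name, `alpha`, and `confidence` are assumptions for this example.

```python
from statistics import NormalDist


class EmaTimeEstimator:
    """Exponentially weighted mean/variance of observed computation times.

    A simplified stand-in for illustration, not the INFLUENCE filter.
    """

    def __init__(self, alpha, initial_mean, initial_var):
        self.alpha = alpha          # weight of the newest sample
        self.mean = initial_mean
        self.var = initial_var

    def update(self, measured):
        """Incremental update of the weighted mean and variance."""
        err = measured - self.mean
        self.mean += self.alpha * err
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * err * err)

    def upper_bound(self, confidence):
        """Mean plus z standard deviations for a one-sided Gaussian bound."""
        z = NormalDist().inv_cdf(confidence)
        return self.mean + z * self.var ** 0.5


estimator = EmaTimeEstimator(alpha=0.1, initial_mean=0.05, initial_var=1e-4)
bound_before = estimator.upper_bound(0.975)
for observed_time in [0.06, 0.07, 0.09, 0.08]:
    estimator.update(observed_time)
bound_after = estimator.upper_bound(0.975)
```

A higher `confidence` gives a more conservative bound; the trade-off is the same one the paper describes between safety and responsiveness.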

III-F Controller Plan A

We present a standalone controller (called plan A) as depicted in Figure 2. We compensate for the actuator's steering dynamics by modelling them as a first-order ODE. To compensate for the computation and actuator command processing delays, we shift the initial state by the estimated local upper bound on the net delay time. The optimization problem updates the robust tube buffer, over the horizon that starts after this estimated delay, with the nominal commands and states. The pre-compensator unit runs as a low-level process to refine the control at a higher frequency (see Figure 2).

Fig. 2: Controller Plan A for robust tube MPC. If the true delay is smaller than the estimated delay, we need to wait for time alignment.

III-G Controller Plan B

As an alternative plan, we compensate for the actuator and computation delays of a nominal controller (see Figure 3). The nominal controller can be a black box, for example an LE controller trained in a simplified simulation environment without the practical delays. For the computation time and actuator processing delay compensation, we use the same design as plan A by shifting the initial state. For actuator dynamic delay compensation, we use a separate unit which takes the commands from the nominal controller as a reference to track. The computation time and actuator processing delays are estimated as in Plan A to shift the initial state of the LE controller, and rollouts are conducted from this shifted state to obtain a sequence of commands. The refined command sequence, which compensates for the actuator dynamic delay, is obtained by solving a quadratic optimization problem (Eq. 7) that involves the current value of the steering angle, the unit step response of the steering actuator at each time step, and two positive semidefinite weight matrices. The optimization tracks the desired actuator commands from the LE controller as closely as possible while minimizing the control effort. Though collision avoidance is not considered in the experiments for Plan B, safety constraints such as control barrier functions [1] can readily be added to the optimization.

Fig. 3: Controller Plan B for black box controller.
(7)

where .
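A rough sketch of this kind of tracking problem is given below as an unconstrained least-squares fit. It is not Eq. 7: the first-order actuator model, the scalar weights in place of weight matrices, the absence of control limits, and the names `compensate_steering`, `track_weight`, and `effort_weight` are all assumptions made for illustration.

```python
import numpy as np


def compensate_steering(reference, steer_now, k_steer, dt,
                        track_weight=1.0, effort_weight=0.0):
    """Find steering commands whose lagged response tracks `reference`.

    Assumes steer[k+1] = beta * steer[k] + (1 - beta) * u[k] with
    beta = exp(-k_steer * dt), i.e. a discretized first-order actuator.
    """
    reference = np.asarray(reference, dtype=float)
    n = len(reference)
    beta = np.exp(-k_steer * dt)
    # Response matrix: effect of command j on the steering angle at step i+1.
    response = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            response[i, j] = (1.0 - beta) * beta ** (i - j)
    # Free response of the actuator from its current angle.
    free = beta ** np.arange(1, n + 1) * steer_now
    # Stack tracking and control-effort terms into one least-squares problem.
    lhs = np.vstack([np.sqrt(track_weight) * response,
                     np.sqrt(effort_weight) * np.eye(n)])
    rhs = np.concatenate([np.sqrt(track_weight) * (reference - free),
                          np.zeros(n)])
    commands, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    achieved = response @ commands + free
    return commands, achieved


desired = [0.1, 0.2, 0.2, 0.1]
commands, achieved = compensate_steering(desired, steer_now=0.0,
                                         k_steer=5.0, dt=0.1)
```

With zero effort weight the commands overshoot the desired angles so that the lagged actuator lands on them; raising `effort_weight` trades tracking accuracy for smaller commands.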

IV Experimental Results

We conduct the experiments in the Gazebo simulator with a Prius vehicle model. To obtain the time constant of the steering actuator, we test its unit response, i.e., we set the actuator command to 1 and record the steering angle values over a time window long enough for the steering angle to converge to its maximum value. We then fit the observed response values with the first-order ODE described in Section III-B and determine its time-constant parameter. As shown in Figure 4, the fitted first-order model approximates the actuator dynamics well for the Prius model in Gazebo.
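One simple way to carry out such a fit is sketched below: for a unit step, a first-order lag responds as 1 - exp(-K t), so -log(1 - y) is linear in t and K follows from a least-squares line through the origin. The fitting method, the function name, and the synthetic data are assumptions; the paper does not state how its fit was performed.

```python
import math


def fit_first_order_gain(times, responses, final_value=1.0):
    """Estimate K in y(t) = final_value * (1 - exp(-K t)) from step data.

    Uses a least-squares line through the origin on -log(1 - y/final_value).
    Samples too close to the final value are skipped to avoid log(0).
    """
    numerator = 0.0
    denominator = 0.0
    for t, y in zip(times, responses):
        residual = 1.0 - y / final_value
        if t <= 0.0 or residual <= 1e-6:
            continue
        numerator += t * (-math.log(residual))
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("no usable samples")
    return numerator / denominator


# Synthetic step response generated with a known gain, for illustration only.
true_gain = 5.0
sample_times = [0.01 * i for i in range(1, 101)]
sample_response = [1.0 - math.exp(-true_gain * t) for t in sample_times]
fitted_gain = fit_first_order_gain(sample_times, sample_response)
```

With noisy recordings, a nonlinear least-squares fit directly on the response is usually more robust than the log transform, since the transform amplifies noise near the final value.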

Fig. 4: Steering angle response for Prius model in Gazebo

IV-A Static scenario

For testing controller plan A, we perform a static obstacle avoidance experiment. The planner used is hybrid A* [3]. We compare the paths followed by the controller with and without considering delays. When no compensation is considered, the delays cause the vehicle to overshoot during the first turn, and when it tries to get back to the reference, the vehicle collides with the static obstacle, as marked by pose B in Fig. 5. When delay compensation is considered, the resulting path is closer to the reference line and smoother, while safely avoiding collision. We also compare the results between 1) taking the delay time as an upper bound equal to the horizon length [15] and 2) finding the local upper bound estimate using INFLUENCE. The path followed using our proposed method is visibly smoother in Fig. 5. This is because, in the case of constant delay compensation, at pose A the state constraints generated from IRIS force the vehicle to deviate from the reference path; the vehicle thus overshoots by a significant amount, but is still able to get back to the reference safely. On the other hand, if we approximate the delay time using INFLUENCE and adjust the expected local upper bound value accordingly (see Figure 6), the controller responds faster. Hence, after passing pose A, the state constraints change and the controller reacts faster to get back to the reference path, giving less overshoot.

Fig. 5: Comparison on paths followed. Yellow region is the convex state constraint set from IRIS at pose A for Experiment IV-A. The red box is a static obstacle.
Fig. 6: Observed and estimated computation times using the INFLUENCE for Experiment IV-A.

IV-B Overtaking scenario

We further test controller plan A in an overtaking scenario, in which the lead vehicle brakes suddenly at point A. We use the Frenet planner [16] with the reference path as the lane center. Frenet frame-based planning has been successful in practice due to the significant advantage of its independence from complex road geometry. We perform the experiment with the same starting conditions and compare the results with (Figure 7(a)) and without (Figure 7(b)) delay compensation. The Frenet planner expects the lead vehicle to move at constant speed, but as its speed drops rapidly at point A, the reference path changes rapidly after point A. Point B is the closest position between the ego vehicle and lead vehicle in all the cases. If delay time is not considered, the ego vehicle hits the other vehicle slightly at point B. Also, if computation time is taken as a constant upper bound of 0.2 s, the slow reaction of the controller causes the ego vehicle to hit the lead vehicle at point B (Figure 7(c)), which indicates that taking the computation time as a constant upper bound is ineffective in rapidly changing environments.

(a) With variable local upper bound on computation delay.
(b) Without computation delay consideration.
(c) With constant upper bound on computation delay (0.2s).
Fig. 7: Comparison in Experiment IV-B. The red boxes denote the ego vehicle at A and B, while the blue boxes denote the opponent vehicle at the corresponding positions.

IV-C Closed track scenario

We test controller plan B for an LE controller trained in an ideal environment without delays. The LE lateral controller is a neural network trained on waypoint following, whose inputs are the relative position and heading with respect to a target waypoint. The output is the steering angle. The network architecture is a simple feed-forward neural network. For longitudinal control, simple PID control is used to track a constant speed throughout. When the LE controller is deployed in such a high-speed waypoint-following scenario in Gazebo, its performance degrades due to the practical delays: the LE controller loses control while turning (see Figure 8). With the proposed plan B, the vehicle retains control. The vehicle is operating at its friction limits, hence even a small error caused by delay leads to the vehicle losing control, even when the computation time is just 0.02 s on average.

Fig. 8: Comparison of high-speed tracking for Experiment IV-C

V Conclusion

We propose a unified framework for compensating the delays of computation, actuator command processing and actuator dynamics in autonomous driving systems. We propose the INFLUENCE algorithm to safely approximate the computation time. With the use of tube MPC, the vehicle safely tracks the planned trajectories in realistic scenarios tested in the high-fidelity Gazebo simulator. Lastly, we present a framework for compensating delays for a black-box controller trained in an ideal environment. The simulation results demonstrate safety and real-time performance of our proposed framework.

References

  • [1] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada (2019) Control barrier functions: theory and applications. In 2019 18th European Control Conference (ECC), pp. 3420–3431.
  • [2] P. Cortes, J. Rodriguez, C. Silva, and A. Flores (2011) Delay compensation in model predictive current control of a three-phase inverter. IEEE Transactions on Industrial Electronics 59, pp. 1323–1325.
  • [3] D. Dolgov, S. Thrun, M. Montemerlo, and J. Diebel (2008) Practical search techniques in path planning for autonomous driving. Ann Arbor 1001 (48105), pp. 18–80.
  • [4] Y. Gao, A. Gray, H. E. Tseng, and F. Borrelli (2014) A tube-based robust nonlinear predictive control approach to semiautonomous ground vehicles. Vehicle System Dynamics 52 (6), pp. 802–823.
  • [5] I. Hashlamon and K. Erbatur (2013) An improved real-time adaptive Kalman filter with recursive noise covariance updating rules. Turkish Journal of Electrical Engineering and Computer Sciences.
  • [6] N. E. Kahveci and P. A. Ioannou (2011) Automatic steering of vehicles subject to actuator saturation and delay. In 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 119–124.
  • [7] S. Khaitan, Q. Lin, and J. M. Dolan (2020) Safe planning and control under uncertainty for self-driving. arXiv preprint arXiv:2010.11063.
  • [8] Y. Liao and F. Liao (2018) Design of preview controller for linear continuous-time systems with input delay. International Journal of Control, Automation and Systems 16 (3), pp. 1080–1090.
  • [9] C. Liu and M. Tomizuka (2015) Safe exploration: addressing various uncertainty levels in human robot interactions. In 2015 American Control Conference (ACC), pp. 465–470.
  • [10] J. Liu, P. Jayakumar, J. L. Stein, and T. Ersal (2014) A multi-stage optimization formulation for MPC-based obstacle avoidance in autonomous vehicles using a LIDAR sensor. In Dynamic Systems and Control Conference, Volume 2: Ground and Space Vehicle Dynamics.
  • [11] K. Myers and B. Tapley (1976) Adaptive sequential estimation with unknown noise statistics. IEEE Transactions on Automatic Control 21 (4), pp. 520–523.
  • [12] A. Nahidi, A. Khajepour, A. Kasaeizadeh, S. Chen, and B. Litkouhi (2019) A study on actuator delay compensation using predictive control technique with experimental verification. Mechatronics 57, pp. 140–149.
  • [13] S. V. Raković, E. Kerrigan, K. Kouramas, and D. Q. Mayne (2004) Invariant approximations of robustly positively invariant sets for constrained linear discrete-time systems subject to bounded disturbances. University of Cambridge, Department of Engineering, Cambridge.
  • [14] R. S. Smith (2004) Robust model predictive control of constrained linear systems. In Proceedings of the 2004 American Control Conference, Vol. 1, pp. 245–250.
  • [15] Y. Su, K. K. Tan, and T. H. Lee (2013) Computation delay compensation for real time implementation of robust model predictive control. Journal of Process Control 23 (9), pp. 1342–1349.
  • [16] M. Werling, J. Ziegler, S. Kammel, and S. Thrun (2010) Optimal trajectory generation for dynamic street scenarios in a Frenet frame. In 2010 IEEE International Conference on Robotics and Automation.