Convex Optimization-based Policy Adaptation to Compensate for Distributional Shifts

04/05/2023
by Navid Hashemi, et al.

Many real-world systems involve physical components or operating environments with highly nonlinear and uncertain dynamics. A number of control algorithms can be used to design optimal controllers for such systems, assuming a reasonably high-fidelity model of the actual system. However, the assumptions made about the stochastic dynamics of the model when designing the optimal controller may no longer be valid when the system is deployed in the real world. The problem addressed by this paper is the following: suppose we obtain an optimal trajectory by solving a control problem in the training environment; how do we ensure that the real-world system trajectory tracks this optimal trajectory with minimal error in a deployment environment? In other words, we want to learn how to adapt an optimal trained policy to distribution shifts in the environment. Distribution shifts are problematic in safety-critical systems, where a trained policy may lead to unsafe outcomes during deployment. We show that this problem can be cast as a nonlinear optimization problem that can be solved using heuristic methods such as particle swarm optimization (PSO). However, if we instead consider a convex relaxation of this problem, we can learn policies that track the optimal trajectory with much lower error and faster computation times. We demonstrate the efficacy of our approach on tracking an optimal path using a Dubins car model, and on collision avoidance using both a linear and a nonlinear model for adaptive cruise control.
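
To make the problem setup in the abstract concrete, the following is a minimal sketch of how a distribution shift can pull a Dubins car away from a trajectory planned in the training environment. It is not the paper's method: the Euler discretization, the Gaussian disturbance on the turn rate, the constant control sequence, and all names (`dubins_step`, `rollout`, `tracking_error`) are assumptions chosen for illustration only.

```python
import math
import random


def dubins_step(state, turn_rate, speed=1.0, dt=0.1, noise_std=0.0, rng=None):
    """One Euler step of the Dubins car kinematics:
    x' = v*cos(theta), y' = v*sin(theta), theta' = u (+ disturbance)."""
    x, y, theta = state
    # Hypothetical noise model: additive Gaussian disturbance on the turn rate.
    w = rng.gauss(0.0, noise_std) if (rng is not None and noise_std > 0.0) else 0.0
    return (
        x + dt * speed * math.cos(theta),
        y + dt * speed * math.sin(theta),
        theta + dt * (turn_rate + w),
    )


def rollout(controls, noise_std=0.0, seed=0):
    """Apply an open-loop control sequence from the origin; return all states."""
    rng = random.Random(seed)
    state = (0.0, 0.0, 0.0)
    trajectory = [state]
    for u in controls:
        state = dubins_step(state, u, noise_std=noise_std, rng=rng)
        trajectory.append(state)
    return trajectory


def tracking_error(reference, actual):
    """Mean Euclidean position error between two equal-length trajectories."""
    total = sum(
        math.hypot(r[0] - a[0], r[1] - a[1]) for r, a in zip(reference, actual)
    )
    return total / len(reference)


# Placeholder control sequence (assumed, not taken from the paper).
controls = [0.5] * 50

# "Training" environment: no disturbance. "Deployment": shifted noise level.
nominal = rollout(controls, noise_std=0.0)
shifted = rollout(controls, noise_std=1.0, seed=1)

print("error in training environment:", tracking_error(nominal, nominal))
print("error under distribution shift:", tracking_error(nominal, shifted))
```

Replaying the trained controls unchanged under the shifted noise level produces a nonzero tracking error; the policy adaptation described in the abstract aims to reduce that gap.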

research
06/05/2021

Trajectory Optimization of Chance-Constrained Nonlinear Stochastic Systems for Motion Planning and Control

We present gPC-SCP: Generalized Polynomial Chaos-based Sequential Convex...
research
03/07/2021

Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems

Real-time adaptation is imperative to the control of robots operating in...
research
12/08/2021

COSMIC: fast closed-form identification from large-scale data for LTV systems

We introduce a closed-form method for identification of discrete-time li...
research
03/13/2023

Time-Optimal Path Tracking for Cooperative Manipulators: A Convex Optimization Approach

This paper studies the time-optimal path tracking problem for a team of ...
research
04/14/2022

Control-oriented meta-learning

Real-time adaptation is imperative to the control of robots operating in...
research
12/14/2020

Learning how to approve updates to machine learning algorithms in non-stationary settings

Machine learning algorithms in healthcare have the potential to continua...
research
10/21/2022

Validation of Composite Systems by Discrepancy Propagation

Assessing the validity of a real-world system with respect to given qual...
