Guarantees on Robot System Performance Using Stochastic Simulation Rollouts

09/19/2023
by Joseph A. Vincent, et al.

We provide finite-sample performance guarantees for control policies executed on stochastic robotic systems. Given an open- or closed-loop policy and a finite set of trajectory rollouts under the policy, we bound the expected value, value-at-risk, and conditional value-at-risk of the trajectory cost, as well as the probability of failure in a sparse-reward setting. The bounds hold, with user-specified probability, for any policy synthesis technique and can be seen as a post-design safety certification. Generating the bounds requires only sampled simulation rollouts, with no assumptions on the distribution or complexity of the underlying stochastic system. We adapt these bounds to give a constraint satisfaction test that verifies safety of the robot system. Furthermore, we extend our method to the case of selecting the best policy from a set of candidates, which requires a multi-hypothesis correction. We show the statistical validity of our bounds in the Ant, Half-cheetah, and Swimmer MuJoCo environments and demonstrate our constraint satisfaction test with the Ant. Finally, using the 20-degree-of-freedom MuJoCo Shadow Hand, we show the necessity of the multi-hypothesis correction.
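To make the idea of a distribution-free, finite-sample bound concrete, here is a minimal Python sketch of two textbook constructions: an order-statistic upper bound on the value-at-risk and a Clopper-Pearson-style upper bound on the failure probability. These are standard results for i.i.d. samples and are not necessarily the constructions used in the paper. The function names, the confidence parameter `delta`, and the quantile level `tau` are illustrative assumptions.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(Binomial(n, p) <= k) for 0 < p < 1, summed in log space for stability."""
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1.0 - p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def var_upper_bound(costs, tau: float, delta: float) -> float:
    """Order-statistic upper bound on the tau-quantile (VaR) of the cost.

    Returns the k-th smallest sampled cost, where k is the smallest index
    with P(Binomial(n, tau) <= k - 1) >= 1 - delta. For i.i.d. rollouts the
    returned value is >= VaR_tau with probability at least 1 - delta.
    """
    n = len(costs)
    ordered = sorted(costs)
    for k in range(1, n + 1):
        if binom_cdf(k - 1, n, tau) >= 1.0 - delta:
            return ordered[k - 1]
    raise ValueError("too few rollouts for this tau and delta")


def failure_prob_upper_bound(failures: int, n: int, delta: float) -> float:
    """Clopper-Pearson-style upper bound on the failure probability.

    Bisects for the p at which P(Binomial(n, p) <= failures) = delta; the
    true failure probability is below the result with probability >= 1 - delta.
    """
    if failures >= n:
        return 1.0
    lo, hi = failures / n, 1.0
    for _ in range(60):  # the binomial CDF is decreasing in p
        mid = 0.5 * (lo + hi)
        if binom_cdf(failures, n, mid) > delta:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, `failure_prob_upper_bound(0, 100, 0.05)` gives the bound implied by observing no failures in 100 rollouts. When the best of `m` candidate policies is selected from the same kind of data, one simple and conservative correction (assumed here, not taken from the abstract) is Bonferroni: call each bound with `delta / m` in place of `delta`.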


research
03/15/2022

Modern Lower Bound Techniques in Database Theory and Constraint Satisfaction

Conditional lower bounds based on P ≠ NP, the Exponential-Time Hypothesis...
research
12/14/2022

Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning

Learning a risk-aware policy is essential but rather challenging in unst...
research
12/13/2022

Statistical Safety and Robustness Guarantees for Feedback Motion Planning of Unknown Underactuated Stochastic Systems

We present a method for providing statistical guarantees on runtime safe...
research
10/14/2022

Probably Approximately Correct Nonlinear Model Predictive Control (PAC-NMPC)

Approaches for stochastic nonlinear model predictive control (SNMPC) typ...
research
04/21/2022

Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

The dramatic increase of autonomous systems subject to variable environm...
research
03/18/2022

Sampling Complexity of Path Integral Methods for Trajectory Optimization

The use of random sampling in decision-making and control has become pop...
