FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis

by Aman Sinha, et al.

Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorithmic contributions to both challenges. First, to generate a realistic, diverse set of opponents, we develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo. Second, we propose a distributionally robust bandit optimization procedure that adaptively adjusts risk aversion relative to uncertainty in beliefs about opponents' behaviors. We rigorously quantify the tradeoffs in performance and robustness when approximating these computations in real-time motion-planning, and we demonstrate our methods experimentally on autonomous vehicles that achieve scaled speeds comparable to Formula One racecars.
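
The abstract names replica-exchange Markov chain Monte Carlo without describing how the authors apply it to self-play. As a generic illustration of the sampling technique only (not the paper's opponent-synthesis method), the following minimal sketch runs replica exchange on a toy bimodal 1-D density. The target `log_target`, the temperature ladder `betas`, and the function name `replica_exchange` are all assumptions made for this example.

```python
import math
import random


def log_target(x):
    """Toy bimodal log-density (assumed for illustration): modes at -3 and +3."""
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    # Numerically stable log-sum-exp of the two Gaussian components.
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def replica_exchange(n_steps, betas, step=1.0, seed=0):
    """Generic replica-exchange (parallel tempering) sampler.

    betas[0] should be 1.0 (the target chain); smaller betas flatten the
    density so those replicas cross between modes more easily.
    Returns the states visited by the beta = betas[0] replica.
    """
    rng = random.Random(seed)
    xs = [0.0] * len(betas)
    samples = []
    for _ in range(n_steps):
        # Random-walk Metropolis update inside each tempered replica.
        for i, beta in enumerate(betas):
            proposal = xs[i] + rng.gauss(0.0, step)
            log_accept = beta * (log_target(proposal) - log_target(xs[i]))
            # 1 - random() lies in (0, 1], so the log is always defined.
            if math.log(1.0 - rng.random()) < log_accept:
                xs[i] = proposal
        # Propose swapping the states of one random adjacent pair of replicas.
        i = rng.randrange(len(betas) - 1)
        log_swap = (betas[i] - betas[i + 1]) * (
            log_target(xs[i + 1]) - log_target(xs[i])
        )
        if math.log(1.0 - rng.random()) < log_swap:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        samples.append(xs[0])
    return samples


if __name__ == "__main__":
    draws = replica_exchange(20000, betas=[1.0, 0.5, 0.25, 0.1])
    right = sum(1 for x in draws if x > 0) / len(draws)
    print(f"fraction of draws in the right-hand mode: {right:.2f}")
```

The swap step is what distinguishes this from running independent chains: states found by the flatter, high-temperature replicas can migrate down to the target chain, which helps it cover both modes rather than settling in one.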




