Multifidelity Reinforcement Learning with Control Variates

06/10/2022
by Sami Khairy, et al.

In many computational science and engineering applications, the output of a system of interest corresponding to a given input can be queried at different levels of fidelity with different costs. Typically, low-fidelity data is cheap and abundant, while high-fidelity data is expensive and scarce. In this work we study the reinforcement learning (RL) problem in the presence of multiple environments with different levels of fidelity for a given control task. We focus on improving the RL agent's performance with multifidelity data. Specifically, a multifidelity estimator that exploits the cross-correlations between the low- and high-fidelity returns is proposed to reduce the variance in the estimation of the state-action value function. The proposed estimator, which is based on the method of control variates, is used to design a multifidelity Monte Carlo RL (MFMCRL) algorithm that improves the learning of the agent in the high-fidelity environment. The impacts of variance reduction on policy evaluation and policy improvement are theoretically analyzed by using probability bounds. Our theoretical analysis and numerical experiments demonstrate that for a finite budget of high-fidelity data samples, our proposed MFMCRL agent attains superior performance compared with that of a standard RL agent that uses only the high-fidelity environment data for learning the optimal policy.
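The control-variate idea behind the estimator can be illustrated with a minimal Python sketch. This is not the paper's MFMCRL algorithm, only a generic control-variate estimate for a single state-action pair. The function name `control_variate_estimate` and its arguments are hypothetical, and the sketch assumes that each high-fidelity return has a paired low-fidelity return (for example, generated from the same random seed) and that a larger pool of cheap low-fidelity returns is available to estimate the low-fidelity mean.

```python
import numpy as np


def control_variate_estimate(hi_returns, lo_returns_paired, lo_returns_all):
    """Control-variate estimate of the mean high-fidelity return.

    hi_returns:        scarce high-fidelity returns (length n >= 2)
    lo_returns_paired: low-fidelity returns paired with hi_returns (length n)
    lo_returns_all:    larger pool of cheap low-fidelity returns
    All three names are illustrative assumptions, not taken from the paper.
    """
    hi = np.asarray(hi_returns, dtype=float)
    lo = np.asarray(lo_returns_paired, dtype=float)
    lo_all = np.asarray(lo_returns_all, dtype=float)

    if hi.ndim != 1 or hi.shape != lo.shape or hi.size < 2:
        raise ValueError("need at least two paired 1-D samples of equal length")
    if lo_all.size == 0:
        raise ValueError("need at least one low-fidelity sample in the pool")

    var_lo = np.var(lo, ddof=1)
    if var_lo == 0.0:
        # The low-fidelity returns carry no information; fall back to
        # the plain high-fidelity sample mean.
        return float(hi.mean())

    # Sample covariance between paired high- and low-fidelity returns.
    cov_hi_lo = np.cov(hi, lo, ddof=1)[0, 1]
    # Coefficient that minimizes the variance of the corrected estimate.
    coef = cov_hi_lo / var_lo

    # Correct the high-fidelity mean by the discrepancy between the
    # well-estimated low-fidelity mean and the paired low-fidelity mean.
    return float(hi.mean() + coef * (lo_all.mean() - lo.mean()))
```

As a sanity check, when the high-fidelity return is an exact linear function of the low-fidelity return, the correction recovers the value implied by the larger low-fidelity pool: `control_variate_estimate([3, 5, 7], [1, 2, 3], [1, 2, 3, 4, 5])` gives 7.0, whereas the plain high-fidelity mean is 5.0. The stronger the correlation between the two fidelities, the larger the variance reduction; with uncorrelated returns the coefficient is near zero and the estimate stays close to the ordinary Monte Carlo mean.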


research
04/19/2022

Multifidelity Deep Operator Networks

Operator learning for complex nonlinear operators is increasingly common...
research
10/11/2018

One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL

Humans are experts at high-fidelity imitation -- closely mimicking a dem...
research
01/24/2022

Structural Properties of Optimal Fidelity Selection Policies for Human-in-the-loop Queues

We study optimal fidelity selection for a human operator servicing a que...
research
12/18/2017

Multi-Fidelity Reinforcement Learning with Gaussian Processes

This paper studies the problem of Reinforcement Learning (RL) using as f...
research
10/21/2016

Minimax Error of Interpolation and Optimal Design of Experiments for Variable Fidelity Data

Engineering problems often involve data sources of variable fidelity wit...
research
06/19/2021

Learning to Reach, Swim, Walk and Fly in One Trial: Data-Driven Control with Scarce Data and Side Information

We develop a learning-based control algorithm for unknown dynamical syst...
research
02/09/2023

High-fidelity Interpretable Inverse Rig: An Accurate and Sparse Solution Optimizing the Quartic Blendshape Model

We propose a method to fit arbitrarily accurate blendshape rig models by...
