Revisiting Bellman Errors for Offline Model Selection

01/31/2023
by Joshua P. Zitovsky, et al.

Offline model selection (OMS), that is, choosing the best policy from a set of many policies given only logged data, is crucial for applying offline RL in real-world settings. One idea that has been extensively explored is to select policies based on the mean squared Bellman error (MSBE) of the associated Q-functions. However, previous work has struggled to obtain adequate OMS performance with Bellman errors, leading many researchers to abandon the idea. Through theoretical and empirical analyses, we elucidate why previous work has seen pessimistic results with Bellman errors and identify conditions under which OMS algorithms based on Bellman errors will perform well. Moreover, we develop a new estimator of the MSBE that is more accurate than prior methods and obtains impressive OMS performance on diverse discrete control tasks, including Atari games. We open-source our data and code to enable researchers to conduct OMS experiments more easily.
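To make the idea of selecting policies by Bellman error concrete, here is a minimal sketch of the naive approach: score each candidate Q-function by its mean squared temporal-difference error on the logged transitions and keep the lowest-scoring one. This is not the estimator the paper proposes. The function names `empirical_msbe` and `select_by_msbe`, the tabular Q representation, and the toy one-step dataset are all assumptions made for illustration. The squared TD error matches the squared Bellman error only when transitions are deterministic; in stochastic environments it also includes the conditional variance of the backup target and so is a biased estimate of the MSBE.

```python
import numpy as np

def empirical_msbe(q, transitions, gamma=0.99):
    """Mean squared TD error of a tabular Q-function on logged data.

    q: array of shape (n_states, n_actions) (hypothetical tabular candidate)
    transitions: sequence of (state, action, reward, next_state, done)
    """
    s, a, r, s2, done = (np.asarray(x) for x in zip(*transitions))
    not_done = 1.0 - done.astype(float)
    # Bellman optimality backup: r + gamma * max_a' Q(s', a'), zero after terminal steps.
    target = r + gamma * not_done * q[s2].max(axis=1)
    return float(np.mean((q[s, a] - target) ** 2))

def select_by_msbe(candidates, transitions, gamma=0.99):
    """Return the name of the candidate with the smallest score, plus all scores."""
    scores = {name: empirical_msbe(q, transitions, gamma)
              for name, q in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Toy logged data: one state, two actions, every episode ends after one step.
logged = [
    (0, 0, 1.0, 0, True),   # action 0 yields reward 1
    (0, 1, 0.0, 0, True),   # action 1 yields reward 0
]
candidates = {
    "good": np.array([[1.0, 0.0]]),  # matches the true action values
    "bad": np.array([[0.0, 1.0]]),   # swaps them
}
best, scores = select_by_msbe(candidates, logged)
print(best, scores)
```

On this deterministic toy problem the candidate whose values match the observed rewards scores zero and is selected. The greedy policy of the selected Q-function is then the policy that model selection returns.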

research
07/17/2020

Hyperparameter Selection for Offline Reinforcement Learning

Offline reinforcement learning (RL purely from logged data) is an import...
research
07/23/2021

Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings

Reinforcement learning (RL) can be used to learn treatment policies and ...
research
10/16/2022

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Offline reinforcement learning (RL) can be used to improve future perfor...
research
11/03/2022

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

In offline reinforcement learning (RL), a learner leverages prior logged...
research
06/01/2023

Improving Offline RL by Blending Heuristics

We propose Heuristic Blending (HUBL), a simple performance-improving tec...
research
11/01/2021

Validate on Sim, Detect on Real – Model Selection for Domain Randomization

A practical approach to learning robot skills, often termed sim2real, is...
