Measuring Data Quality for Dataset Selection in Offline Reinforcement Learning

11/26/2021
by Phillip Swazinna, et al.

Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance the algorithms can deliver depends greatly on the dataset presented to them, practitioners need to pick the right dataset among the available ones. This problem has so far not been discussed in the corresponding literature. We discuss ideas for how to select promising datasets and propose three very simple indicators: estimated relative return improvement (ERI), estimated action stochasticity (EAS), and a combination of the two (COI). We show empirically that, despite their simplicity, these indicators can be used very effectively for dataset selection.
