
Measuring Data Quality for Dataset Selection in Offline Reinforcement Learning

11/26/2021 ∙ by Phillip Swazinna, et al.

Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance the algorithms are able to deliver depends greatly on the dataset presented to them, practitioners need to pick the right dataset among the available ones. This problem has so far not been discussed in the corresponding literature. We discuss ideas on how to select promising datasets and propose three very simple indicators: estimated relative return improvement (ERI), estimated action stochasticity (EAS), and a combination of the two (COI). We show empirically that, despite their simplicity, these indicators can be used very effectively for dataset selection.
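The abstract names the three indicators but does not define them, so the following is only a minimal sketch of how indicator-based dataset ranking could look. Every formula here is an illustrative assumption, not the paper's definition: ERI is taken as the gap between the best and the average episode return relative to the average, EAS as the mean absolute deviation of logged actions from a deterministic behavior-model prediction, and COI as their product. The function names, the dataset layout, and the convention that a higher score is more promising are likewise hypothetical.

```python
from statistics import mean


def relative_return_improvement(episode_returns):
    """ERI stand-in (assumed form): (best - average) / |average| over episode returns."""
    avg = mean(episode_returns)
    best = max(episode_returns)
    # Guard against division by zero when the average return is 0.
    return (best - avg) / max(abs(avg), 1e-8)


def action_stochasticity(actions, predicted_actions):
    """EAS stand-in (assumed form): mean absolute deviation of logged actions
    from the prediction of a deterministic behavior model."""
    return mean(abs(a - p) for a, p in zip(actions, predicted_actions))


def combined_indicator(eri, eas):
    """COI stand-in (assumed form): simple product of the two indicators."""
    return eri * eas


def rank_datasets(datasets):
    """Return dataset names sorted from highest to lowest combined score."""
    scores = {}
    for name, data in datasets.items():
        eri = relative_return_improvement(data["returns"])
        eas = action_stochasticity(data["actions"], data["predicted"])
        scores[name] = combined_indicator(eri, eas)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: scalar actions and behavior-model predictions.
datasets = {
    "narrow": {
        "returns": [10.0, 10.0, 10.0],
        "actions": [0.5, 0.5],
        "predicted": [0.5, 0.5],
    },
    "diverse": {
        "returns": [5.0, 10.0, 15.0],
        "actions": [0.2, 0.8],
        "predicted": [0.5, 0.5],
    },
}

ranking = rank_datasets(datasets)
print(ranking)
```

In this toy setup the "narrow" dataset scores zero on both stand-in indicators, because all returns are identical and the actions match the behavior model exactly, so it ranks below the "diverse" one. Whether a product is a sensible way to combine the two quantities is a design choice made only for this sketch.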

Phillip Swazinna, Steffen Udluft, Thomas Runkler

