Data Valuation for Offline Reinforcement Learning

05/19/2022
by   Amir Abolfazli, et al.
0

The success of deep reinforcement learning (DRL) hinges on the availability of training data, which is typically obtained via a large number of environment interactions. In many real-world scenarios, costs and risks are associated with gathering these data. The field of offline reinforcement learning addresses these issues through outsourcing the collection of data to a domain expert or a carefully monitored program and subsequently searching for a batch-constrained optimal policy. With the emergence of data markets, an alternative to constructing a dataset in-house is to purchase external data. However, while state-of-the-art offline reinforcement learning approaches have shown a lot of promise, they currently rely on carefully constructed datasets that are well aligned with the intended target domains. This raises questions regarding the transferability and robustness of an offline reinforcement learning agent trained on externally acquired data. In this paper, we empirically evaluate the ability of the current state-of-the-art offline reinforcement learning approaches to coping with the source-target domain mismatch within two MuJoCo environments, finding that current state-of-the-art offline reinforcement learning algorithms underperform in the target domain. To address this, we propose data valuation for offline reinforcement learning (DVORL), which allows us to identify relevant and high-quality transitions, improving the performance and transferability of policies learned by offline reinforcement learning algorithms. The results show that our method outperforms offline reinforcement learning baselines on two MuJoCo environments.

READ FULL TEXT
research
06/09/2022

Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

Offline reinforcement learning has shown great promise in leveraging lar...
research
11/02/2021

Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics

Offline reinforcement learning leverages large datasets to train policie...
research
02/13/2022

Goal Recognition as Reinforcement Learning

Most approaches for goal recognition rely on specifications of the possi...
research
01/27/2023

Behaviour Discriminator: A Simple Data Filtering Method to Improve Offline Policy Learning

This paper studies the problem of learning a control policy without the ...
research
11/15/2021

Exploiting Action Impact Regularity and Partially Known Models for Offline Reinforcement Learning

Offline reinforcement learning-learning a policy from a batch of data-is...
research
11/10/2019

A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Reinforcement learning is effective in optimizing policies for recommend...
research
08/14/2021

Offline-Online Reinforcement Learning for Energy Pricing in Office Demand Response: Lowering Energy and Data Costs

Our team is proposing to run a full-scale energy demand response experim...

Please sign up or login with your details

Forgot password? Click here to reset