Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

07/10/2023
by   Ruiqi Zhang, et al.

In some applications of reinforcement learning, a dataset of pre-collected experience is already available, but it is also possible to acquire additional online data to improve the quality of the policy. However, it may be preferable to gather this additional data with a single, non-reactive exploration policy, thereby avoiding the engineering costs associated with switching policies. In this paper, we propose an algorithm with provable guarantees that leverages an offline dataset to design a single non-reactive exploration policy. We analyze the algorithm theoretically and measure the quality of the final policy as a function of the local coverage of the original dataset and the amount of additional data collected.
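To make the term "non-reactive" concrete, here is a minimal toy sketch in a tabular setting. It is not the algorithm proposed in the paper: the inverse-count weighting and the function names `offline_counts`, `fixed_exploration_policy`, and `collect` are assumptions introduced purely for illustration. The point is only that the exploration policy is computed once from the offline data and is never updated while the new data is being gathered.

```python
import random


def offline_counts(dataset, n_states, n_actions):
    """Count how often each (state, action) pair appears in the offline data."""
    counts = [[0] * n_actions for _ in range(n_states)]
    for state, action in dataset:
        counts[state][action] += 1
    return counts


def fixed_exploration_policy(counts):
    """Build one stochastic policy that favors poorly covered actions.

    Assumed heuristic (not from the paper): in each state, the probability
    of an action is proportional to 1 / (1 + offline visit count).
    """
    policy = []
    for row in counts:
        weights = [1.0 / (1 + c) for c in row]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return policy


def collect(policy, states, rng):
    """Sample one action per visited state; the policy is never updated."""
    n_actions = len(policy[0])
    return [
        (s, rng.choices(range(n_actions), weights=policy[s])[0])
        for s in states
    ]


# Toy offline dataset of (state, action) pairs: action 1 is rare in state 0
# and never taken in state 1.
dataset = [(0, 0), (0, 0), (0, 1), (1, 0)]
counts = offline_counts(dataset, n_states=2, n_actions=2)
policy = fixed_exploration_policy(counts)

# Gather additional data with the single fixed policy.
rng = random.Random(0)
new_data = collect(policy, states=[0, 1, 0, 1], rng=rng)
```

In this toy example the fixed policy places more probability on the actions that the offline data covers least, which loosely mirrors the idea that the final guarantee depends on how well the original dataset covers the relevant states and actions.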

research
07/12/2021

Behavior Constraining in Weight Space for Offline Reinforcement Learning

In offline reinforcement learning, a policy needs to be learned from a s...
research
07/21/2021

Design of Experiments for Stochastic Contextual Linear Bandits

In the stochastic linear contextual bandit setting there exist several m...
research
06/16/2020

Accelerating Online Reinforcement Learning with Offline Datasets

Reinforcement learning provides an appealing formalism for learning cont...
research
06/10/2023

HIPODE: Enhancing Offline Reinforcement Learning with High-Quality Synthetic Data from a Policy-Decoupled Approach

Offline reinforcement learning (ORL) has gained attention as a means of ...
research
11/07/2018

Policy Certificates: Towards Accountable Reinforcement Learning

The performance of a reinforcement learning algorithm can vary drastical...
research
07/12/2022

Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning

In lifelong learning, an agent learns throughout its entire life without...
research
01/30/2023

Designing an offline reinforcement learning objective from scratch

Offline reinforcement learning has developed rapidly over the recent yea...
