Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints

01/06/2021
by Tianhao Wang, et al.

We study reinforcement learning (RL) with linear function approximation under adaptivity constraints. We consider two popular limited-adaptivity models, the batch learning model and the rare policy switch model, and propose two efficient online RL algorithms for linear Markov decision processes. Specifically, for the batch learning model, our proposed LSVI-UCB-Batch algorithm achieves an Õ(√(d^3H^3T) + dHT/B) regret, where d is the dimension of the feature mapping, H is the episode length, T is the number of interactions, and B is the number of batches. Our result suggests that it suffices to use only √(T/(dH)) batches to obtain Õ(√(d^3H^3T)) regret. For the rare policy switch model, our proposed LSVI-UCB-RareSwitch algorithm enjoys an Õ(√(d^3H^3T[1+T/(dH)]^(dH/B))) regret, which implies that dH log T policy switches suffice to obtain the Õ(√(d^3H^3T)) regret. Our algorithms achieve the same regret as the LSVI-UCB algorithm (Jin et al., 2019), yet with a substantially smaller amount of adaptivity.
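
The two "suffices" claims in the abstract can be checked numerically. The sketch below is only an illustration, not the authors' code: it evaluates the leading terms of both bounds with all constants and logarithmic factors hidden by Õ dropped, and compares them with the baseline √(d^3H^3T). The function names, the sample values of d, H and T, the use of the natural logarithm, and the reading of B as the policy-switch budget in the second bound are assumptions made for this example.

```python
import math


def baseline(d, H, T):
    """Leading term sqrt(d^3 H^3 T); constants and log factors are dropped."""
    return math.sqrt(d**3 * H**3 * T)


def batch_bound(d, H, T, B):
    """Leading terms of the batch-model bound: sqrt(d^3 H^3 T) + d*H*T/B."""
    return baseline(d, H, T) + d * H * T / B


def rare_switch_bound(d, H, T, B):
    """Leading term of the rare-switch bound: sqrt(d^3 H^3 T [1 + T/(dH)]^(dH/B)).

    B is read here as the policy-switch budget (an assumption of this sketch).
    """
    inflation = (1 + T / (d * H)) ** (d * H / B)
    return math.sqrt(d**3 * H**3 * T * inflation)


# Hypothetical problem sizes, chosen only for illustration.
d, H, T = 10, 20, 1_000_000

# Batch model: B >= sqrt(T/(dH)) makes dHT/B <= sqrt(d^3 H^3 T),
# so the total is at most twice the baseline.
B_batch = math.ceil(math.sqrt(T / (d * H)))
batch_ratio = batch_bound(d, H, T, B_batch) / baseline(d, H, T)

# Rare-switch model: B >= dH log T makes [1 + T/(dH)]^(dH/B) <= e
# whenever 1 + T/(dH) <= T, so the total is at most sqrt(e) times the baseline.
B_switch = math.ceil(d * H * math.log(T))
switch_ratio = rare_switch_bound(d, H, T, B_switch) / baseline(d, H, T)

print(f"batches: {B_batch}, bound / baseline = {batch_ratio:.3f}")
print(f"switches: {B_switch}, bound / baseline = {switch_ratio:.3f}")
```

In both cases the ratio stays bounded by a constant, which is why the extra terms disappear inside the Õ notation once B reaches the stated threshold.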


research
06/26/2023

A General Framework for Sequential Decision-Making under Adaptivity Constraints

We take the first step in studying general sequential decision-making un...
research
12/12/2022

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

We study reinforcement learning (RL) with linear function approximation....
research
11/19/2020

Online Model Selection for Reinforcement Learning with Function Approximation

Deep reinforcement learning has achieved impressive successes yet often ...
research
11/23/2020

Logarithmic Regret for Reinforcement Learning with Linear Function Approximation

Reinforcement learning (RL) with linear function approximation has recei...
research
01/06/2023

Provable Reset-free Reinforcement Learning by No-Regret Reduction

Real-world reinforcement learning (RL) is often severely limited since t...
research
06/23/2022

Provably Efficient Model-Free Constrained RL with Linear Function Approximation

We study the constrained reinforcement learning problem, in which an age...
research
09/13/2020

Oracle-Efficient Reinforcement Learning in Factored MDPs with Unknown Structure

We consider provably-efficient reinforcement learning (RL) in non-episod...
