Provable General Function Class Representation Learning in Multitask Bandits and MDPs

05/31/2022
by Rui Lu, et al.

While multitask representation learning has become a popular approach in reinforcement learning (RL) for boosting sample efficiency, the theoretical understanding of why and how it works remains limited. Most previous analyses had to assume that the representation function is either already known to the agent or drawn from a linear function class, because analyzing general function class representations runs into non-trivial technical obstacles, such as establishing generalization guarantees and formulating confidence bounds in abstract function spaces. The linear-case analysis, however, relies heavily on the particular structure of the linear function class, whereas real-world practice usually adopts general non-linear representation functions such as neural networks; this significantly limits its applicability. In this work, we extend the analysis to general function class representations. Specifically, we consider an agent playing M contextual bandits (or MDPs) concurrently and extracting a shared representation function ϕ from a given function class Φ using our proposed Generalized Functional Upper Confidence Bound (GFUCB) algorithm. We theoretically validate, for the first time, the benefit of multitask representation learning with a general function class for bandits and linear MDPs. Finally, we conduct experiments demonstrating the effectiveness of our algorithm with neural network representations.
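To make the setting concrete, here is a minimal sketch of the optimistic principle the abstract describes, specialized to a *finite* function class so it runs without any learning machinery. This is not the paper's GFUCB algorithm (which handles abstract function classes and neural representations); it is an illustrative elimination-style variant under assumed names (`gfucb`, `make_f`, the toy class `Phi`) invented for this example. The key multitask ingredient is that data from all M bandits is pooled to shrink one shared confidence set over Φ.

```python
import random


def sq_loss(f, data):
    """Pooled squared prediction error of f over data from all tasks."""
    return sum((f(m, x, a) - r) ** 2 for (m, x, a, r) in data)


def gfucb(Phi, f_star, M, actions, contexts, T, beta=1.0, noise=0.1, seed=0):
    """Optimistic elimination over a finite shared function class Phi.

    Each f in Phi maps (task m, context x, action a) -> expected reward.
    All M tasks share one class, so every task's samples jointly shrink
    the confidence set -- the multitask benefit the paper analyzes.
    Returns cumulative regret over T rounds (tasks played round-robin).
    """
    rng = random.Random(seed)
    data, regret = [], 0.0
    for t in range(T):
        m = t % M                      # round-robin over the M bandits
        x = rng.randrange(contexts)    # observed context
        # Confidence set: functions whose pooled loss is within beta of the best.
        losses = [sq_loss(f, data) for f in Phi]
        lmin = min(losses)
        active = [f for f, l in zip(Phi, losses) if l <= lmin + beta]
        # Optimistic action: largest reward any plausible function promises.
        a = max(actions, key=lambda act: max(f(m, x, act) for f in active))
        r = f_star(m, x, a) + rng.gauss(0.0, noise)
        data.append((m, x, a, r))
        regret += max(f_star(m, x, b) for b in actions) - f_star(m, x, a)
    return regret


# Toy instance: two candidate representations phi, per-task weights in {0.2, 0.8}.
def make_f(phi, w):
    return lambda m, x, a: w[m] if phi(x) == a else 0.0

phis = [lambda x: x % 2, lambda x: (x + 1) % 2]
Phi = [make_f(phi, (w0, w1)) for phi in phis
       for w0 in (0.2, 0.8) for w1 in (0.2, 0.8)]
f_star = make_f(phis[0], (0.8, 0.8))  # true function lies in the class

cum_regret = gfucb(Phi, f_star, M=2, actions=(0, 1), contexts=4, T=100)
```

Because the two tasks share the representation ϕ, a sample from either task helps rule out every candidate built on the wrong ϕ, so the confidence set collapses faster than it would if each task learned in isolation.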

