Information-Theoretic Confidence Bounds for Reinforcement Learning

11/21/2019
by   Xiuyuan Lu, et al.

We integrate information-theoretic concepts into the design and analysis of optimistic algorithms and Thompson sampling. By making a connection between information-theoretic quantities and confidence bounds, we obtain results that relate the per-period performance of the agent to its information gain about the environment, thus explicitly characterizing the exploration-exploitation tradeoff. The resulting cumulative regret bound depends on the agent's uncertainty over the environment and quantifies the value of prior information. We demonstrate the applicability of this approach to several environments, including linear bandits, tabular MDPs, and factored MDPs. These examples demonstrate the potential of a general information-theoretic approach for the design and analysis of reinforcement learning algorithms.
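
For readers unfamiliar with Thompson sampling, the following is a minimal sketch of the idea on a Bernoulli bandit with Beta priors. This setting is an assumption chosen for brevity; it is not one of the environments analyzed in the paper, and the function name `thompson_sampling`, the arm means, and the horizon are all hypothetical. The sketch only illustrates how posterior sampling trades off exploration and exploitation and how cumulative regret is tallied; it does not reproduce the paper's information-theoretic bounds.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling (illustrative sketch only).

    Returns the cumulative expected regret and the per-arm pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Beta(1, 1) (uniform) prior on each arm's mean reward.
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    pulls = [0] * n_arms
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one plausible mean per arm from the current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate posterior update for the chosen arm.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        # Expected regret of this period relative to the best arm.
        regret += best_mean - true_means[arm]

    return regret, pulls


# Hypothetical three-armed problem; the last arm is optimal.
regret, pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

As the posterior over each arm concentrates, the sampled means for suboptimal arms rarely exceed that of the best arm, so exploration tapers off without an explicit schedule.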


research
07/18/2022

An Information-Theoretic Analysis of Bayesian Reinforcement Learning

Building on the framework introduced by Xu and Raginsky [1] for supervis...
research
06/09/2022

Regret Bounds for Information-Directed Reinforcement Learning

Information-directed sampling (IDS) has revealed its potential as a data...
research
06/21/2019

Robustness of Dynamical Quantities of Interest via Goal-Oriented Information Theory

Variational-principle-based methods that relate expectations of a quanti...
research
05/13/2021

Intelligence and Unambitiousness Using Algorithmic Information Theory

Algorithmic Information Theory has inspired intractable constructions of...
research
04/22/2020

An information-theoretic approach to the analysis of location and co-location patterns

We propose a statistical framework to quantify location and co-location ...
research
09/30/2021

Reinforcement Learning with Information-Theoretic Actuation

Reinforcement Learning formalises an embodied agent's interaction with t...
research
06/05/2023

Learning Embeddings for Sequential Tasks Using Population of Agents

We present an information-theoretic framework to learn fixed-dimensional...
