Adaptive Trade-Offs in Off-Policy Learning

10/16/2019
by Mark Rowland, et al.

A great variety of off-policy learning algorithms exists in the literature, and new breakthroughs in this area continue to be made, improving theoretical understanding and yielding state-of-the-art reinforcement learning algorithms. In this paper, we take a unifying view of this space of algorithms and consider how they trade off three fundamental quantities: update variance, fixed-point bias, and contraction rate. This leads to new perspectives on existing methods, and also naturally yields novel algorithms for off-policy evaluation and control. We develop one such algorithm, C-trace, demonstrating that it makes these trade-offs more efficiently than existing methods, and that it can be scaled to yield state-of-the-art performance in large-scale environments.
