Unifying task specification in reinforcement learning

09/07/2016
by Martha White, et al.

Reinforcement learning tasks are typically specified as Markov decision processes. This formalism has been highly successful, though specifications often couple the dynamics of the environment and the learning objective. This lack of modularity can complicate generalization of the task specification, as well as obfuscate connections between different task settings, such as episodic and continuing. In this work, we introduce the RL task formalism, which provides a unification through simple constructs, including a generalization to transition-based discounting. Through a series of examples, we demonstrate the generality and utility of this formalism. Finally, we extend standard learning constructs, including Bellman operators, and extend some seminal theoretical results, including approximation error bounds. Overall, we provide a well-understood and sound formalism on which to build theoretical results and simplify algorithm use and development.
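As a rough illustration of how transition-based discounting might unify episodic and continuing settings, the sketch below evaluates a fixed policy on a three-state cycle under two discount specifications. Everything in it is an assumption made for illustration and is not taken from the paper: the toy chain, the function name `evaluate`, the matrices `P`, `r`, `gamma_episodic`, and `gamma_continuing`, and the reading of an episode boundary as a transition whose discount is zero.

```python
def evaluate(P, r, gamma, sweeps=1000, tol=1e-12):
    """Iterative policy evaluation with a discount that depends on the transition.

    P[s][s2]     : probability of moving from s to s2 under the fixed policy
    r[s][s2]     : reward received on that transition
    gamma[s][s2] : discount applied to the value of s2 on that transition
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(sweeps):
        new_v = [
            sum(P[s][s2] * (r[s][s2] + gamma[s][s2] * v[s2]) for s2 in range(n))
            for s in range(n)
        ]
        # Stop once a full sweep no longer changes the estimate.
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical deterministic cycle 0 -> 1 -> 2 -> 0 with reward 1 on every step.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
r = [[1.0] * 3 for _ in range(3)]

# Assumed "episodic" specification: no discounting inside an episode, and a
# discount of 0 on the 2 -> 0 transition, which cuts the return at the restart.
gamma_episodic = [[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]]

# Assumed "continuing" specification: the same dynamics, constant discount 0.9.
gamma_continuing = [[0.9] * 3 for _ in range(3)]

v_episodic = evaluate(P, r, gamma_episodic)
v_continuing = evaluate(P, r, gamma_continuing)
```

The dynamics `P` and rewards `r` are shared between the two calls; only the discount changes, which is one way to picture the separation of environment dynamics from learning objective that the abstract describes. In this toy chain the episodic values count the steps remaining before the restart, while the continuing values approach 1 / (1 - 0.9) in every state.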

research
04/14/2017

Environment-Independent Task Specifications via GLTL

We propose a new task-specification language for Markov decision process...
research
05/24/2019

Rethinking Expected Cumulative Reward Formalism of Reinforcement Learning: A Micro-Objective Perspective

The standard reinforcement learning (RL) formulation considers the expec...
research
02/02/2021

Metrics and continuity in reinforcement learning

In most practical applications of reinforcement learning, it is untenabl...
research
11/15/2021

The Partially Observable History Process

We introduce the partially observable history process (POHP) formalism f...
research
09/05/2018

Reinforcement Learning under Threats

In several reinforcement learning (RL) scenarios, mainly in security set...
research
09/09/2023

Verifiable Reinforcement Learning Systems via Compositionality

We propose a framework for verifiable and compositional reinforcement le...
research
01/12/2012

Sparse Reward Processes

We introduce a class of learning problems where the agent is presented w...
