Reinforcement Learning

05/29/2020
by   Olivier Buffer, et al.
0

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles. In such problems, an agent faces a sequential decision-making problem where, at every time step, it observes its state, performs an action, receives a reward and moves to a new state. An RL agent learns by trial and error a good policy (or controller) based on observations and numeric reward feedback on the previously performed action. In this chapter, we present the basic framework of RL and recall the two main families of approaches that have been developed to learn a good policy. The first one, which is value-based, consists in estimating the value of an optimal policy, value from which a policy can be recovered, while the other, called policy search, directly works in a policy space. Actor-critic methods can be seen as a policy search technique where the policy value that is learned guides the policy improvement. Besides, we give an overview of some extensions of the standard RL framework, notably when risk-averse behavior needs to be taken into account or when rewards are not available or not known.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/19/2021

Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm

Learning optimal behavior from existing data is one of the most importan...
research
05/26/2020

Active Measure Reinforcement Learning for Observation Cost Minimization

Standard reinforcement learning (RL) algorithms assume that the observat...
research
09/07/2018

Improving On-policy Learning with Statistical Reward Accumulation

Deep reinforcement learning has obtained significant breakthroughs in re...
research
04/06/2017

Geometry of Policy Improvement

We investigate the geometry of optimal memoryless time independent decis...
research
02/15/2022

L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning

This paper proposes a new regularization technique for reinforcement lea...
research
09/29/2018

Reinforcement Learning in R

Reinforcement learning refers to a group of methods from artificial inte...
research
12/02/2020

Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks

In this paper, we consider jointly optimizing cell load balance and netw...

Please sign up or login with your details

Forgot password? Click here to reset