Control with adaptive Q-learning

11/03/2020
by João Pedro Araújo, et al.

This paper evaluates adaptive Q-learning (AQL) and single-partition adaptive Q-learning (SPAQL), two algorithms for efficient model-free episodic reinforcement learning (RL), on two classical control problems (Pendulum and Cartpole). AQL adaptively partitions the state-action space of a Markov decision process (MDP) while learning the control policy, i.e., the mapping from states to actions. The main difference between AQL and SPAQL is that the latter learns time-invariant policies, in which the mapping from states to actions does not depend explicitly on the time step. This paper also proposes SPAQL with terminal state (SPAQL-TS), an improved version of SPAQL tailored to the design of regulators for control problems. The time-invariant policies are shown to perform better than the time-variant ones in both problems studied. These algorithms are particularly well suited to RL problems where the action space is finite, as is the case with the Cartpole problem. SPAQL-TS solves the OpenAI Gym Cartpole problem, while also displaying higher sample efficiency than trust region policy optimization (TRPO), a standard RL algorithm for solving control tasks. Moreover, the policies learned by SPAQL are interpretable, whereas TRPO policies are typically encoded as neural networks and are therefore hard to interpret. Interpretable policies and sample efficiency are the major advantages of SPAQL.
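
To make the idea of adaptive partitioning with a single, time-invariant partition more concrete, here is a minimal Python sketch. It is not the paper's AQL, SPAQL, or SPAQL-TS algorithm: it assumes a one-dimensional state space, a finite action set, a plain discounted Q-learning update, and a fixed visit-count rule for splitting cells. The class names `Cell` and `AdaptivePartitionQ` and every parameter (`split_after`, `lr`, `gamma`, `min_width`, `q_init`) are hypothetical and chosen only for illustration.

```python
class Cell:
    """One interval [lo, hi) of a 1-D state space, with one Q-value per action."""

    def __init__(self, lo, hi, q):
        self.lo = lo
        self.hi = hi
        self.q = list(q)   # copy, so sibling cells do not share Q-values
        self.visits = 0


class AdaptivePartitionQ:
    """Illustrative Q-learning over one partition that is refined where it is visited.

    Assumption: the same partition is used at every time step, so the greedy
    policy depends only on the state, not on the time step.
    """

    def __init__(self, lo, hi, n_actions, q_init=1.0,
                 split_after=2, lr=0.5, gamma=0.9, min_width=1e-3):
        self.cells = [Cell(lo, hi, [q_init] * n_actions)]  # kept sorted by lo
        self.split_after = split_after
        self.lr = lr
        self.gamma = gamma
        self.min_width = min_width

    def find(self, s):
        """Return the cell containing s, clamping states outside the range."""
        for cell in self.cells:
            if s < cell.hi:
                return cell
        return self.cells[-1]

    def act(self, s):
        """Greedy action: index of the largest Q-value in the cell containing s."""
        q = self.find(s).q
        return max(range(len(q)), key=q.__getitem__)

    def update(self, s, a, r, s_next, done):
        """One Q-learning step, followed by a split if the cell is visited often."""
        cell = self.find(s)
        target = r if done else r + self.gamma * max(self.find(s_next).q)
        cell.q[a] += self.lr * (target - cell.q[a])
        cell.visits += 1
        wide_enough = (cell.hi - cell.lo) > self.min_width
        if cell.visits >= self.split_after and wide_enough:
            self._split(cell)

    def _split(self, cell):
        """Replace a cell by two halves that inherit its Q-values."""
        mid = 0.5 * (cell.lo + cell.hi)
        i = self.cells.index(cell)
        self.cells[i:i + 1] = [Cell(cell.lo, mid, cell.q),
                               Cell(mid, cell.hi, cell.q)]
```

In this sketch, the partition becomes finer only in the regions of the state space that the agent actually visits, and the learned policy can be read off directly as a table of intervals and their greedy actions, which is the sense in which such a policy is easier to inspect than a neural network.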

research
07/14/2020

Single-partition adaptive Q-learning

This paper introduces single-partition adaptive Q-learning (SPAQL), an a...
research
09/02/2020

A reinforcement learning approach to hybrid control design

In this paper we design hybrid control policies for hybrid systems whose...
research
09/10/2020

A framework for reinforcement learning with autocorrelated actions

The subject of this paper is reinforcement learning. Policies are consid...
research
07/09/2020

On the Reliability and Generalizability of Brain-inspired Reinforcement Learning Algorithms

Although deep RL models have shown a great potential for solving various...
research
03/16/2022

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act

Traditionally, Reinforcement Learning (RL) aims at deciding how to act o...
research
08/02/2022

Implicit Two-Tower Policies

We present a new class of structured reinforcement learning policy-archi...
research
02/02/2019

Certified Reinforcement Learning with Logic Guidance

This paper proposes the first model-free Reinforcement Learning (RL) fra...
