Deep Q-learning: a robust control approach

01/21/2022
by Balázs Varga, et al.

In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control. We formulate an uncertain linear time-invariant model by means of the neural tangent kernel to describe learning. We show the instability of learning and analyze the agent's behavior in the frequency domain. Then, we ensure convergence via robust controllers acting as dynamical rewards in the loss function. We synthesize three controllers: a state-feedback gain-scheduling ℋ_2 controller, a dynamic ℋ_∞ controller, and a constant-gain ℋ_∞ controller. Tuning the learning agent with a control-oriented methodology is more transparent than relying on the heuristics common in reinforcement learning, and it is backed by well-established literature. In addition, our approach uses neither a target network nor a randomized replay memory. The role of the target network is taken over by the control input, which also exploits the temporal dependency of samples (as opposed to a randomized memory buffer). Numerical simulations in different OpenAI Gym environments suggest that ℋ_∞-controlled learning performs slightly better than double deep Q-learning.
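
To make the tangent-kernel view of learning concrete, here is a minimal sketch, not the paper's model. It uses a toy Q-function that is linear in its parameters, so the first-order (tangent kernel) linearization is exact, and checks that one gradient step at a sample `x_k` changes the Q-value at another point by the learning rate times the kernel times the temporal-difference error. The feature map, the numeric values, and the scalar `u` (standing in for a controller output added to the reward) are all assumptions made for illustration.

```python
import numpy as np

def features(x):
    # Hypothetical feature map. Q(x) = features(x) @ theta is linear in theta,
    # so the tangent-kernel linearization is exact for this toy model.
    return np.array([1.0, x, x ** 2])

def ntk(x1, x2):
    # Empirical tangent kernel: inner product of the parameter gradients of Q.
    return features(x1) @ features(x2)

def td_step(theta, x, target, lr):
    # One semi-gradient step on 0.5 * (target - Q(x))**2 with the target held fixed.
    td_error = target - features(x) @ theta
    return theta + lr * td_error * features(x), td_error

theta = np.array([0.1, -0.2, 0.05])
x_k, x_query = 0.5, 1.5          # training sample and a separate query state
reward, gamma, q_next = 1.0, 0.99, 0.4
u = -0.3                         # assumed control input acting as an extra reward term
target = reward + u + gamma * q_next
lr = 0.01

q_before = features(x_query) @ theta
theta_new, td_error = td_step(theta, x_k, target, lr)
q_after = features(x_query) @ theta_new

# Change in Q at the query point predicted by the kernel alone.
predicted_change = lr * ntk(x_query, x_k) * td_error
print(q_after - q_before, predicted_change)
```

For a deep network the same relation holds only approximately, which is presumably why the abstract speaks of an uncertain linear model rather than an exact one.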

research
03/26/2022

Robust Fuzzy Q-Learning-Based Strictly Negative Imaginary Tracking Controllers for the Uncertain Quadrotor Systems

Quadrotors are one of the popular unmanned aerial vehicles (UAVs) due to...
research
10/18/2017

The Effects of Memory Replay in Reinforcement Learning

Experience replay is a key technique behind many recent advances in deep...
research
11/26/2020

Reinforcement Learning for Robust Missile Autopilot Design

Designing missiles' autopilot controllers has been a complex task, given...
research
04/29/2020

Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations

We present a set of model-free, reduced-dimensional reinforcement learni...
research
12/02/2022

STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning

Deep Reinforcement Learning (DRL) has the potential to be used for synth...
research
08/16/2003

Controlled hierarchical filtering: Model of neocortical sensory processing

A model of sensory information processing is presented. The model assume...
research
11/03/2022

Sensor Control for Information Gain in Dynamic, Sparse and Partially Observed Environments

We present an approach for autonomous sensor control for information gat...
