A2: Extracting Cyclic Switchings from DOB-nets for Rejecting Excessive Disturbances

11/01/2019
by Wenjie Lu, et al.

Reinforcement Learning (RL) is limited in practice by its gray-box nature, which leaves users with insufficient trust, limited interpretability for human intervention, and inadequate analysis for future improvement. This paper seeks to partially characterize the interplay between dynamical environments and the DOB-net. The DOB-net, obtained from RL, solves a set of Partially Observable Markov Decision Processes (POMDPs). The transition function of each POMDP is largely determined by the environment, which in this research consists of excessive external disturbances. This paper proposes an Attention-based Abstraction (A^2) approach to extract a finite-state automaton, referred to as a Key Moore Machine Network (KMMN), that captures the switching mechanisms exhibited by the DOB-net in dealing with multiple such POMDPs. The approach first quantizes the controlled platform by learning continuous-discrete interfaces. It then extracts the KMMN by finding the key hidden states and transitions that attract sufficient attention from the DOB-net. Within the resultant KMMN, this study found three patterns of cyclic switching (between key hidden states), showing that controls near saturation are synchronized with the unknown disturbances. Interestingly, the switching mechanism found here has appeared previously in the design of hybrid control for often-saturated systems. It is further interpreted via an analogy to the discrete-event subsystem in hybrid control.
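The two-step idea in the abstract (quantize, then keep only the key states and transitions and look for cycles) can be illustrated with a toy sketch. This is not the authors' A^2 implementation: it assumes the hidden states and controls have already been quantized to discrete labels, and it uses visit share as a crude stand-in for the attention criterion, which the abstract does not specify. The names `extract_key_machine`, `find_cycle`, and `min_share` are hypothetical.

```python
from collections import Counter


def extract_key_machine(trace, min_share=0.1):
    """Build a small Moore-style machine from a quantized trace.

    trace: list of (state, action) pairs with discrete labels.
    min_share: assumed threshold; a state is "key" if it accounts for
    at least this share of all visits (a stand-in for attention).
    """
    states = [s for s, _ in trace]
    visits = Counter(states)
    total = len(states)
    key = {s for s, c in visits.items() if c / total >= min_share}

    # Moore output: the most common action observed in each key state.
    outputs = {}
    for s in key:
        actions = Counter(a for st, a in trace if st == s)
        outputs[s] = actions.most_common(1)[0][0]

    # Count transitions between consecutive key states, dropping
    # non-key states and self-loops.
    key_seq = [s for s in states if s in key]
    transitions = Counter(
        (a, b) for a, b in zip(key_seq, key_seq[1:]) if a != b
    )
    return key, outputs, transitions


def find_cycle(transitions, start):
    """Follow the most frequent outgoing transition until a state repeats.

    Returns the list of states on the cycle, or None if a dead end is hit.
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        outgoing = {b: c for (a, b), c in transitions.items() if a == current}
        if not outgoing:
            return None
        nxt = max(outgoing, key=outgoing.get)
        if nxt in seen:
            return path[path.index(nxt):]
        path.append(nxt)
        seen.add(nxt)
        current = nxt
```

For a trace that repeatedly visits three states in order, with one rarely visited state in between, the rare state is pruned and `find_cycle` recovers the three-state loop. A real extraction would replace the visit-share threshold with whatever attention measure the paper defines.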

research
09/10/2015

Recurrent Reinforcement Learning: A Hybrid Approach

Successful applications of reinforcement learning in real-world problems...
research
05/15/2001

Market-Based Reinforcement Learning in Partially Observable Worlds

Unlike traditional reinforcement learning (RL), market-based RL is in pr...
research
11/06/2022

On learning history based policies for controlling Markov decision processes

Reinforcement learning (RL) folklore suggests that history-based function approx...
research
04/19/2022

When Is Partially Observable Reinforcement Learning Not Scary?

Applications of Reinforcement Learning (RL), in which agents learn to ma...
research
07/10/2019

DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances

This paper presents an observer-integrated Reinforcement Learning (RL) a...
research
02/08/2023

Near-Optimal Adversarial Reinforcement Learning with Switching Costs

Switching costs, which capture the costs for changing policies, are rega...
