Learning Environment Models with Continuous Stochastic Dynamics

06/29/2023
by Martin Tappler, et al.

Solving control tasks in complex environments automatically through learning offers great potential. While contemporary techniques from deep reinforcement learning (DRL) provide effective solutions, their decision-making is not transparent. We aim to provide insights into the decisions faced by the agent by learning an automaton model of environmental behavior under the control of an agent. However, for most control problems, automata learning is not scalable enough to learn a useful model. In this work, we extend automata learning so that it can learn models of environments with complex, continuous dynamics. The scalability of our method rests on computing an abstract state-space representation by applying dimensionality reduction and clustering to the observed environmental state space. The stochastic transitions are learned via passive automata learning from observed interactions between the agent and the environment. In an iterative model-based RL process, we sample additional trajectories to learn an accurate environment model in the form of a discrete-state Markov decision process (MDP). We apply our automata learning framework to popular RL benchmarking environments from OpenAI Gym, including LunarLander, CartPole, Mountain Car, and Acrobot. Our results show that the learned models are precise enough to enable the computation of policies that solve the respective control tasks. Yet the models are more concise and more general than neural-network-based policies, and by using MDPs we benefit from a wealth of tools available for analyzing them. On the LunarLander task, the learned model even achieved similar or higher rewards than deep RL policies learned with stable-baselines3.
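The abstraction idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it skips dimensionality reduction, uses a plain k-means as the clustering step, and replaces passive automata learning with simple frequency counting over abstract states. The function names (`nearest`, `kmeans`, `estimate_mdp`), the trajectory format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centers (deterministic)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Clusters that received no points keep their previous center.
        for i, members in groups.items():
            centers[i] = [sum(xs) / len(members) for xs in zip(*members)]
    return centers

def estimate_mdp(trajectories, centers):
    """Estimate P(s' | s, a) over abstract states by relative frequency.

    Assumed format: each trajectory is a list of (continuous_state, action)
    pairs, where action is the one taken in that state (None at the end).
    """
    counts = defaultdict(Counter)
    for traj in trajectories:
        for (state, action), (next_state, _) in zip(traj, traj[1:]):
            s, s_next = nearest(state, centers), nearest(next_state, centers)
            counts[(s, action)][s_next] += 1
    return {key: {s2: n / sum(c.values()) for s2, n in c.items()}
            for key, c in counts.items()}

# Hypothetical toy data: 2-D states clustered into two abstract states.
trajectories = [
    [((0.0, 0.1), "right"), ((0.9, 1.0), "left"), ((0.1, 0.0), None)],
    [((0.1, 0.1), "right"), ((1.0, 0.9), "left"), ((0.0, 0.0), None)],
    [((0.0, 0.0), "right"), ((0.1, 0.2), None)],
]
points = [state for traj in trajectories for state, _ in traj]
centers = kmeans(points, k=2)
mdp = estimate_mdp(trajectories, centers)
```

In this toy data, `mdp` maps each (abstract state, action) pair to a distribution over successor abstract states, which is the kind of discrete-state MDP that standard analysis tools can consume. The iterative sampling of additional trajectories described in the abstract is not shown.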


research
06/23/2022

Reinforcement Learning under Partial Observability Guided by Learned Environment Models

In practical applications, we can rarely assume full observability of a ...
research
07/09/2020

On the Reliability and Generalizability of Brain-inspired Reinforcement Learning Algorithms

Although deep RL models have shown a great potential for solving various...
research
12/04/2022

Automata Learning meets Shielding

Safety is still one of the major research challenges in reinforcement le...
research
02/19/2023

Stochastic Generative Flow Networks

Generative Flow Networks (or GFlowNets for short) are a family of probab...
research
06/22/2016

Visualizing Dynamics: from t-SNE to SEMI-MDPs

Deep Reinforcement Learning (DRL) is a trending field of research, showi...
research
09/08/2020

Induction and Exploitation of Subgoal Automata for Reinforcement Learning

In this paper we present ISA, an approach for learning and exploiting su...
