DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

11/22/2019
by Mohammadhosein Hasanbeig, et al.

We propose a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian and progress towards the reward requires attaining an unknown sequence of high-level objectives. Our method employs a recently published algorithm for synthesis of compact automata to uncover this sequential structure. We synthesise an automaton from trace data generated through exploration of the environment by the deep RL agent. A product construction is then used to enrich the state space of the environment, so that generation of an optimal control policy by deep RL is guided by the discovered structure encoded in the automaton. Our experiments show that our method achieves training results that are otherwise difficult to obtain with state-of-the-art RL techniques unaided by external guidance.
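
The product construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Automaton` class, `product_step`, the toy corridor environment, and the labels "key" and "door" are all hypothetical names chosen for illustration, and the automaton is hand-written here rather than synthesised from traces. The sketch also assumes that a label with no outgoing edge leaves the automaton state unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Automaton:
    """Deterministic automaton over high-level labels (hypothetical structure)."""
    initial: int
    transitions: Dict[Tuple[int, str], int] = field(default_factory=dict)

    def step(self, q: int, label: str) -> int:
        # Assumption: stay in the current state when no edge matches the label.
        return self.transitions.get((q, label), q)


def product_step(automaton, env_step, label_of, state, action):
    """Advance the product of environment and automaton by one step.

    `state` is a pair (environment state, automaton state). The automaton
    component records which high-level objectives have been reached so far.
    """
    s, q = state
    s_next, reward = env_step(s, action)
    q_next = automaton.step(q, label_of(s_next))
    return (s_next, q_next), reward


# Toy corridor with positions 0..3: a key at position 1, a door at position 3.
def corridor_step(s, a):
    return max(0, min(3, s + a)), 0.0


def corridor_label(s):
    return {1: "key", 3: "door"}.get(s, "empty")


# Hand-written automaton: the door only counts after the key was collected.
aut = Automaton(initial=0, transitions={(0, "key"): 1, (1, "door"): 2})

state = (0, aut.initial)
for a in [1, 1, 1]:
    state, _ = product_step(aut, corridor_step, corridor_label, state, a)
```

In this sketch, an agent that observes the pair (position, automaton state) can tell "at the door with the key" apart from "at the door without the key", which the position alone does not reveal.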

11/22/2019

DeepSynth: Program Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

We propose a method for efficient training of deep Reinforcement Learnin...
10/24/2020

Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search

Local policy search is performed by most Deep Reinforcement Learning (D-...
09/06/2021

Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning

Designing optimal reward functions has been desired but extremely diffic...
09/08/2020

Induction and Exploitation of Subgoal Automata for Reinforcement Learning

In this paper we present ISA, an approach for learning and exploiting su...
11/25/2019

A Deep Reinforcement Learning Architecture for Multi-stage Optimal Control

Deep reinforcement learning for high dimensional, hierarchical control t...
10/05/2021

Deep reinforcement learning for guidewire navigation in coronary artery phantom

In percutaneous intervention for treatment of coronary plaques, guidewir...
05/14/2022

PrefixRL: Optimization of Parallel Prefix Circuits using Deep Reinforcement Learning

In this work, we present a reinforcement learning (RL) based approach to...