A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration

09/09/2022
by Xianqi Zhang, et al.

We pose a new question: can agents learn to combine actions from previous tasks to complete new tasks, as humans do? In contrast to imitation learning, there is no expert data, only data collected through environmental exploration. Compared with offline reinforcement learning, the problem of data distribution shift is more serious: the action sequence that solves a new task may be a combination of trajectory segments from multiple training tasks, so neither the test task nor its solving strategy appears directly in the training data. This makes the problem more difficult. We propose a Memory-related Multi-task Method (M3) to address it. The method consists of three stages. First, task-agnostic exploration is carried out to collect data; unlike previous methods, we organize the exploration data into a knowledge graph. Based on the exploration data, we design a model that extracts action effect features and saves them in memory, and we train an action predictive model. Second, for a new task, the action effect features stored in memory are used to generate candidate actions through a feature decomposition-based approach. Finally, a multi-scale candidate action pool and the action predictive model are fused to generate a strategy that completes the task. Experimental results show that the proposed method significantly outperforms the baseline.
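The idea that a new task may be solved by stitching together trajectory segments from different exploration runs can be illustrated with a toy sketch. This is not the M3 method: the state labels, the `build_graph` and `plan` helpers, and the use of plain breadth-first search in place of the learned feature extraction and action prediction are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def build_graph(trajectories):
    """Merge exploration trajectories into one transition graph.

    Each trajectory is a list of (state, action, next_state) tuples.
    (Hypothetical data layout, chosen only for this sketch.)
    """
    graph = defaultdict(dict)
    for trajectory in trajectories:
        for state, action, next_state in trajectory:
            graph[state][action] = next_state
    return graph


def plan(graph, start, goal):
    """Breadth-first search for a shortest action sequence from start to goal.

    Returns a list of actions, or None if the goal is unreachable.
    """
    if start == goal:
        return []
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        for action, next_state in graph.get(state, {}).items():
            if next_state in visited:
                continue
            if next_state == goal:
                return actions + [action]
            visited.add(next_state)
            queue.append((next_state, actions + [action]))
    return None


# Two exploration trajectories that share state "B" but never reach "E" from "A".
traj_a = [("A", "right", "B"), ("B", "right", "C")]
traj_b = [("B", "up", "D"), ("D", "up", "E")]

graph = build_graph([traj_a, traj_b])
result = plan(graph, "A", "E")
print(result)
```

Neither trajectory alone goes from "A" to "E"; the plan takes the first step from one trajectory and the remaining steps from the other, which is the kind of recombination the abstract describes.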


research
05/26/2022

TempoRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning

Efficient exploration is a crucial challenge in deep reinforcement learn...
research
12/30/2022

Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks

Adversarial imitation learning (AIL) has become a popular alternative to...
research
10/20/2021

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences

Incorporating prior knowledge in reinforcement learning algorithms is ma...
research
06/11/2020

PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes

We consider the problem of batch multi-task reinforcement learning with ...
research
11/29/2016

Exploration for Multi-task Reinforcement Learning with Deep Generative Models

Exploration in multi-task reinforcement learning is critical in training...
research
06/19/2022

Learning Multi-Task Transferable Rewards via Variational Inverse Reinforcement Learning

Many robotic tasks are composed of a lot of temporally correlated sub-ta...
research
04/17/2023

MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed

Nowadays, the mainstream approach in position allocation system is to ut...
