Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

06/16/2018
by Wenhan Xiong, et al.

We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs. In contrast to existing methods, which start with learning from demonstrations (LfD) and then use reinforcement learning (RL) to fine-tune the model parameters, we propose a novel policy optimization algorithm that dynamically schedules demonstration learning and RL. The proposed training paradigm provides efficient exploration and better generalization beyond existing methods. Compared to existing ensemble models, the best single model based on our proposed method decreases the execution error by over 50% on a block-world environment. To further illustrate the exploration strategy of our RL algorithm, we also include systematic studies on the evolution of policy entropy during training.
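To make two ideas from the abstract concrete, here is a minimal Python sketch. The entropy function is the standard Shannon entropy of a discrete action distribution, which is one common way to measure policy entropy. The scheduling rule is purely hypothetical: the abstract does not state the criterion the algorithm uses to switch between LfD and RL, so `schedule_update`, `run_schedule`, and `return_threshold` are illustrative names and assumptions, not the authors' method.

```python
import math


def policy_entropy(action_probs):
    """Shannon entropy (in nats) of a discrete action distribution.

    Zero-probability actions contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in action_probs if p > 0.0)


def schedule_update(episode_return, return_threshold=0.0):
    """Pick the update type for one training episode.

    ASSUMPTION (not from the abstract): use an RL update when the agent's
    own rollout reached the threshold, otherwise fall back to learning
    from the demonstration for that episode.
    """
    return "rl" if episode_return >= return_threshold else "lfd"


def run_schedule(episode_returns, return_threshold=0.0):
    """Apply the hypothetical rule to a sequence of episode returns."""
    return [schedule_update(r, return_threshold) for r in episode_returns]


if __name__ == "__main__":
    # A uniform policy over 4 actions has the maximum entropy, log(4).
    print(policy_entropy([0.25, 0.25, 0.25, 0.25]))
    # A deterministic policy has zero entropy.
    print(policy_entropy([1.0, 0.0, 0.0, 0.0]))
    # Mixed schedule driven by made-up episode returns.
    print(run_schedule([1.0, -0.5, 0.0]))
```

Tracking `policy_entropy` over training steps gives a rough view of how much the policy is still exploring: values near the logarithm of the number of actions indicate near-uniform action choice, and values near zero indicate a nearly deterministic policy.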


