Multi-Level Compositional Reasoning for Interactive Instruction Following

08/18/2023
by Suvaansh Bhambri, et al.
Robotic agents that perform domestic chores from natural language directives must master the complex tasks of navigating an environment and interacting with the objects in it. The tasks given to such agents are often composite and therefore challenging, since completing them requires reasoning about multiple subtasks, e.g., bring a cup of coffee. To address this challenge, we propose a divide-and-conquer approach that breaks the task into multiple subgoals and attends to them individually for better navigation and interaction. We call it the Multi-level Compositional Reasoning Agent (MCR-Agent). Specifically, we learn a three-level action policy. At the highest level, a high-level policy composition controller infers a sequence of human-interpretable subgoals to be executed based on the language instructions. At the middle level, a master policy discriminatively controls the agent's navigation by alternating between a navigation policy and various independent interaction policies. Finally, at the lowest level, we infer manipulation actions with the corresponding object masks using the appropriate interaction policy. Our approach not only generates human-interpretable subgoals but also achieves a 2.03% absolute gain over comparable state-of-the-art methods in the efficiency metric (PLWSR on the unseen set) without using rule-based planning or a semantic spatial memory.
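
The three-level control flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function below stands in for a learned network, and the subgoal labels (GOTO, PICKUP, PUT, TOGGLE), the action names, the fixed plan for "bring a cup of coffee", and the use of an object name in place of a predicted mask are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One low-level step emitted by the agent (hypothetical structure)."""
    level: str                         # "navigation" or "interaction"
    action: str                        # placeholder low-level action name
    mask_target: Optional[str] = None  # stand-in for a predicted object mask


def policy_composition_controller(instruction: str) -> List[Tuple[str, str]]:
    """Highest level: instruction -> sequence of (subgoal, object) pairs.

    A fixed lookup table replaces the learned controller here; the paper
    learns this mapping rather than using rules.
    """
    toy_plans: Dict[str, List[Tuple[str, str]]] = {
        "bring a cup of coffee": [
            ("GOTO", "mug"),
            ("PICKUP", "mug"),
            ("GOTO", "coffeemachine"),
            ("PUT", "coffeemachine"),
            ("TOGGLE", "coffeemachine"),
        ]
    }
    return toy_plans[instruction.lower()]


def navigation_policy(target: str) -> List[Step]:
    """Navigation policy placeholder: emits a single movement step."""
    return [Step(level="navigation", action="MoveAhead")]


def make_interaction_policy(action: str) -> Callable[[str], List[Step]]:
    """Lowest level: build an independent interaction policy that returns
    a manipulation action together with its (stand-in) object mask."""
    def policy(target: str) -> List[Step]:
        return [Step(level="interaction", action=action, mask_target=target)]
    return policy


# One independent interaction policy per interaction subgoal (assumed set).
INTERACTION_POLICIES: Dict[str, Callable[[str], List[Step]]] = {
    "PICKUP": make_interaction_policy("PickupObject"),
    "PUT": make_interaction_policy("PutObject"),
    "TOGGLE": make_interaction_policy("ToggleObjectOn"),
}


def master_policy(subgoals: List[Tuple[str, str]]) -> List[Step]:
    """Middle level: alternate between the navigation policy and the
    interaction policy that matches the current subgoal."""
    trajectory: List[Step] = []
    for subgoal, target in subgoals:
        if subgoal == "GOTO":
            trajectory.extend(navigation_policy(target))
        else:
            trajectory.extend(INTERACTION_POLICIES[subgoal](target))
    return trajectory


def run_agent(instruction: str) -> List[Step]:
    """Chain the three levels: subgoal inference, dispatch, low-level steps."""
    return master_policy(policy_composition_controller(instruction))


if __name__ == "__main__":
    for step in run_agent("bring a cup of coffee"):
        print(step)
```

The point of the sketch is the separation of concerns: the subgoal sequence is human-readable on its own, and each interaction policy can be swapped out without touching navigation.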

