Revisiting EmbodiedQA: A Simple Baseline and Beyond

04/08/2019
by Yu Wu, et al.

In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather the information it needs to answer user questions. Existing works have laid a solid foundation for solving this problem, but current performance, especially in navigation, suggests that EmbodiedQA may be too challenging for current approaches. In this paper, we study the problem empirically and introduce 1) a simple yet effective baseline that can be optimized end-to-end by SGD; and 2) an easier and more practical setting for EmbodiedQA in which an agent has a chance to adapt the trained model to a new environment before it actually answers users' questions. In the new setting, we randomly place a few objects in new environments and update the agent policy with a distillation network so that it retains the generalization ability of the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results that are very competitive with the state of the art; in the new setting, we find that this small change in setting yields a notable gain in navigation.
