Core Challenges in Embodied Vision-Language Planning

06/26/2021
by Jonathan Francis, et al.

Recent advances in multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous surveys have characterized one or two of these dimensions, there has not been a holistic analysis at the intersection of all three. Moreover, even when combinations of these topics are considered, the focus tends to be on describing, e.g., current architectural methods, rather than also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of new and current algorithmic approaches, metrics, simulated environments, and datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment.
