Commands 4 Autonomous Vehicles (C4AV) Workshop Summary

09/18/2020
by Thierry Deruyttere, et al.

The task of visual grounding requires locating the most relevant region or object in an image, given a natural language query. So far, progress on this task has mostly been measured on curated datasets, which are not always representative of human spoken language. In this work, we deviate from recent, popular task settings and consider the problem under an autonomous vehicle scenario. In particular, we consider a situation where passengers give free-form natural language commands to a vehicle, each of which can be associated with an object in the street scene. To stimulate research on this topic, we organized the Commands for Autonomous Vehicles (C4AV) challenge based on the recent Talk2Car dataset (URL: https://www.aicrowd.com/challenges/eccv-2020-commands-4-autonomous-vehicles). This paper presents the results of the challenge. First, we compare the benchmark used against existing datasets for visual grounding. Second, we identify the aspects that make top-performing models successful and relate them to existing state-of-the-art models for visual grounding; we also detect potential failure cases by evaluating on carefully selected subsets. Finally, we discuss several possibilities for future work.
