EgoTV: Egocentric Task Verification from Natural Language Task Descriptions

03/29/2023
by Rishi Hazra, et al.

To enable progress towards egocentric agents capable of understanding everyday tasks specified in natural language, we propose a benchmark and a synthetic dataset called Egocentric Task Verification (EgoTV). EgoTV contains multi-step tasks with multiple sub-task decompositions, state changes, object interactions, and sub-task ordering constraints, in addition to abstracted task descriptions that contain only partial details about ways to accomplish a task. We also propose a novel Neuro-Symbolic Grounding (NSG) approach to enable causal, temporal, and compositional reasoning about such tasks. We demonstrate NSG's capability for task tracking and verification on our EgoTV dataset and on a real-world dataset derived from CrossTask (CTV). Our contributions include the release of the EgoTV and CTV datasets, and the NSG model for future research on egocentric assistive agents.


