Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

09/22/2018
by   Meera Hahn, et al.

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision. The task is challenging because of the semantic gap between the visual and natural language domains. This paper addresses the task of automatically generating an alignment between a set of instructions and a first-person video demonstrating an activity. The sparse descriptions and ambiguity of written instructions create significant alignment challenges. The key to our approach is the use of egocentric cues to generate a concise set of action proposals, which are then matched to recipe steps using object recognition and computational linguistic techniques. We obtain promising results on both the Extended GTEA Gaze+ dataset and the Bristol Egocentric Object Interactions Dataset.
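The matching stage described above, in which action proposals are paired with recipe steps, can be illustrated with a small sketch. This is not the paper's method: it assumes each proposal is already reduced to a set of detected object labels, substitutes plain word overlap for the object recognition and linguistic analysis, and assumes steps occur in the written order. The names `step_words`, `match_score`, and `align` are hypothetical.

```python
from typing import List, Set


def step_words(step: str) -> Set[str]:
    """Lowercased word set of an instruction (a stand-in for real linguistic parsing)."""
    return {w.strip(".,;:!?").lower() for w in step.split()} - {""}


def match_score(objects: Set[str], step: str) -> float:
    """Fraction of a proposal's detected objects that the step text mentions."""
    if not objects:
        return 0.0
    return len(objects & step_words(step)) / len(objects)


def align(proposals: List[Set[str]], steps: List[str]) -> List[int]:
    """Assign each proposal a step index, non-decreasing over time,
    maximizing the summed match score (assumed monotonic ordering)."""
    n, m = len(proposals), len(steps)
    if n == 0 or m == 0:
        return []
    score = [[match_score(p, s) for s in steps] for p in proposals]
    best = [[0.0] * m for _ in range(n)]  # best[i][j]: best total with proposal i on step j
    back = [[0] * m for _ in range(n)]    # back[i][j]: step chosen for proposal i-1
    best[0] = score[0][:]
    for i in range(1, n):
        prev_j = 0  # index of the best step <= j for proposal i-1
        for j in range(m):
            if best[i - 1][j] > best[i - 1][prev_j]:
                prev_j = j
            best[i][j] = best[i - 1][prev_j] + score[i][j]
            back[i][j] = prev_j
    # Trace back from the best final step.
    j = max(range(m), key=lambda k: best[n - 1][k])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    return path[::-1]


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    proposals = [{"bread"}, {"knife", "butter"}, {"bread", "plate"}]
    steps = [
        "Take the bread.",
        "Spread butter with a knife.",
        "Put the bread on a plate.",
    ]
    print(align(proposals, steps))
```

On the made-up example in the `__main__` block, the three proposals are assigned to steps 0, 1, and 2 in order. The non-decreasing constraint is a simplifying assumption; sparse or ambiguous instructions may need a looser alignment than this sketch allows.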


