WhyAct: Identifying Action Reasons in Lifestyle Vlogs

09/06/2021
by Oana Ignat et al.

We aim to automatically identify human action reasons in online videos. We focus on the widespread genre of lifestyle vlogs, in which people perform actions while verbally describing them. We introduce and make publicly available the WhyAct dataset, consisting of 1,077 visual actions manually annotated with their reasons. We describe a multimodal model that leverages visual and textual information to automatically infer the reasons corresponding to an action presented in the video.


Related research:

06/10/2019 - Identifying Visible Actions in Lifestyle Vlogs
We consider the task of identifying human actions visible in online vide...

09/12/2023 - Human Action Co-occurrence in Lifestyle Vlogs using Graph Link Prediction
We introduce the task of automatic human action co-occurrence identifica...

05/02/2020 - A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos
Procedural knowledge, which we define as concrete information about the ...

04/21/2016 - Online Action Detection
In online action detection, the goal is to detect the start of an action...

03/27/2023 - Learning Action Changes by Measuring Verb-Adverb Textual Relationships
The goal of this work is to understand the way actions are performed in ...

03/10/2020 - Video Caption Dataset for Describing Human Actions in Japanese
In recent years, automatic video caption generation has attracted consid...

06/07/2015 - Describing Common Human Visual Actions in Images
Which common human actions and interactions are recognizable in monocula...
