Predicting Motivations of Actions by Leveraging Text

06/20/2014
by Carl Vondrick, et al.

Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person in an image has performed an action. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action.

research
02/04/2018

Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

We introduce the first benchmark for a new problem --- recognizing human...
research
08/29/2017

Modelling Protagonist Goals and Desires in First-Person Narrative

Many genres of natural language text are narratively structured, a testa...
research
04/17/2016

Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance

Understanding images with people often entails understanding their inter...
research
12/05/2022

Muscles in Action

Small differences in a person's motion can engage drastically different ...
research
12/17/2016

EgoTransfer: Transferring Motion Across Egocentric and Exocentric Domains using Deep Neural Networks

Mirror neurons have been observed in the primary motor cortex of primate...
research
01/10/2017

See the Glass Half Full: Reasoning about Liquid Containers, their Volume and Content

Humans have rich understanding of liquid containers and their contents; ...
research
04/22/2019

The Profiling Potential of Computer Vision and the Challenge of Computational Empiricism

Computer vision and other biometrics data science applications have comm...
