Actor and Observer: Joint Modeling of First and Third-Person Videos

04/25/2018
by Gunnar A. Sigurdsson, et al.

Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between the third-person (observer) and first-person (actor) perspectives. Despite this, learning such models for human action recognition has not been achievable due to the lack of data. This paper takes a step in this direction by introducing Charades-Ego, a large-scale dataset of paired first-person and third-person videos, with 4000 paired videos involving 112 people. The dataset makes it possible to learn the link between the two perspectives, actor and observer. We thereby address one of the biggest bottlenecks facing egocentric vision research by providing a link from first-person data to the abundant third-person data on the web. We use this data to learn a joint representation of first- and third-person videos with only weak supervision, and show its effectiveness for transferring knowledge from the third-person to the first-person domain.

research
04/25/2018

Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos

In Actor and Observer we introduced a dataset linking the first and thir...
research
09/21/2022

FT-HID: A Large Scale RGB-D Dataset for First and Third Person Human Interaction Analysis

Analysis of human interaction is one important research topic of human m...
research
04/01/2023

DOAD: Decoupled One Stage Action Detection Network

Localizing people and recognizing their actions from videos is a challen...
research
11/30/2017

Future Person Localization in First-Person Videos

We present a new task that predicts future locations of people observed ...
research
04/16/2021

Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos

We introduce an approach for pre-training egocentric video models using ...
research
08/06/2022

Study of detecting behavioral signatures within DeepFake videos

There is strong interest in the generation of synthetic video imagery of...
research
11/29/2016

Social Behavior Prediction from First Person Videos

This paper presents a method to predict the future movements (location a...
