Anticipating Visual Representations from Unlabeled Video

04/29/2015
by   Carl Vondrick, et al.
0

Anticipating actions and objects before they start or appear is a difficult problem in computer vision with several real-world applications. This task is challenging partly because it requires leveraging extensive knowledge of the world that is difficult to write down. We believe that a promising resource for efficiently learning this knowledge is through readily available unlabeled video. We present a framework that capitalizes on temporal structure in unlabeled video to learn to anticipate human actions and objects. The key idea behind our approach is that we can train deep networks to predict the visual representation of images in the future. Visual representations are a promising prediction target because they encode images at a higher semantic level than pixels yet are automatic to compute. We then apply recognition algorithms on our predicted representation to anticipate objects and actions. We experimentally validate this idea on two datasets, anticipating actions one second in the future and objects five seconds in the future.

READ FULL TEXT

page 5

page 7

research
01/01/2021

Learning the Predictability of the Future

We introduce a framework for learning from unlabeled video what is predi...
research
05/25/2016

Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

While great strides have been made in using deep learning algorithms to ...
research
07/18/2014

Pixels to Voxels: Modeling Visual Representation in the Human Brain

The human brain is adept at solving difficult high-level visual processi...
research
05/20/2017

Forecasting Hands and Objects in Future Frames

This paper presents an approach to forecast future presence and location...
research
05/23/2017

Visual Semantic Planning using Deep Successor Representations

A crucial capability of real-world intelligent agents is their ability t...
research
03/27/2013

Utility-Based Control for Computer Vision

Several key issues arise in implementing computer vision recognition of ...
research
09/25/2018

Learning and Anticipating Future Actions During Exploratory Data Analysis

The goal of visual analytics is to create a symbiosis between human and ...

Please sign up or login with your details

Forgot password? Click here to reset