Forecasting Hands and Objects in Future Frames

05/20/2017
by   Chenyou Fan, et al.
0

This paper presents an approach to forecast future presence and location of human hands and objects. Given an image frame, the goal is to predict what objects will appear in the future frame (e.g., 5 seconds later) and where they will be located at, even when they are not visible in the current frame. The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts scene information in its frame and that (2) we can predict (i.e., regress) such representations corresponding to the future frames based on that of the current frame. We design a new two-stream convolutional neural network (CNN) architecture for videos by extending the state-of-the-art convolutional object detection network, and present a new fully convolutional regression network for predicting future scene representations. Our experiments confirm that combining the regressed future representation with our detection network allows reliable estimation of future hands and objects in videos. We obtain much higher accuracy compared to the state-of-the-art future object presence forecast method on a public dataset.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 3

page 7

page 9

03/03/2017

Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

We design a new approach that allows robot learning of new activities fr...
06/16/2021

Unsupervised Video Prediction from a Single Frame by Estimating 3D Dynamic Scene Structure

Our goal in this work is to generate realistic videos given just one ini...
04/10/2017

ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

Object detection in wide area motion imagery (WAMI) has drawn the attent...
04/24/2019

Segmenting the Future

Predicting the future is an important aspect for decision-making in robo...
04/29/2015

Anticipating Visual Representations from Unlabeled Video

Anticipating actions and objects before they start or appear is a diffic...
07/26/2019

A Fully-Convolutional Neural Network for Background Subtraction of Unseen Videos

Background subtraction is a basic task in computer vision and video proc...
11/13/2020

Adaptive Future Frame Prediction with Ensemble Network

Future frame prediction in videos is a challenging problem because video...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.