RED: Reinforced Encoder-Decoder Networks for Action Anticipation

07/16/2017
by Jiyang Gao, et al.

Action anticipation aims to detect an action before it happens. Many real-world applications in robotics and surveillance depend on this predictive capability. Current methods address the problem by first anticipating visual representations of future frames and then classifying the anticipated representations into actions. However, in these methods anticipation is based on a single past frame's representation, which ignores the trend across the history, and they can only anticipate at a fixed future time. We propose a Reinforced Encoder-Decoder (RED) network for action anticipation. RED takes multiple history representations as input and learns to anticipate a sequence of future representations. One salient aspect of RED is that a reinforcement module provides sequence-level supervision; the reward function is designed to encourage the system to make correct predictions as early as possible. We test RED on the TVSeries, THUMOS-14 and TV-Human-Interaction datasets for action anticipation and achieve state-of-the-art performance on all three.
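
The abstract does not give the form of the reward function, so the following is only a minimal sketch of one way a sequence-level reward could favor early correct predictions. The linear decay, the function names `step_rewards` and `sequence_reward`, and the integer label encoding are all assumptions made for illustration, not the authors' formulation.

```python
from typing import List, Sequence


def step_rewards(predicted: Sequence[int], target: int) -> List[float]:
    """Per-step rewards for one anticipated sequence (hypothetical scheme).

    A correct prediction at anticipation step t (0-based) out of n steps
    earns (n - t) / n, so earlier correct predictions earn more.
    Incorrect predictions earn nothing.
    """
    n = len(predicted)
    return [
        (n - t) / n if label == target else 0.0
        for t, label in enumerate(predicted)
    ]


def sequence_reward(predicted: Sequence[int], target: int) -> float:
    """Sequence-level reward: the sum of the per-step rewards."""
    return sum(step_rewards(predicted, target))


if __name__ == "__main__":
    target_action = 2  # assumed integer id of the ground-truth action
    early = [2, 2, 2, 2]  # correct from the first anticipated step
    late = [0, 0, 2, 2]   # correct only in the last two steps
    print(sequence_reward(early, target_action))
    print(sequence_reward(late, target_action))
```

Under this assumed scheme, a sequence that becomes correct sooner always scores at least as high as one that becomes correct later, which is the behavior the abstract attributes to the reward design.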


