Long-term Temporal Convolutions for Action Recognition

04/15/2016
by   Gul Varol, et al.
0

Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition UCF101 (92.7

READ FULL TEXT

page 2

page 11

page 16

research
10/01/2013

Combining Spatio-Temporal Appearance Descriptors and Optical Flow for Human Action Recognition in Video Data

This paper proposes combining spatio-temporal appearance (STA) descripto...
research
05/08/2018

Low-Latency Human Action Recognition with Weighted Multi-Region Convolutional Neural Network

Spatio-temporal contexts are crucial in understanding human actions in v...
research
12/02/2021

Event Neural Networks

Video data is often repetitive; for example, the content of adjacent fra...
research
03/20/2020

Fully Automated Hand Hygiene Monitoring in Operating Room using 3D Convolutional Neural Network

Hand hygiene is one of the most significant factors in preventing hospit...
research
06/17/2019

A Temporal Sequence Learning for Action Recognition and Prediction

In this work[This work was supported in part by the National Science Fou...
research
09/02/2014

Action Recognition in the Frequency Domain

In this paper, we describe a simple strategy for mitigating variability ...
research
06/01/2020

Temporal Aggregate Representations for Long Term Video Understanding

Future prediction requires reasoning from current and past observations ...

Please sign up or login with your details

Forgot password? Click here to reset