Learning Latent Sub-events in Activity Videos Using Temporal Attention Filters

05/26/2016
by   AJ Piergiovanni, et al.
0

In this paper, we newly introduce the concept of temporal attention filters, and describe how they can be used for human activity recognition from videos. Many high-level activities are often composed of multiple temporal parts (e.g., sub-events) with different duration/speed, and our objective is to make the model explicitly learn such temporal structure using multiple attention filters and benefit from them. Our temporal filters are designed to be fully differentiable, allowing end-of-end training of the temporal filters together with the underlying frame-based or segment-based convolutional neural network architectures. This paper presents an approach of learning a set of optimal static temporal attention filters to be shared across different videos, and extends this approach to dynamically adjust attention filters per testing video using recurrent long short-term memory networks (LSTMs). This allows our temporal attention filters to learn latent sub-events specific to each activity. We experimentally confirm that the proposed concept of temporal attention filters benefits the activity recognition, and we visualize the learned latent sub-events.

READ FULL TEXT

page 4

page 7

research
12/05/2017

Learning Latent Super-Events to Detect Multiple Activities in Videos

In this paper, we introduce the concept of learning latent super-events ...
research
03/16/2018

Activity Detection with Latent Sub-event Hierarchy Learning

In this paper, we introduce a new convolutional layer named the Temporal...
research
05/19/2018

On Attention Models for Human Activity Recognition

Most approaches that model time-series data in human activity recognitio...
research
04/08/2018

Learning-based Video Motion Magnification

Video motion magnification techniques allow us to see small motions prev...
research
03/12/2021

Learning spectro-temporal representations of complex sounds with parameterized neural networks

Deep Learning models have become potential candidates for auditory neuro...
research
03/03/2017

Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

We design a new approach that allows robot learning of new activities fr...
research
04/10/2017

CERN: Confidence-Energy Recurrent Network for Group Activity Recognition

This work is about recognizing human activities occurring in videos at d...

Please sign up or login with your details

Forgot password? Click here to reset