Two Stream LSTM: A Deep Fusion Framework for Human Action Recognition

04/04/2017
by   Harshala Gammulle, et al.
0

In this paper we address the problem of human action recognition from video sequences. Inspired by the exemplary results obtained via automatic feature learning and deep learning approaches in computer vision, we focus our attention towards learning salient spatial features via a convolutional neural network (CNN) and then map their temporal relationship with the aid of Long-Short-Term-Memory (LSTM) networks. Our contribution in this paper is a deep fusion framework that more effectively exploits spatial features from CNNs with temporal features from LSTM models. We also extensively evaluate their strengths and weaknesses. We find that by combining both the sets of features, the fully connected features effectively act as an attention mechanism to direct the LSTM to interesting parts of the convolutional feature sequence. The significance of our fusion method is its simplicity and effectiveness compared to other state-of-the-art methods. The evaluation results demonstrate that this hierarchical multi stream fusion method has higher performance compared to single stream mapping methods allowing it to achieve high accuracy outperforming current state-of-the-art methods in three widely used databases: UCF11, UCFSports, jHMDB.

READ FULL TEXT

page 2

page 3

page 6

page 7

page 8

research
06/05/2022

3D Convolutional with Attention for Action Recognition

Human action recognition is one of the challenging tasks in computer vis...
research
07/06/2017

Skeleton-based Action Recognition Using LSTM and CNN

Recent methods based on 3D skeleton data have achieved outstanding perfo...
research
11/16/2016

Joint Network based Attention for Action Recognition

By extracting spatial and temporal characteristics in one network, the t...
research
03/13/2022

Context-LSTM: a robust classifier for video detection on UCF101

Video detection and human action recognition may be computationally expe...
research
04/02/2018

Deep Spatiotemporal Models for Robust Proprioceptive Terrain Classification

Terrain classification is a critical component of any autonomous mobile ...
research
03/23/2017

Bidirectional-Convolutional LSTM Based Spectral-Spatial Feature Learning for Hyperspectral Image Classification

This paper proposes a novel deep learning framework named bidirectional-...

Please sign up or login with your details

Forgot password? Click here to reset