AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition

02/10/2021
by   Yue Meng, et al.
0

Temporal modelling is the key for efficient video action recognition. While understanding temporal information can improve recognition accuracy for dynamic actions, removing temporal redundancy and reusing past features can significantly save computation leading to efficient action recognition. In this paper, we introduce an adaptive temporal fusion network, called AdaFuse, that dynamically fuses channels from current and past feature maps for strong temporal modelling. Specifically, the necessary information from the historical convolution feature maps is fused with current pruned feature maps with the goal of improving both recognition accuracy and efficiency. In addition, we use a skipping operation to further reduce the computation cost of action recognition. Extensive experiments on Something V1 V2, Jester and Mini-Kinetics show that our approach can achieve about 40 with comparable accuracy to state-of-the-art methods. The project page can be found at https://mengyuest.github.io/AdaFuse/

READ FULL TEXT

page 9

page 15

research
07/31/2020

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition

Action recognition is an open and challenging problem in computer vision...
research
10/22/2020

Learning to Sort Image Sequences via Accumulated Temporal Differences

Consider a set of n images of a scene with dynamic objects captured with...
research
11/22/2020

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

A key challenge in video enhancement and action recognition is to fuse u...
research
11/18/2022

Look More but Care Less in Video Recognition

Existing action recognition methods typically sample a few frames to rep...
research
09/12/2017

Learning Gating ConvNet for Two-Stream based Methods in Action Recognition

For the two-stream style methods in action recognition, fusing the two s...
research
02/15/2021

VA-RED^2: Video Adaptive Redundancy Reduction

Performing inference on deep learning models for videos remains a challe...
research
08/29/2014

Temporal Extension of Scale Pyramid and Spatial Pyramid Matching for Action Recognition

Historically, researchers in the field have spent a great deal of effort...

Please sign up or login with your details

Forgot password? Click here to reset