Lightweight Network Architecture for Real-Time Action Recognition

05/21/2019
by Alexander Kozlov, et al.

In this work, we present Video Transformer Network (VTN), a new efficient approach to human action recognition. It applies recent advances in computer vision and natural language processing to video understanding. The proposed method allows us to build lightweight CNN models that achieve high accuracy and real-time speed using only an RGB mono camera and a general-purpose CPU. Furthermore, we explain how to improve accuracy by distilling multiple models with different modalities into a single model. We compare against state-of-the-art methods and show that our approach performs on par with most of them on well-known action recognition datasets. We benchmark the inference time of the models using a modern inference framework and argue that our approach compares favorably with other methods in terms of the speed/accuracy trade-off, running at 56 FPS on CPU. The models and the training code are available.
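The abstract does not spell out the distillation objective, so the following is only a generic sketch of how several teacher models trained on different input modalities might be distilled into one student. It assumes the teachers' temperature-softened predictions are averaged and the student is penalized by the KL divergence to that average. The function names, the temperature value, and the averaging scheme are illustrative assumptions, not the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_distillation_loss(student_logits, teacher_logits_list,
                                    temperature=2.0):
    """KL(mean teacher distribution || student distribution).

    Assumption: each teacher (e.g. one per input modality) contributes
    equally; the source does not state how teachers are combined.
    """
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    num_teachers = len(teacher_probs)
    num_classes = len(student_logits)
    # Average the softened teacher predictions class by class.
    target = [
        sum(p[i] for p in teacher_probs) / num_teachers
        for i in range(num_classes)
    ]
    student = softmax(student_logits, temperature)
    # Terms with zero target probability contribute nothing to the KL sum.
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)


if __name__ == "__main__":
    # Hypothetical logits for a 3-class problem and two teachers.
    teachers = [[2.0, 0.5, 0.1], [1.5, 1.0, 0.2]]
    student = [1.0, 1.0, 1.0]
    print(multi_teacher_distillation_loss(student, teachers))
```

In practice such a term is usually added to the ordinary classification loss on ground-truth labels, with a weighting factor that would need to be tuned.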
