Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks

09/27/2018
by Mohammad Jubran, et al.

Advanced video classification systems decode video frames to derive the texture and motion representations that spatio-temporal deep convolutional neural networks (CNNs) ingest and analyze. However, in visual Internet-of-Things applications, surveillance systems and semantic crawlers of large video repositories, video capture and CNN-based semantic analysis tend not to be co-located. Compressed video must therefore be transported over networks, which incurs significant overhead in bandwidth and energy consumption and undermines the deployment potential of such systems. In this paper, we investigate the trade-off between the encoding bitrate and the achievable accuracy of CNN-based video classification models that directly ingest AVC/H.264 and HEVC encoded videos. Instead of retaining entire compressed video bitstreams and applying complex optical flow calculations prior to CNN processing, we retain only motion vector and select texture information at significantly reduced bitrates and apply no additional processing prior to CNN ingestion. Based on three CNN architectures and two action recognition datasets, we achieve 11%-94% saving in bitrate with marginal effect on classification accuracy. A model-based selection between multiple CNNs increases these savings further, to the point where, if up to 7% loss of accuracy can be tolerated, video classification can take place with as little as 3 kbps for the transport of the required compressed video information to the system implementing the CNN models.
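The selection between multiple CNNs described above can be pictured as choosing, from a set of measured (model, bitrate, accuracy) operating points, the cheapest one whose accuracy stays within a tolerated loss of the best available accuracy. The following is only a minimal sketch of that idea, not the authors' method: the `OperatingPoint` type, the `select_operating_point` function, the model names and every number in it are hypothetical placeholders, not results from the paper.

```python
from typing import NamedTuple, Sequence


class OperatingPoint(NamedTuple):
    """One measured (model, bitrate, accuracy) triple. Hypothetical type."""
    model: str
    bitrate_kbps: float
    accuracy: float  # fraction in [0, 1]


def select_operating_point(
    points: Sequence[OperatingPoint], max_accuracy_loss: float
) -> OperatingPoint:
    """Return the lowest-bitrate point whose accuracy is within
    max_accuracy_loss of the best accuracy among all points."""
    if not points:
        raise ValueError("at least one operating point is required")
    best_accuracy = max(p.accuracy for p in points)
    admissible = [
        p for p in points if best_accuracy - p.accuracy <= max_accuracy_loss
    ]
    return min(admissible, key=lambda p: p.bitrate_kbps)


# Placeholder values for illustration only; they are NOT taken from the paper.
POINTS = [
    OperatingPoint("model_a", 200.0, 0.90),
    OperatingPoint("model_a", 50.0, 0.88),
    OperatingPoint("model_b", 10.0, 0.85),
    OperatingPoint("model_c", 4.0, 0.70),
]

if __name__ == "__main__":
    for tolerance in (0.0, 0.03, 0.07, 0.25):
        chosen = select_operating_point(POINTS, tolerance)
        print(f"tolerance={tolerance:.2f}: {chosen.model} "
              f"at {chosen.bitrate_kbps} kbps")
```

Under this framing, loosening the tolerance can only keep or lower the selected bitrate, which matches the qualitative claim that accepting some accuracy loss permits transport at much lower rates.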

research
10/14/2017

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

We investigate video classification via a two-stream convolutional neura...
research
05/08/2018

Visual Attribute-augmented Three-dimensional Convolutional Neural Network for Enhanced Human Action Recognition

Visual attributes in individual video frames, such as the presence of ch...
research
08/31/2016

Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification

The video and action classification have extremely evolved by deep neura...
research
02/15/2019

TMAV: Temporal Motionless Analysis of Video using CNN in MPSoC

Analyzing video for traffic categorization is an important pillar of Int...
research
03/15/2019

SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection

Various convolutional neural networks (CNNs) were developed recently tha...
research
09/29/2022

Speeding Up Action Recognition Using Dynamic Accumulation of Residuals in Compressed Domain

With the widespread use of installed cameras, video-based monitoring app...
research
03/11/2019

Demonstration of Vector Flow Imaging using Convolutional Neural Networks

Synthetic Aperture Vector Flow Imaging (SA-VFI) can visualize complex ca...
