LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

12/03/2019
by   Zuxuan Wu, et al.

This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource-efficient video recognition, suitable for both online and offline scenarios. Exploiting decent yet computationally efficient features derived at a coarse scale with a lightweight CNN model, LiteEval decides on the fly whether to compute more powerful features for incoming video frames at a finer scale to obtain more details. This is achieved by a coarse LSTM and a fine LSTM operating cooperatively, together with a conditional gating module that learns when to allocate more computation. Extensive experiments on two large-scale video benchmarks, FCVID and ActivityNet, demonstrate that LiteEval requires substantially less computation while offering excellent classification accuracy for both online and offline predictions.
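The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two LSTMs are replaced by simple running averages, the learned gating module by a fixed hand-written threshold rule, and every name (`coarse_to_fine_pass`, `coarse_fn`, `fine_fn`, `gate_fn`) and the per-frame cost values are assumptions chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]


def coarse_to_fine_pass(
    frames: Sequence[Frame],
    coarse_fn: Callable[[Frame], float],   # cheap feature extractor (hypothetical)
    fine_fn: Callable[[Frame], float],     # expensive feature extractor (hypothetical)
    gate_fn: Callable[[float, float], bool],  # stand-in for the learned gate
    coarse_cost: float = 1.0,              # assumed relative cost per frame
    fine_cost: float = 10.0,               # assumed relative cost per frame
) -> Tuple[List[bool], float, float, float]:
    """Process frames in order, computing fine features only when the gate fires."""
    coarse_state = 0.0  # running average standing in for the coarse LSTM state
    fine_state = 0.0    # running average standing in for the fine LSTM state
    decisions: List[bool] = []
    total_cost = 0.0

    for frame in frames:
        # Coarse features are always computed; they are cheap.
        c = coarse_fn(frame)
        total_cost += coarse_cost
        coarse_state = 0.5 * coarse_state + 0.5 * c

        # The gate sees the coarse feature and the current fine state
        # and decides whether the expensive branch is worth running.
        use_fine = gate_fn(c, fine_state)
        decisions.append(use_fine)
        if use_fine:
            f = fine_fn(frame)
            total_cost += fine_cost
            fine_state = 0.5 * fine_state + 0.5 * f

    return decisions, total_cost, coarse_state, fine_state


# Toy usage: four 2-value "frames"; the scene changes at the third frame.
demo_frames = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
demo_decisions, demo_cost, demo_coarse, demo_fine = coarse_to_fine_pass(
    demo_frames,
    coarse_fn=lambda fr: sum(fr) / len(fr),          # mean as the cheap feature
    fine_fn=lambda fr: sum(fr),                      # sum as the "richer" feature
    gate_fn=lambda c, fine: abs(c - fine) > 0.5,     # fixed rule, not learned
)
print(demo_decisions, demo_cost)
```

In this toy run the expensive branch fires only on the frame where the coarse feature departs from what the fine state already captures, which is the kind of saving the abstract describes. Because each decision depends only on past and current frames, the same loop applies to the online setting.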


research
12/29/2020

2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition

3D convolutional networks are prevalent for video recognition. While ach...
research
04/20/2021

HMS: Hierarchical Modality Selection for Efficient Video Recognition

Videos are multimodal in nature. Conventional video recognition pipeline...
research
04/27/2021

FrameExit: Conditional Early Exiting for Efficient Video Recognition

In this paper, we propose a conditional early exiting framework for effi...
research
09/26/2018

A Coarse-To-Fine Framework For Video Object Segmentation

In this study, we develop an unsupervised coarse-to-fine video analysis ...
research
02/15/2022

Enhancing Deformable Convolution based Video Frame Interpolation with Coarse-to-fine 3D CNN

This paper presents a new deformable convolution-based video frame inter...
research
12/21/2018

Cascaded Coarse-to-Fine Deep Kernel Networks for Efficient Satellite Image Change Detection

Deep networks are nowadays becoming popular in many computer vision and ...
research
09/05/2023

Hierarchical Masked 3D Diffusion Model for Video Outpainting

Video outpainting aims to adequately complete missing areas at the edges...
