Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning

04/06/2021
by   Pramod Chunduri, et al.
0

Detection and localization of actions in videos is an important problem in practice. A traffic analyst might be interested in studying the patterns in which vehicles move at a given intersection. State-of-the-art video analytics systems are unable to efficiently and effectively answer such action queries. The reasons are threefold. First, action detection and localization tasks require computationally expensive deep neural networks. Second, actions are often rare events. Third, actions are spread across a sequence of frames. It is important to take the entire sequence of frames into context for effectively answering the query. It is critical to quickly skim through the irrelevant parts of the video to answer the action query efficiently. In this paper, we present Zeus, a video analytics system tailored for answering action queries. We propose a novel technique for efficiently answering these queries using a deep reinforcement learning agent. Zeus trains an agent that learns to adaptively modify the input video segments to an action classification network. The agent alters the input segments along three dimensions – sampling rate, segment length, and resolution. Besides efficiency, Zeus is capable of answering the query at a user-specified target accuracy using a query optimizer that trains the agent based on an accuracy-aware reward function. Our evaluation of Zeus on a novel action localization dataset shows that it outperforms the state-of-the-art frame- and window-based techniques by up to 1.4x and 3x, respectively. Furthermore, unlike the frame-based technique, it satisfies the user-specified target accuracy across all the queries, at up to 2x higher accuracy, than frame-based methods.

READ FULL TEXT

page 1

page 3

page 5

research
04/15/2020

ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos

Summarizing video content is an important task in many applications. Thi...
research
02/16/2021

THIA: Accelerating Video Analytics using Early Inference and Fine-Grained Query Planning

To efficiently process visual data at scale, researchers have proposed t...
research
05/07/2023

Video-Specific Query-Key Attention Modeling for Weakly-Supervised Temporal Action Localization

Weakly-supervised temporal action localization aims to identify and loca...
research
03/07/2017

NoScope: Optimizing Neural Network Queries over Video at Scale

Recent advances in computer vision-in the form of deep neural networks-h...
research
05/20/2022

Action parsing using context features

We propose an action parsing algorithm to parse a video sequence contain...
research
06/28/2019

Localizing Unseen Activities in Video via Image Query

Action localization in untrimmed videos is an important topic in the fie...
research
04/04/2021

EKO: Adaptive Sampling of Compressed Video Data

Researchers have presented systems for efficiently analysing video data ...

Please sign up or login with your details

Forgot password? Click here to reset