Budget-Aware Activity Detection with A Recurrent Policy Network

11/30/2017
by Behrooz Mahasseni, et al.

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos. While most recent work has focused on and advanced detection accuracy, inference can take seconds to minutes per video, which is computationally prohibitive for many applications with tight runtime constraints. This motivates our proposed budget-aware framework, which learns to perform activity detection by intelligently selecting a small subset of frames according to a specified time or computational budget. We formulate this problem as a Markov decision process and adopt a recurrent network to model a policy for frame selection. To train the network, we employ a recurrent policy gradient approach to approximate the gradient of the non-decomposable and non-differentiable objective defined in our problem. In extensive experiments on two benchmark datasets, we achieve competitive detection accuracy and, more importantly, our approach substantially reduces computational time, detecting multiple activities in only 348ms for each untrimmed long video of THUMOS14 and ActivityNet.
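
To make the idea of training a frame-selection policy on a non-differentiable objective more concrete, the following is a minimal sketch of a score-function (REINFORCE-style) policy gradient under a fixed frame budget. It is not the paper's method: the policy here is a stateless softmax over frame indices rather than a recurrent network, and the reward, the toy "video", and all names (`sample_episode`, `reward`, `train`, `activity_frames`) are assumptions made up for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sample_episode(theta, budget, rng):
    """Sample `budget` frame indices from a softmax policy over frames.

    Returns the chosen indices and the gradient (w.r.t. theta) of the
    summed log-probabilities of those choices.
    """
    probs = softmax(theta)
    grad_logp = np.zeros_like(theta)
    chosen = []
    for _ in range(budget):
        a = int(rng.choice(len(theta), p=probs))
        chosen.append(a)
        # d/dtheta log softmax(theta)[a] = onehot(a) - probs
        grad_logp -= probs
        grad_logp[a] += 1.0
    return chosen, grad_logp

def reward(chosen, activity_frames):
    """Hypothetical non-differentiable reward: fraction of activity frames observed."""
    return len(set(chosen) & activity_frames) / len(activity_frames)

def train(num_frames=50, budget=5, steps=2000, lr=0.5, seed=0):
    """Train the toy policy; the activity location is an assumed toy ground truth."""
    rng = np.random.default_rng(seed)
    activity_frames = set(range(20, 25))
    theta = np.zeros(num_frames)   # uniform policy at the start
    baseline = 0.0                 # running-average reward, for variance reduction
    for _ in range(steps):
        chosen, grad_logp = sample_episode(theta, budget, rng)
        r = reward(chosen, activity_frames)
        # Score-function estimate of the gradient of the expected reward.
        theta += lr * (r - baseline) * grad_logp
        baseline = 0.9 * baseline + 0.1 * r
    return theta, activity_frames

if __name__ == "__main__":
    theta, activity_frames = train()
    probs = softmax(theta)
    mass = sum(probs[i] for i in activity_frames)
    print(f"policy mass on activity frames: {mass:.2f}")
```

The point of the sketch is only that the reward is never differentiated: the update uses the reward as a scalar weight on the gradient of the log-probability of the sampled frame choices, which is what allows a non-decomposable, non-differentiable objective to be optimized under a budget on the number of frames observed.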

research
07/26/2016

Approximate Policy Iteration for Budgeted Semantic Video Segmentation

This paper formulates and presents a solution to the new problem of budg...
research
11/22/2015

End-to-end Learning of Action Detection from Frame Glimpses in Videos

In this work we introduce a fully end-to-end approach for action detecti...
research
03/12/2020

ZSTAD: Zero-Shot Temporal Activity Detection

An integral part of video analysis and surveillance is temporal activity...
research
01/12/2022

OCSampler: Compressing Videos to One Clip with Single-step Sampling

In this paper, we propose a framework named OCSampler to explore a compa...
research
11/29/2018

AdaFrame: Adaptive Frame Selection for Fast Video Recognition

We present AdaFrame, a framework that adaptively selects relevant frames...
research
03/31/2023

Video text tracking for dense and small text based on pp-yoloe-r and sort algorithm

Although end-to-end video text spotting methods based on Transformer can...
research
11/26/2018

A Policy Gradient Method with Variance Reduction for Uplift Modeling

Uplift modeling aims to directly model the incremental impact of a treat...
