KORSAL: Key-point Detection based Online Real-Time Spatio-Temporal Action Localization

11/05/2021
by Kalana Abeywardena, et al.

Real-time and online action localization in a video is a critical yet highly challenging problem. Accurate action localization requires the utilization of both temporal and spatial information. Recent attempts achieve this by using computationally intensive 3D CNN architectures or highly redundant two-stream architectures with optical flow, making them both unsuitable for real-time, online applications. To accomplish activity localization under highly challenging real-time constraints, we propose utilizing fast and efficient key-point based bounding box prediction to spatially localize actions. We then introduce a tube-linking algorithm that maintains the continuity of action tubes temporally in the presence of occlusions. Further, we eliminate the need for a two-stream architecture by combining temporal and spatial information into a cascaded input to a single network, allowing the network to learn from both types of information. Temporal information is efficiently extracted using a structural similarity index map as opposed to computationally intensive optical flow. Despite the simplicity of our approach, our lightweight end-to-end architecture achieves a state-of-the-art frame-mAP of 74.7% on the challenging UCF101-24 dataset, demonstrating a performance gain of 6.4% over the previous best online methods. We also achieve state-of-the-art video-mAP results compared to both online and offline methods. Moreover, our model achieves a frame rate of 41.8 FPS, which is a 10.7% improvement over contemporary real-time methods.
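The abstract does not spell out the tube-linking algorithm, so the following is only a minimal sketch of the general idea: a generic greedy IoU-based online linker, not the authors' method. The names `iou`, `Tube`, and `link_tubes`, as well as the parameters `iou_thresh` and `max_missed`, are hypothetical and chosen for illustration. A tube that finds no matching detection is kept alive for a few frames, which is one simple way to preserve continuity through a short occlusion.

```python
from dataclasses import dataclass, field


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tube:
    """Hypothetical container: one action tube built frame by frame."""
    boxes: dict = field(default_factory=dict)  # frame index -> box
    last_box: tuple = None                     # most recent matched box
    missed: int = 0                            # consecutive frames without a match


def link_tubes(frames, iou_thresh=0.3, max_missed=2):
    """Greedy online linking (assumed scheme, not the paper's algorithm).

    frames: list where frames[t] is the list of boxes detected in frame t.
    A tube survives up to max_missed consecutive frames without a match.
    """
    active, finished = [], []
    for t, detections in enumerate(frames):
        unmatched = list(detections)
        for tube in active:
            # Pick the unmatched detection that overlaps the tube's last box most.
            best = max(unmatched, key=lambda d: iou(tube.last_box, d), default=None)
            if best is not None and iou(tube.last_box, best) >= iou_thresh:
                tube.boxes[t] = best
                tube.last_box = best
                tube.missed = 0
                unmatched.remove(best)
            else:
                tube.missed += 1  # possibly occluded in this frame
        # Retire tubes that have been unmatched for too long.
        finished += [tb for tb in active if tb.missed > max_missed]
        active = [tb for tb in active if tb.missed <= max_missed]
        # Every detection left over starts a new tube.
        for d in unmatched:
            active.append(Tube(boxes={t: d}, last_box=d))
    return finished + active


# Frame 2 has no detection (a simulated occlusion); the first tube resumes at frame 3.
example_frames = [
    [(0, 0, 10, 10)],
    [(1, 0, 11, 10)],
    [],
    [(2, 0, 12, 10)],
    [(50, 50, 60, 60)],
]
example_tubes = link_tubes(example_frames)
```

Because each frame is processed using only past state, a linker of this shape is compatible with the online setting the abstract targets; the thresholds here are arbitrary illustration values.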


research
11/15/2019

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

Spatiotemporal action localization requires incorporation of two sources...
research
06/28/2018

Modeling Spatio-Temporal Human Track Structure for Action Localization

This paper addresses spatio-temporal localization of human actions in vi...
research
04/03/2020

Two-Stream AMTnet for Action Detection

In this paper, we propose Two-Stream AMTnet, which leverages recent adva...
research
10/28/2016

Real-time Online Action Detection Forests using Spatio-temporal Contexts

Online action detection (OAD) is challenging since 1) robust yet computa...
research
06/07/2022

TadML: A fast temporal action detection with Mechanics-MLP

Temporal Action Detection(TAD) is a crucial but challenging task in vide...
research
02/19/2020

Feasibility of Video-based Sub-meter Localization on Resource-constrained Platforms

While the satellite-based Global Positioning System (GPS) is adequate fo...
research
06/16/2020

End-to-End Real-time Catheter Segmentation with Optical Flow-Guided Warping during Endovascular Intervention

Accurate real-time catheter segmentation is an important pre-requisite f...
