YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection

02/14/2023
by   Jianhua Yang, et al.

Designing a real-time framework for spatio-temporal action detection remains a challenge. In this paper, we propose YOWOv2, a novel real-time action detection framework. YOWOv2 takes advantage of both a 3D backbone and a 2D backbone for accurate action detection. A multi-level detection pipeline is designed to detect action instances of different scales. To this end, we carefully build a simple and efficient 2D backbone with a feature pyramid network to extract classification features and regression features at different levels. For the 3D backbone, we adopt an existing efficient 3D CNN to save development time. By combining 3D backbones and 2D backbones of different sizes, we design a YOWOv2 family comprising YOWOv2-Tiny, YOWOv2-Medium, and YOWOv2-Large. We also introduce the popular dynamic label assignment strategy and anchor-free mechanism to make YOWOv2 consistent with advanced model architecture design. With these improvements, YOWOv2 is significantly superior to YOWO while still keeping real-time detection. Without any bells and whistles, YOWOv2 achieves 87.0% frame mAP and 52.8% video mAP with over 20 FPS on UCF101-24. On AVA, YOWOv2 achieves 21.7% frame mAP with over 20 FPS. Our code is available at https://github.com/yjh0410/YOWOv2.
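
To make the idea of anchor-free, multi-level detection concrete, the following is a minimal sketch of how per-cell predictions on several feature-pyramid levels could be decoded into boxes. It is not taken from the YOWOv2 code. The function names, the strides (8, 16, 32), and the (dx, dy, log_w, log_h) parameterization are assumptions chosen for illustration, following a common YOLO-style anchor-free convention.

```python
import math


def make_grid(height, width):
    """Return (x, y) cell coordinates of an H x W feature map in row-major order."""
    return [(x, y) for y in range(height) for x in range(width)]


def decode_level(preds, height, width, stride):
    """Decode anchor-free predictions for one pyramid level (hypothetical helper).

    preds: list of (dx, dy, log_w, log_h) tuples, one per cell, row-major.
    Returns boxes as (x1, y1, x2, y2) in input-image pixels.
    """
    boxes = []
    for (gx, gy), (dx, dy, log_w, log_h) in zip(make_grid(height, width), preds):
        # The box center is the cell index plus a predicted offset, scaled by the stride.
        cx = (gx + dx) * stride
        cy = (gy + dy) * stride
        # Width and height are predicted in log space, relative to the stride.
        w = math.exp(log_w) * stride
        h = math.exp(log_h) * stride
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def decode_pyramid(level_preds, image_size, strides=(8, 16, 32)):
    """Decode every level and concatenate the boxes (assumed square input)."""
    boxes = []
    for preds, stride in zip(level_preds, strides):
        size = image_size // stride
        boxes.extend(decode_level(preds, size, size, stride))
    return boxes


# Usage: a 64 x 64 input gives 8 x 8, 4 x 4, and 2 x 2 grids.
image_size = 64
level_preds = [
    [(0.5, 0.5, 0.0, 0.0)] * ((image_size // s) ** 2) for s in (8, 16, 32)
]
boxes = decode_pyramid(level_preds, image_size)
```

Because no anchor boxes are involved, each cell yields exactly one candidate box, and coarser levels (larger strides) naturally cover larger instances. In this sketch, that is the only mechanism by which the pyramid handles different scales.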

research
10/20/2022

YOWO-Plus: An Incremental Improvement

In this technical report, we would like to introduce our updates to YOWO...
research
08/07/2020

Multi-Level Temporal Pyramid Network for Action Detection

Currently, one-stage frameworks have been widely applied for temporal ac...
research
11/25/2016

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

We present a deep-learning framework for real-time multiple spatio-tempo...
research
06/29/2020

Multi-level colonoscopy malignant tissue detection with adversarial CAC-UNet

The automatic and objective medical diagnostic model can be valuable to ...
research
10/28/2016

Real-time Online Action Detection Forests using Spatio-temporal Contexts

Online action detection (OAD) is challenging since 1) robust yet computa...
research
05/23/2022

Real-time Collaborative Multi-Level Modeling by Conflict-Free Replicated Data Types

The need for real-time collaborative solutions in model-driven engineeri...
research
11/21/2014

Finding Action Tubes

We address the problem of action detection in videos. Driven by the late...
