Multi-Level Temporal Pyramid Network for Action Detection

08/07/2020
by Xiang Wang, et al.

One-stage frameworks are now widely used for temporal action detection, but they still struggle with action instances whose durations span a wide range. The reason is that these one-stage detectors, e.g., the Single Shot MultiBox Detector (SSD), extract the temporal features for each head from only a single-level layer, and such features are not discriminative enough for classification and regression. In this paper, we propose a Multi-Level Temporal Pyramid Network (MLTPN) to make the features more discriminative. Specifically, we first fuse the features from multiple layers with different temporal resolutions to encode multi-layer temporal information. We then apply a multi-level feature pyramid architecture to the fused features to enhance their discriminative ability. Finally, we design a simple yet effective feature fusion module to fuse the multi-level multi-scale features. In this way, the proposed MLTPN can learn rich and discriminative features for action instances of different durations. We evaluate MLTPN on two challenging datasets, THUMOS'14 and ActivityNet v1.3. The experimental results show that MLTPN obtains competitive performance on ActivityNet v1.3 and significantly outperforms state-of-the-art approaches on THUMOS'14.
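
The three steps named in the abstract (multi-layer fusion, a multi-level pyramid, and multi-level multi-scale fusion) can be traced at the level of feature shapes with a minimal sketch. The abstract does not specify the operators, so everything below is an assumption made for illustration: learned convolutions are replaced by fixed average pooling, nearest-neighbour upsampling, addition, and channel concatenation, and all function names are hypothetical. It is not the authors' implementation.

```python
from typing import List

Feature = List[List[float]]  # indexed [time][channel]


def avg_pool_time(x: Feature, stride: int = 2) -> Feature:
    """Downsample along time by averaging non-overlapping windows."""
    out = []
    for start in range(0, len(x) - stride + 1, stride):
        window = x[start:start + stride]
        out.append([sum(col) / stride for col in zip(*window)])
    return out


def upsample_time(x: Feature, target_len: int) -> Feature:
    """Nearest-neighbour upsampling along time to target_len steps."""
    src_len = len(x)
    return [list(x[min(t * src_len // target_len, src_len - 1)])
            for t in range(target_len)]


def concat_channels(features: List[Feature]) -> Feature:
    """Concatenate equal-length features along the channel axis."""
    length = len(features[0])
    assert all(len(f) == length for f in features)
    return [[v for f in features for v in f[t]] for t in range(length)]


def add_features(a: Feature, b: Feature) -> Feature:
    """Element-wise sum of two features with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_multi_layer(layers: List[Feature], target_len: int) -> Feature:
    """Step 1 (assumed form): align temporal resolutions, then concatenate."""
    return concat_channels([upsample_time(f, target_len) for f in layers])


def build_pyramid(base: Feature, num_scales: int) -> List[Feature]:
    """One pyramid level: repeatedly halve the temporal resolution."""
    scales = [base]
    for _ in range(num_scales - 1):
        scales.append(avg_pool_time(scales[-1]))
    return scales


def multi_level_pyramids(base: Feature, num_levels: int,
                         num_scales: int) -> List[List[Feature]]:
    """Step 2 (assumed wiring): each level sees the base feature plus the
    coarsest context of the previous level, upsampled back to full length."""
    levels = []
    current = base
    for _ in range(num_levels):
        pyramid = build_pyramid(current, num_scales)
        levels.append(pyramid)
        context = upsample_time(pyramid[-1], len(base))
        current = add_features(base, context)
    return levels


def fuse_levels(levels: List[List[Feature]]) -> List[Feature]:
    """Step 3 (assumed form): per scale, concatenate features of all levels."""
    num_scales = len(levels[0])
    return [concat_channels([lvl[s] for lvl in levels])
            for s in range(num_scales)]


# Toy input: two backbone layers with different temporal resolutions.
shallow = [[float(t)] for t in range(8)]  # 8 time steps, 1 channel
deep = [[10.0 * t] for t in range(4)]     # 4 time steps, 1 channel

base = fuse_multi_layer([shallow, deep], target_len=8)
levels = multi_level_pyramids(base, num_levels=2, num_scales=3)
outputs = fuse_levels(levels)
shapes = [(len(f), len(f[0])) for f in outputs]
print(shapes)  # [(8, 4), (4, 4), (2, 4)]
```

Each entry of `outputs` would feed one detection head: the long, fine-grained feature suits short actions and the short, coarse feature suits long ones, and every head now sees channels from all pyramid levels rather than from a single layer.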

