Dance with Flow: Two-in-One Stream Action Detection

04/01/2019
by Jiaojiao Zhao, et al.

The goal of this paper is to detect the spatio-temporal extent of an action. Two-stream detection networks based on RGB and optical flow provide state-of-the-art accuracy, but at the expense of a large model size and heavy computation. We propose to embed RGB and optical flow into a single two-in-one stream network with two new layers. A motion condition layer extracts motion information from flow images, and a motion modulation layer uses that information to generate transformation parameters for modulating the low-level RGB features. The method is easily embedded in existing appearance-only or two-stream action detection networks, and is trained end-to-end. Experiments demonstrate that leveraging the motion condition to modulate RGB features improves detection accuracy. With only half the computation and parameters of state-of-the-art two-stream methods, our two-in-one stream still achieves impressive results on UCF101-24, UCFSports and J-HMDB.
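The modulation step described above can be illustrated with a small sketch. The abstract does not spell out the form of the transformation, so this example assumes a per-position affine transform (a scale `gamma` and a shift `beta`), each produced from the motion condition features by a 1x1 convolution. The names `pointwise_linear` and `modulate`, the weight layout, and the affine form itself are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Vector = List[float]              # channel values at one spatial position
FeatureMap = List[List[Vector]]   # layout: height x width x channels


def pointwise_linear(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """A 1x1 convolution at a single position: out[o] = sum_i weight[o][i] * x[i] + bias[o]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def modulate(
    rgb: FeatureMap,
    cond: FeatureMap,
    w_gamma: List[Vector],
    b_gamma: Vector,
    w_beta: List[Vector],
    b_beta: Vector,
) -> FeatureMap:
    """Modulate low-level RGB features with parameters generated from motion features.

    Assumed form (hypothetical): out = gamma * rgb + beta, element-wise, where
    gamma and beta are computed from the motion condition at the same position.
    """
    out: FeatureMap = []
    for rgb_row, cond_row in zip(rgb, cond):
        out_row = []
        for f, m in zip(rgb_row, cond_row):
            gamma = pointwise_linear(m, w_gamma, b_gamma)  # per-channel scale
            beta = pointwise_linear(m, w_beta, b_beta)     # per-channel shift
            out_row.append([g * x + b for g, x, b in zip(gamma, f, beta)])
        out.append(out_row)
    return out


# Usage: a 1x1 feature map with 2 RGB channels and 1 motion-condition channel.
rgb_features = [[[1.0, 2.0]]]
motion_condition = [[[3.0]]]
modulated = modulate(
    rgb_features,
    motion_condition,
    w_gamma=[[1.0], [0.0]], b_gamma=[0.0, 1.0],
    w_beta=[[0.0], [1.0]], b_beta=[0.0, 0.0],
)
```

Under this assumption, zero weights with `b_gamma` set to ones and `b_beta` set to zeros reduce the layer to an identity, so the RGB features pass through unchanged when the motion branch contributes nothing.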

research
07/09/2021

RGB Stream Is Enough for Temporal Action Detection

State-of-the-art temporal action detectors to date are based on two-stre...
research
07/26/2018

Motion Feature Network: Fixed Motion Filter for Action Recognition

Spatio-temporal representations in frame sequences play an important rol...
research
06/05/2019

Two-Stream Region Convolutional 3D Network for Temporal Activity Detection

We address the problem of temporal activity detection in continuous, unt...
research
12/16/2021

Two Stream Network for Stroke Detection in Table Tennis

This paper presents a table tennis stroke detection method from videos. ...
research
12/30/2020

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

Light field data exhibit favorable characteristics conducive to saliency...
research
04/03/2020

Two-Stream AMTnet for Action Detection

In this paper, we propose Two-Stream AMTnet, which leverages recent adva...
research
10/20/2021

GTM: Gray Temporal Model for Video Recognition

Data input modality plays an important role in video action recognition....
