Weakly-Supervised Multi-Person Action Recognition in 360° Videos

02/09/2020
by   Junnan Li, et al.

The recent development of commodity 360° cameras has enabled a single video to capture an entire scene, which holds great promise for surveillance scenarios. However, research in omnidirectional video analysis has lagged behind the hardware advances. In this work, we address the important problem of action recognition in top-view 360° videos. Due to the wide field-of-view, 360° videos usually capture multiple people performing actions at the same time. Furthermore, the appearance of people is deformed. The proposed framework first transforms omnidirectional videos into panoramic videos, then extracts spatial-temporal features using region-based 3D CNNs for action recognition. We propose a weakly-supervised method based on multi-instance multi-label learning, which trains the model to recognize and localize multiple actions in a video using only video-level action labels as supervision. We perform experiments to quantitatively validate the efficacy of the proposed method and qualitatively demonstrate action localization results. To enable research in this direction, we introduce 360Action, the first omnidirectional video dataset for multi-person action recognition.
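
To make the weak-supervision idea concrete, the following is a minimal sketch of a generic multi-instance multi-label objective. It is not the authors' implementation. It assumes each video yields a list of per-region class logits (here plain Python lists standing in for region-based 3D CNN outputs), aggregates them into video-level probabilities with a max over regions, and scores them against video-level labels with binary cross-entropy. The max aggregation and the names `video_level_probs`, `miml_loss`, and `localize` are illustrative assumptions; the abstract does not specify them.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def video_level_probs(region_logits):
    """Aggregate per-region logits into one probability per action class.

    Assumption: max over regions, i.e. an action is present in the video
    if at least one region (person) shows it.
    """
    num_classes = len(region_logits[0])
    return [
        max(sigmoid(region[c]) for region in region_logits)
        for c in range(num_classes)
    ]


def miml_loss(region_logits, video_labels, eps=1e-7):
    """Mean binary cross-entropy between aggregated probabilities and
    video-level labels (the only supervision that is used)."""
    probs = video_level_probs(region_logits)
    total = 0.0
    for p, y in zip(probs, video_labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(video_labels)


def localize(region_logits, video_labels):
    """For each action labeled present, return the index of the region
    with the highest score for that action."""
    return {
        c: max(range(len(region_logits)), key=lambda r: region_logits[r][c])
        for c, y in enumerate(video_labels)
        if y == 1
    }


# Hypothetical example: two person regions, three action classes.
example_logits = [[4.0, -3.0, -2.0], [-3.0, 5.0, -4.0]]
example_labels = [1, 1, 0]
example_loss = miml_loss(example_logits, example_labels)
example_regions = localize(example_logits, example_labels)
```

Because the loss only sees the aggregated scores, no per-person annotation is needed during training, while the region that attains the maximum for a class gives a rough localization of that action at test time.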


research
07/21/2020

Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos

Despite the recent advances in video classification, progress in spatio-...
research
07/27/2017

Learning from Video and Text via Large-Scale Discriminative Clustering

Discriminative clustering has been successfully applied to a number of w...
research
10/10/2017

Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN

When we say a person is texting, can you tell the person is walking or s...
research
05/11/2022

Video-ReTime: Learning Temporally Varying Speediness for Time Remapping

We propose a method for generating a temporally remapped video that matc...
research
09/18/2019

Multiple Human Tracking using Multi-Cues including Primitive Action Features

In this paper, we propose a Multiple Human Tracking method using multi-c...
research
01/21/2021

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

Since collecting and annotating data for spatio-temporal action detectio...
research
11/27/2019

AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

The point process is a solid framework to model sequential data, such as...
