Learning Group Activities from Skeletons without Individual Action Labels

05/14/2021
by Fabio Zappardino, et al.

To understand human behavior, we must not only recognize individual actions but also model possibly complex group activities and interactions. Hierarchical models obtain the best results in group activity recognition but require fine-grained individual action annotations at the actor level. In this paper we show that, from skeletal data alone, we can train a state-of-the-art end-to-end system using only group activity labels at the sequence level. Our experiments show that models trained without individual action supervision perform poorly. On the other hand, we show that pseudo-labels can be computed from any pre-trained feature extractor with comparable final performance. Finally, our carefully designed, lean, pose-only architecture shows highly competitive results versus more complex multimodal approaches, even in the self-supervised variant.
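The abstract does not say how the pseudo-labels are derived from the pre-trained features. As a rough illustration only, one common way to turn unlabeled feature vectors into pseudo-labels is to cluster them and use the cluster index as a stand-in for the individual action label. The sketch below assumes k-means clustering over per-actor feature vectors; the function name `kmeans_pseudo_labels`, the number of clusters, and the use of k-means itself are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def kmeans_pseudo_labels(features, num_clusters, num_iters=20, seed=0):
    """Hypothetical helper: cluster feature vectors and return cluster ids.

    features: array of shape (num_actors, feature_dim), e.g. the output of
    some pre-trained feature extractor applied to each actor.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)

    # Initialise centroids from distinct, randomly chosen feature vectors.
    init_idx = rng.choice(len(features), size=num_clusters, replace=False)
    centroids = features[init_idx].copy()

    def assign(points, centers):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)

    for _ in range(num_iters):
        labels = assign(features, centroids)
        for k in range(num_clusters):
            members = features[labels == k]
            # Leave a centroid unchanged if no point was assigned to it.
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)

    # Final assignment against the updated centroids: the pseudo-labels.
    return assign(features, centroids)
```

The returned integers could then replace the missing individual action annotations when training a hierarchical model. Whether cluster indices of this kind are good enough depends on the feature extractor and the number of clusters chosen.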


