Detecting events and key actors in multi-person videos

11/09/2015
by Vignesh Ramanathan, et al.

Multi-person event recognition is a challenging task: many people are often active in the scene, but only a small subset contributes to the actual event. In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event. Our model uses no explicit annotations of who or where those people are, during either training or testing. In particular, we track people in videos and use a recurrent neural network (RNN) to represent the track features. We learn time-varying attention weights to combine these features at each time instant. The attended features are then processed using another RNN for event detection and classification. Since most video datasets with multiple people are restricted to a small number of videos, we also collected a new basketball dataset comprising 257 basketball games with 14K event annotations corresponding to 11 event classes. Our model outperforms state-of-the-art methods for both event classification and detection on this new dataset. Additionally, we show that the attention mechanism is able to consistently localize the relevant players.
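The attention step described in the abstract can be illustrated with a minimal sketch for a single time instant. The abstract does not say how the attention scores are computed, so the linear scoring vector, the softmax normalization, and the names `softmax`, `attend`, `track_feats`, and `score_weights` are all assumptions made for illustration, not the authors' implementation. In the full model the per-player features would come from the track RNN, and this step would be repeated at every time instant before the result is passed to the event RNN.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(track_feats: Sequence[Sequence[float]],
           score_weights: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Combine per-player features at one time instant.

    track_feats: one feature vector per tracked player (hypothetical values).
    score_weights: hypothetical linear scoring vector standing in for the
    learned attention model (an assumption, not taken from the paper).
    Returns the attention weights and the attended (weighted-sum) feature.
    """
    # Score each player with a dot product, then normalize across players.
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in track_feats]
    weights = softmax(scores)
    # The weighted sum of player features is what the event RNN would receive.
    dim = len(track_feats[0])
    attended = [sum(a * f[d] for a, f in zip(weights, track_feats))
                for d in range(dim)]
    return weights, attended


# Three tracked players with 2-D features; the third scores highest.
feats = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
weights, attended = attend(feats, [1.0, 0.0])
```

Here the player with the largest weight is the one the model would treat as the key actor at that instant. Because the weights are recomputed at each time step, the attended player can change over the course of an event.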


