Action Class Relation Detection and Classification Across Multiple Video Datasets

08/15/2023
by Yuya Yoshikawa, et al.

The Meta Video Dataset (MetaVD) provides annotated relations between action classes in major datasets for human action recognition in videos. Although these annotated relations enable dataset augmentation, the augmentation is applicable only to the datasets covered by MetaVD. For an external dataset to enjoy the same benefit, the relations between its action classes and those in MetaVD need to be determined. To address this issue, we consider two new machine learning tasks: action class relation detection and classification. We propose a unified model to predict relations between action classes, using language and visual information associated with classes. Experimental results show that (i) recent pre-trained neural network models for texts and videos contribute to high predictive performance, (ii) relation prediction based on action label texts is more accurate than prediction based on videos, and (iii) a blending approach that combines the predictions of both modalities can further improve the predictive performance in some cases.
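
As a rough illustration of finding (iii), the sketch below blends per-pair relation probabilities from a text-based predictor and a video-based predictor using a convex combination. The abstract does not specify how the blending is done, so the weighting scheme, the function name `blend_predictions`, the weight `alpha`, the 0.5 decision threshold, and the example probabilities are all assumptions made for illustration, not the authors' method.

```python
def blend_predictions(p_text, p_video, alpha=0.5):
    """Blend two lists of relation probabilities by a convex combination.

    p_text  -- probabilities from a (hypothetical) text-based predictor
    p_video -- probabilities from a (hypothetical) video-based predictor
    alpha   -- weight on the text-based predictor, in [0, 1] (assumed scheme)
    """
    if len(p_text) != len(p_video):
        raise ValueError("p_text and p_video must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * t + (1.0 - alpha) * v for t, v in zip(p_text, p_video)]


# Made-up probabilities that a relation exists for three action class pairs.
p_text = [0.9, 0.2, 0.6]
p_video = [0.7, 0.4, 0.1]

# Weight the text modality more heavily, in the spirit of finding (ii).
blended = blend_predictions(p_text, p_video, alpha=0.75)

# Relation detection as thresholding the blended probability (assumed threshold).
detected = [p >= 0.5 for p in blended]
```

In a setup like this, `alpha` would normally be chosen on held-out data. Setting `alpha=1.0` falls back to the text-only predictor, which is consistent with blending helping only in some cases.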

research
04/19/2022

On the Performance Evaluation of Action Recognition Models on Transcoded Low Quality Videos

In the design of action recognition models, the quality of videos in the...
research
07/15/2019

A Short Note on the Kinetics-700 Human Action Dataset

We describe an extension of the DeepMind Kinetics human action dataset f...
research
05/22/2017

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

The paucity of videos in current action classification datasets (UCF-101...
research
06/09/2022

The Missing Link: Finding label relations across datasets

Computer Vision is driven by the many datasets which can be used for tra...
research
12/15/2020

Towards Improving Spatiotemporal Action Recognition in Videos

Spatiotemporal action recognition deals with locating and classifying ac...
research
05/26/2023

CVB: A Video Dataset of Cattle Visual Behaviors

Existing image/video datasets for cattle behavior recognition are mostly...
research
04/02/2023

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

Action understanding matters and attracts attention. It can be formed as...
