VideoMix: Rethinking Data Augmentation for Video Classification

12/07/2020
by Sangdoo Yun, et al.

State-of-the-art video action classifiers often suffer from overfitting. They tend to be biased towards specific object and scene cues rather than the foreground action content, which leads to sub-optimal generalization performance. Recent data augmentation strategies have been reported to address overfitting in static image classifiers. Despite this effectiveness for static images, data augmentation has rarely been studied for videos. For the first time in the field, we systematically analyze the efficacy of various data augmentation strategies on the video classification task. We then propose a powerful augmentation strategy, VideoMix. VideoMix creates a new training video by inserting a video cuboid into another video. The ground truth labels are mixed proportionally to the number of voxels from each video. We show that VideoMix lets a model learn beyond the object and scene biases and extract more robust cues for action recognition. VideoMix consistently outperforms other augmentation baselines on the Kinetics and the challenging Something-Something-V2 benchmarks. It also improves weakly-supervised action localization performance on THUMOS'14. VideoMix-pretrained models exhibit improved accuracy on the video detection task (AVA).
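
The cuboid insertion and voxel-proportional label mixing described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sample_cuboid` and `videomix`, the `(T, H, W, C)` array layout, and the uniform sampling of the cuboid's extent along each axis are all hypothetical choices made here, since the abstract does not specify how the cuboid is sampled.

```python
import numpy as np


def sample_cuboid(shape, rng):
    """Sample (start, stop) bounds along each axis of `shape`.

    Assumption: extents are drawn uniformly; the abstract does not
    say how the cuboid is chosen.
    """
    bounds = []
    for size in shape:
        length = int(rng.integers(1, size + 1))          # 1..size inclusive
        start = int(rng.integers(0, size - length + 1))  # cuboid stays in range
        bounds.append((start, start + length))
    return bounds


def videomix(video_a, label_a, video_b, label_b, rng):
    """Paste a cuboid of video_b into video_a and mix the labels.

    Videos are float arrays of shape (T, H, W, C); labels are one-hot
    (or soft) vectors of equal length.
    """
    if video_a.shape != video_b.shape:
        raise ValueError("both videos must have the same shape")

    t, h, w = video_a.shape[:3]
    (t0, t1), (y0, y1), (x0, x1) = sample_cuboid((t, h, w), rng)

    mixed = video_a.copy()
    mixed[t0:t1, y0:y1, x0:x1] = video_b[t0:t1, y0:y1, x0:x1]

    # Label weights follow the fraction of voxels taken from each video.
    frac_b = (t1 - t0) * (y1 - y0) * (x1 - x0) / (t * h * w)
    mixed_label = (1.0 - frac_b) * label_a + frac_b * label_b
    return mixed, mixed_label


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = np.zeros((8, 16, 16, 3))          # placeholder clip for class 0
    b = np.ones((8, 16, 16, 3))           # placeholder clip for class 1
    ya, yb = np.eye(2)[0], np.eye(2)[1]   # one-hot labels
    clip, label = videomix(a, ya, b, yb, rng)
    print(clip.shape, label)
```

With an all-zeros clip and an all-ones clip, the mean of the mixed clip equals the weight assigned to the second label, which gives a quick sanity check that the label mix tracks the voxel count.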


