Large-Scale Video Classification with Feature Space Augmentation coupled with Learned Label Relations and Ensembling

09/21/2018
by Choongyeun Cho, et al.

This paper presents Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, which achieved a final global average precision (GAP) of 88.733 without regard to the model size constraint, and 87.287 with a model that meets the size requirement. Two sets of 7 individual models belonging to 3 different families were trained separately. The inference results of these models on the training data were then aggregated and used to train a compact model that meets the model size requirement. To further improve performance, we explored and employed data over- and sub-sampling in feature space, an additional regularization term during training that exploits label relationships, and learned weights for ensembling the individual models.
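For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how global average precision is commonly computed for YouTube-8M-style evaluation. It is not the authors' code. The function name `global_average_precision`, the list-of-lists input layout, and the default `top_k=20` are assumptions made for illustration. So is the choice to normalize by the total number of ground-truth positive labels. The function returns a fraction in [0, 1], whereas the abstract reports GAP scaled by 100.

```python
from typing import List, Sequence, Tuple


def global_average_precision(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    top_k: int = 20,  # assumed cutoff; adjust to the evaluation protocol in use
) -> float:
    """Pool the top_k predictions of every video and compute average precision.

    scores[v][c] is the predicted confidence of class c for video v.
    labels[v][c] is 1 if class c is a ground-truth label of video v, else 0.
    """
    pooled: List[Tuple[float, bool]] = []
    total_positives = 0

    for video_scores, video_labels in zip(scores, labels):
        total_positives += sum(1 for y in video_labels if y)
        # Keep only the top_k highest-scoring classes for this video.
        ranked = sorted(
            range(len(video_scores)),
            key=lambda c: video_scores[c],
            reverse=True,
        )[:top_k]
        pooled.extend((video_scores[c], bool(video_labels[c])) for c in ranked)

    if total_positives == 0:
        return 0.0

    # Rank all pooled predictions by confidence, regardless of video.
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    hits = 0
    gap = 0.0
    for rank, (_, is_positive) in enumerate(pooled, start=1):
        if is_positive:
            hits += 1
            gap += hits / rank  # precision at this rank, added once per hit

    return gap / total_positives


if __name__ == "__main__":
    toy_scores = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]]
    toy_labels = [[1, 0, 0], [0, 1, 1]]
    print(global_average_precision(toy_scores, toy_labels, top_k=2))
```

Because predictions from all videos are pooled before ranking, the metric rewards confidences that are comparable across videos, not only correct orderings within each video.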
