Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset

06/24/2017
by   Seil Na, et al.

YouTube-8M is the largest video dataset for multi-label video classification. Tackling multi-label classification on this challenging dataset requires solving several issues, such as temporal modeling of videos, label imbalance, and correlations between labels. We develop a deep neural network model that consists of four components: the frame encoder, the classification layer, the label processing layer, and the loss function. We introduce our newly proposed methods and discuss how existing models operate in the YouTube-8M classification task, what insights they offer, and why they succeed (or fail) in achieving good performance. Most of our proposed models perform substantially better than the baseline models, and the ensemble of the models we used ranked 8th in the Kaggle competition.
