The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge

06/16/2017
by He-Da Wang, et al.

This article describes the final solution of team monkeytyping, which finished in second place in the YouTube-8M video understanding challenge. The dataset used in this challenge is a large-scale benchmark for multi-label video classification. We extend the work in [1] and propose several improvements for frame sequence modeling. We propose a network structure called Chaining that can better capture the interactions between labels. We also report our approaches to handling multi-scale information and attention pooling. In addition, we find that using the output of a model ensemble as a side target in training can boost single-model performance. We report our experiments in bagging, boosting, cascade, and stacking, and propose a stacking algorithm called attention weighted stacking. Our final submission is an ensemble of 74 sub-models, all of which are listed in the appendix.
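
The idea of training against an ensemble's output as a side target can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: the function names, the mixing weight `side_weight`, and the choice of a weighted sum of two binary cross-entropy terms. It should not be read as the authors' implementation.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over labels; targets may be hard (0/1) or soft."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


def side_target_loss(labels, ensemble_probs, model_probs, side_weight=0.5):
    """Hypothetical combined loss: ground-truth term plus ensemble side-target term.

    side_weight is an assumed hyperparameter, not a value taken from the paper.
    """
    hard = binary_cross_entropy(labels, model_probs)          # fit the true labels
    soft = binary_cross_entropy(ensemble_probs, model_probs)  # fit the ensemble output
    return (1.0 - side_weight) * hard + side_weight * soft


# Toy example with three labels for one video.
labels = [1.0, 0.0, 0.0]
ensemble_probs = [0.9, 0.2, 0.1]
model_probs = [0.8, 0.3, 0.1]
loss = side_target_loss(labels, ensemble_probs, model_probs)
```

Setting `side_weight=0.0` recovers ordinary training on the ground-truth labels alone, which makes the side target easy to ablate in this sketch.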

