Labelling unlabelled videos from scratch with multi-modal self-supervision

06/24/2020
by Yuki M. Asano, et al.

A large part of the current success of deep learning lies in the effectiveness of data, or more precisely, labelled data. Yet labelling a dataset with human annotation continues to carry high costs, especially for videos. In the image domain, recent methods have made it possible to generate meaningful (pseudo-)labels for unlabelled datasets without supervision, but this development is missing for the video domain, where learning feature representations is the current focus. In this work, we a) show that unsupervised labelling of a video dataset does not come for free from strong feature encoders and b) propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations, by leveraging the natural correspondence between the audio and visual modalities. An extensive analysis shows that the resulting clusters have high semantic overlap with ground-truth human labels. We further introduce the first benchmarking results on unsupervised labelling of the common video datasets Kinetics, Kinetics-Sound, VGG-Sound and AVE.
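To make "semantic overlap with ground-truth labels" concrete, the sketch below computes normalized mutual information (NMI) between a set of cluster assignments and a set of human labels. NMI is one common way to score such overlap; the abstract does not say which measures the authors report, so this is an illustration under that assumption and not their evaluation code. The function name and the toy inputs are hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(clusters, labels):
    """NMI between two labelings of the same items.

    Normalized by the arithmetic mean of the two entropies, so the
    score is 1.0 for identical partitions and 0.0 for independent ones.
    Illustrative helper only; not taken from the paper.
    """
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have the same length")
    n = len(clusters)
    if n == 0:
        raise ValueError("need at least one item")

    joint = Counter(zip(clusters, labels))   # co-occurrence counts
    c_counts = Counter(clusters)             # cluster sizes
    l_counts = Counter(labels)               # class sizes

    # Mutual information I(C; L) in nats.
    mi = sum(
        (n_cl / n) * math.log(n_cl * n / (c_counts[c] * l_counts[l]))
        for (c, l), n_cl in joint.items()
    )

    def entropy(counts):
        return -sum((k / n) * math.log(k / n) for k in counts.values())

    denom = (entropy(c_counts) + entropy(l_counts)) / 2
    # Both partitions trivial (one cluster, one class): treat as perfect match.
    return mi / denom if denom > 0 else 1.0


# Toy example: cluster ids carry no names, only the grouping matters.
pseudo_labels = [0, 0, 1, 1]
human_labels = ["drums", "drums", "guitar", "guitar"]
score = normalized_mutual_information(pseudo_labels, human_labels)
```

Because NMI compares groupings and ignores the cluster ids themselves, it can be applied to pseudo-labels that have no predefined mapping to class names.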

research
07/06/2022

Domain Adaptive Video Segmentation via Temporal Pseudo Supervision

Video semantic segmentation has achieved great progress under the superv...
research
02/15/2023

Audio-Visual Contrastive Learning with Temporal Self-Supervision

We propose a self-supervised learning approach for videos that learns re...
research
02/15/2023

Unsupervised classification to improve the quality of a bird song recording dataset

Open audio databases such as Xeno-Canto are widely used to build dataset...
research
05/05/2021

Towards Self-Supervision for Video Identification of Individual Holstein-Friesian Cattle: The Cows2021 Dataset

In this paper we publish the largest identity-annotated Holstein-Friesia...
research
09/09/2016

Image and Video Mining through Online Learning

Within the field of image and video recognition, the traditional approac...
research
07/16/2021

Pseudo-labelling Enhanced Media Bias Detection

Leveraging unlabelled data through weak or distant supervision is a comp...
research
12/22/2019

Learning Improved Representations by Transferring Incomplete Evidence Across Heterogeneous Tasks

Acquiring ground truth labels for unlabelled data can be a costly proced...
