Unsupervised Representation Learning by Sorting Sequences

08/03/2017
by Hsin-Ying Lee, et al.

We present an unsupervised representation learning approach using videos without semantic labels. We leverage temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take temporally shuffled frames (i.e., frames in non-chronological order) as inputs and train a convolutional neural network to sort the shuffled sequences. Similar to comparison-based sorting algorithms, we propose to extract features from all frame pairs and aggregate them to predict the correct order. Because sorting a shuffled image sequence requires an understanding of the statistical temporal structure of images, training with such a proxy task allows us to learn rich and generalizable visual representations. We validate the learned representation by using our method as pre-training for high-level recognition problems. The experimental results show that our method compares favorably against state-of-the-art methods on action recognition, image classification, and object detection tasks.
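The sorting proxy task can be illustrated with a minimal sketch, given here only as an aid to reading the abstract and not as the authors' implementation. It assumes that each possible ordering of the frames is treated as one class label and that pairwise features are aggregated by simple concatenation; the function names (make_sorting_example, pairwise_features, recover_order) and the use of plain lists in place of CNN features are hypothetical.

```python
import itertools
import random


def make_sorting_example(frames, rng):
    """Shuffle chronologically ordered frames; return (shuffled, label).

    Assumption: the label is the index of the applied permutation among
    all n! orderings, so sorting becomes a classification problem.
    """
    perms = list(itertools.permutations(range(len(frames))))
    label = rng.randrange(len(perms))
    order = perms[label]
    shuffled = [frames[i] for i in order]
    return shuffled, label


def pairwise_features(frame_features):
    """Build one feature per frame pair, then aggregate all pairs.

    Assumption: a pair feature is the concatenation of the two per-frame
    feature lists, and aggregation is again concatenation.
    """
    aggregated = []
    for i, j in itertools.combinations(range(len(frame_features)), 2):
        aggregated.extend(frame_features[i] + frame_features[j])
    return aggregated


def recover_order(shuffled, label):
    """Invert the permutation named by `label` to restore chronological order."""
    order = list(itertools.permutations(range(len(shuffled))))[label]
    restored = [None] * len(shuffled)
    for position, source_index in enumerate(order):
        restored[source_index] = shuffled[position]
    return restored
```

In this sketch, a network trained on the proxy task would take the output of pairwise_features as input and predict label; recover_order shows that a correct prediction is enough to put the frames back in chronological order.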


