Unsupervised Learning from Video with Deep Neural Embeddings

05/28/2019
by   Chengxu Zhuang, et al.
0

Because of the rich dynamical structure of videos and their ubiquity in everyday life, it is a natural idea that video data could serve as a powerful unsupervised learning signal for training visual representations in deep neural networks. However, instantiating this idea, especially at large scale, has remained a significant artificial intelligence challenge. Here we present the Video Instance Embedding (VIE) framework, which extends powerful recent unsupervised loss functions for learning deep nonlinear embeddings to multi-stream temporal processing architectures on large-scale video datasets. We show that VIE-trained networks substantially advance the state of the art in unsupervised learning from video datastreams, both for action recognition in the Kinetics dataset, and object recognition in the ImageNet dataset. We show that a hybrid model with both static and dynamic processing pathways is optimal for both transfer tasks, and provide analyses indicating how the pathways differ. Taken in context, our results suggest that deep neural embeddings are a promising approach to unsupervised visual learning across a wide variety of domains.

READ FULL TEXT
research
03/29/2019

Local Aggregation for Unsupervised Learning of Visual Embeddings

Unsupervised approaches to learning in neural networks are of substantia...
research
05/25/2016

Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

While great strides have been made in using deep learning algorithms to ...
research
03/17/2017

Learning Robust Visual-Semantic Embeddings

Many of the existing methods for learning joint embedding of images and ...
research
02/11/2015

Large-Scale Deep Learning on the YFCC100M Dataset

We present a work-in-progress snapshot of learning with a 15 billion par...
research
09/12/2014

Unsupervised learning of clutter-resistant visual representations from natural videos

Populations of neurons in inferotemporal cortex (IT) maintain an explici...
research
07/18/2018

Multi-Task Unsupervised Contextual Learning for Behavioral Annotation

Unsupervised learning has been an attractive method for easily deriving ...
research
11/19/2015

Unsupervised Learning of Visual Structure using Predictive Generative Networks

The ability to predict future states of the environment is a central pil...

Please sign up or login with your details

Forgot password? Click here to reset