How Well Self-Supervised Pre-Training Performs with Streaming Data?

04/25/2021
by Dapeng Hu et al.

The common self-supervised pre-training practice collects massive unlabeled data and then trains a representation model on the full dataset, a scheme dubbed joint training. However, in real-world scenarios where data arrive in a streaming fashion, the joint training scheme is usually storage-heavy and time-consuming. A more efficient alternative is to train the model continually on the streaming data, dubbed sequential training. Nevertheless, it is unclear how well sequential self-supervised pre-training performs with streaming data. In this paper, we conduct thorough experiments to investigate self-supervised pre-training with streaming data. Specifically, we evaluate the transfer performance of sequential self-supervised pre-training on four different data sequences and three different downstream tasks, and compare it with joint self-supervised pre-training. Surprisingly, we find that sequential self-supervised learning performs almost as well as joint training when the distribution shifts within the streaming data are mild. Even on data sequences with large distribution shifts, sequential self-supervised training with simple techniques, e.g., parameter regularization or data replay, still performs comparably to joint training. Based on our findings, we recommend sequential self-supervised training as a more efficient yet performance-competitive representation learning practice for real-world applications.
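The abstract contrasts joint training (one pass over all collected data) with sequential training on streaming chunks, optionally stabilized by data replay. A minimal sketch of that sequential loop with a small replay buffer is shown below; the function names (`sequential_pretrain`, `train_step`) and the reservoir-sampling buffer are illustrative assumptions, not the paper's exact method.

```python
import random

def sequential_pretrain(chunks, train_step, replay_size=32, seed=0):
    """Sequentially pre-train on a stream of data chunks, mixing in a
    small replay buffer of past samples to mitigate forgetting.

    `train_step(batch)` stands in for one self-supervised update
    (e.g. a contrastive or masked-prediction step) on a batch.
    """
    rng = random.Random(seed)
    replay = []   # buffer holding a uniform sample of past data
    seen = 0      # total samples observed so far in the stream
    for chunk in chunks:
        # train on the current chunk plus replayed old samples
        train_step(list(chunk) + replay)
        # reservoir sampling keeps the buffer a uniform sample
        # of everything seen so far, using O(replay_size) memory
        for x in chunk:
            seen += 1
            if len(replay) < replay_size:
                replay.append(x)
            elif rng.random() < replay_size / seen:
                replay[rng.randrange(replay_size)] = x
    return replay
```

The design point this illustrates is the storage argument from the abstract: joint training must keep every chunk on disk, while the sequential scheme touches each chunk once and retains only a fixed-size buffer.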


Related research

12/11/2022 · SEPT: Towards Scalable and Efficient Visual Pre-Training
Recently, the self-supervised pre-training paradigm has shown great pote...

07/01/2020 · A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
Deep neural networks are typically trained under a supervised learning f...

09/30/2022 · Match to Win: Analysing Sequences Lengths for Efficient Self-supervised Learning in Speech and Audio
Self-supervised learning (SSL) has proven vital in speech and audio-rela...

03/22/2023 · Correlational Image Modeling for Self-Supervised Visual Pre-Training
We introduce Correlational Image Modeling (CIM), a novel and surprisingl...

08/07/2023 · Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and Methods
Visual pre-training with large-scale real-world data has made great prog...

06/11/2023 · Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
Self-supervised learning (SSL) has led to great strides in speech proces...

05/16/2023 · Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors
The recently-developed infant wearable MAIJU provides a means to automat...
