Feature diversity in self-supervised learning

09/02/2022
by Pranshu Malviya, et al.

Many studies on scaling laws consider basic factors such as model size, model shape, dataset size, and compute budget. These factors are easily tunable and represent the fundamental elements of any machine learning setup. But researchers have also employed more complex factors that predict test error and generalization performance reliably. Such factors are generally specific to a domain or application; for example, Chen et al. (2021) used feature diversity primarily to promote syn-to-real transfer. Given the many scaling factors defined in previous work, it is worth investigating how these factors affect overall generalization performance in the context of self-supervised learning with CNN models. How do individual factors, such as depth, width, or the number of training epochs (with early stopping), promote generalization? For example, does the finding that higher feature diversity yields higher accuracy hold in settings more complex than syn-to-real transfer? How do these factors depend on each other? We found that the last layer is the most diversified throughout training. However, while the model's test error decreases with more training epochs, its diversity drops. We also found that diversity is directly related to model width.
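Since feature diversity is the central scaling factor discussed here, a minimal sketch of one plausible way to measure it may help. This is an assumption for illustration, not necessarily the exact metric used by Chen et al. (2021) or in this paper: here, a layer's diversity is taken as the mean pairwise cosine distance between its channels' activation patterns over a batch.

```python
# Minimal sketch (assumed metric, not the paper's exact definition):
# feature diversity of a conv layer as the mean pairwise cosine distance
# between per-channel activation patterns, computed over a batch.
import torch
import torch.nn.functional as F

def feature_diversity(activations: torch.Tensor) -> float:
    """activations: (batch, channels, height, width) output of a conv layer."""
    b, c, h, w = activations.shape
    # Represent each channel by one long vector of its responses
    # over the batch and spatial dimensions.
    feats = activations.permute(1, 0, 2, 3).reshape(c, -1)  # (channels, b*h*w)
    feats = F.normalize(feats, dim=1)                        # unit-norm rows
    sim = feats @ feats.t()                                  # pairwise cosine similarities
    # Exclude self-similarity on the diagonal, average the rest,
    # and report distance (1 - similarity) so higher means more diverse.
    off_diag = sim[~torch.eye(c, dtype=torch.bool)]
    return (1.0 - off_diag).mean().item()

# Hypothetical usage: collect activations via forward hooks on a CNN and
# compare diversity across layers, widths, or training epochs.
# diversity_per_layer = {name: feature_diversity(act) for name, act in hooked_acts.items()}
```

A metric like this can be tracked per layer during training to compare how diversity evolves with depth, width, and the number of epochs, which is the kind of comparison the abstract describes.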

Related research

11/02/2021 · Procedural Generalization by Planning with Self-Supervised World Models
One of the key promises of model-based reinforcement learning is the abi...

11/19/2020 · Robot Gaining Accurate Pouring Skills through Self-Supervised Learning and Generalization
Pouring is one of the most commonly executed tasks in humans' daily live...

04/19/2022 · Diverse Imagenet Models Transfer Better
A commonly accepted hypothesis is that models with higher accuracy on Im...

03/23/2021 · Revisiting Self-Supervised Monocular Depth Estimation
Self-supervised learning of depth map prediction and motion estimation f...

06/23/2023 · Variance-Covariance Regularization Improves Representation Learning
Transfer learning has emerged as a key approach in the machine learning ...

07/14/2022 · Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
Self-supervised learning (SSL) is seen as a very promising approach with...

12/23/2019 · The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity
Algorithm performance in supervised learning is a combination of memoriz...
