Self-Supervised Representation Learning With MUlti-Segmental Informational Coding (MUSIC)

06/13/2022
by   Chuang Niu, et al.

Self-supervised representation learning maps high-dimensional data into a meaningful embedding space, where samples of similar semantic content are close to each other. Most recent representation learning methods maximize the cosine similarity or minimize the distance between the embedding features of different views of the same sample, usually on the l2-normalized unit hypersphere. To prevent the trivial solution in which all samples have the same embedding feature, various techniques have been developed, such as contrastive learning, stop-gradient, and variance and covariance regularization. In this study, we propose MUlti-Segmental Informational Coding (MUSIC) for self-supervised representation learning. MUSIC divides the embedding feature into multiple segments, each of which discriminatively partitions samples into different semantic clusters, with different segments focusing on different partition principles. Information-theoretic measures are used directly to optimize MUSIC and theoretically guarantee that trivial solutions are avoided. MUSIC does not depend on commonly used techniques such as memory banks, large batches, asymmetric networks, gradient stopping, or momentum weight updating, which makes the training framework flexible. Our experiments demonstrate that MUSIC achieves better results than the closely related Barlow Twins and VICReg methods on ImageNet classification with linear probing, while requiring neither a deep projector nor a large feature dimension. Code will be made available.
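To make the idea concrete, here is a minimal, hypothetical sketch of a MUSIC-style objective, not the authors' released code. It assumes a PyTorch setting in which the projector output is split into a chosen number of segments, each segment is softmax-normalized into a soft cluster assignment, and an information-theoretic loss encourages cross-view agreement within each segment while keeping the batch-level marginal cluster distribution high-entropy, which rules out the collapsed solution where every sample gets the same code. The function name, the segment/cluster sizes, and the exact form of the loss are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def music_loss_sketch(z1, z2, num_segments=16, eps=1e-8):
    """Hypothetical sketch of a MUSIC-style objective (not the official code).

    z1, z2: projector outputs for two augmented views, shape (batch, dim).
    The embedding is split into `num_segments` segments; each segment is
    interpreted as logits over a small set of semantic clusters.
    """
    B, D = z1.shape
    assert D % num_segments == 0, "embedding dim must be divisible by num_segments"
    k = D // num_segments  # clusters per segment (assumed)

    # Soft cluster assignments per view: (batch, segments, clusters)
    p1 = F.softmax(z1.view(B, num_segments, k), dim=-1)
    p2 = F.softmax(z2.view(B, num_segments, k), dim=-1)

    # 1) Cross-view consistency: in every segment, the two views of a sample
    #    should fall into the same cluster (symmetric cross-entropy).
    consistency = -(p1 * torch.log(p2 + eps)).sum(-1).mean()
    consistency += -(p2 * torch.log(p1 + eps)).sum(-1).mean()

    # 2) Anti-collapse: the marginal cluster distribution of each segment over
    #    the batch should have high entropy, so mapping all samples to one
    #    cluster (the trivial solution) is penalized.
    marginal = p1.mean(dim=0)  # (segments, clusters)
    marginal_entropy = -(marginal * torch.log(marginal + eps)).sum(-1).mean()

    # Minimize disagreement, maximize marginal entropy.
    return consistency - marginal_entropy
```

In use, such a loss would simply be applied to the two projected views, e.g. loss = music_loss_sketch(projector(backbone(x1)), projector(backbone(x2))); because nothing in it requires negatives, momentum encoders, or gradient stopping, the surrounding training loop stays a plain Siamese setup.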

