Latent Correlation-Based Multiview Learning and Self-Supervision: A Unifying Perspective

06/14/2021
by Qi Lyu, et al.

Multiple views of data, both naturally acquired (e.g., image and audio) and artificially produced (e.g., via adding different noise to data samples), have proven useful in enhancing representation learning. Natural views are often handled by multiview analysis tools, e.g., (deep) canonical correlation analysis [(D)CCA], while the artificial ones are frequently used in self-supervised learning (SSL) paradigms, e.g., SimCLR and Barlow Twins. Both types of approaches often involve learning neural feature extractors such that the embeddings of data exhibit high cross-view correlations. Although the idea is intuitive, the effectiveness of correlation-based neural embedding has been validated only empirically. This work puts forth a theory-backed framework for unsupervised multiview learning. Our development starts by proposing a multiview model, where each view is a nonlinear mixture of shared and private components. Consequently, the learning problem boils down to shared/private component identification and disentanglement. Under this model, latent correlation maximization is shown to guarantee the extraction of the shared components across views (up to certain ambiguities). In addition, the private information in each view can be provably disentangled from the shared components through properly designed regularization. The method is tested on a series of tasks, e.g., downstream clustering, and shows promising performance on all of them. Our development also provides a unifying perspective for understanding various DCCA and SSL schemes.
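
The multiview model described above can be illustrated with a small NumPy sketch. This is not the authors' algorithm: no network is trained. It assumes a toy mixing function (a random rotation followed by a leaky ReLU, both chosen here only because they are easy to invert) and uses oracle "encoders" that undo the mixing exactly. All names (`make_views`, `mean_cross_correlation`, and so on) are hypothetical. The point is only to show what the correlation criterion sees: the shared components are perfectly correlated across views, while the private components are nearly uncorrelated.

```python
import numpy as np

def leaky(u, slope=0.2):
    # Invertible elementwise nonlinearity (assumed here for illustration).
    return np.where(u > 0, u, slope * u)

def leaky_inv(y, slope=0.2):
    # Exact inverse of `leaky`.
    return np.where(y > 0, y, y / slope)

def random_rotation(rng, d):
    # Orthogonal matrix from a QR factorization; its inverse is its transpose.
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return q

def make_views(n=5000, d_shared=2, d_private=1, seed=0):
    # Each view is a nonlinear mixture of shared (z) and private (c) components.
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, d_shared))
    c1 = rng.normal(size=(n, d_private))
    c2 = rng.normal(size=(n, d_private))
    d = d_shared + d_private
    A1, A2 = random_rotation(rng, d), random_rotation(rng, d)
    x1 = leaky(np.hstack([z, c1]) @ A1)
    x2 = leaky(np.hstack([z, c2]) @ A2)
    return x1, x2, A1, A2

def mean_cross_correlation(e1, e2):
    # Average, over embedding dimensions, of the Pearson correlation
    # between matching columns of the two views' embeddings.
    e1 = (e1 - e1.mean(axis=0)) / e1.std(axis=0)
    e2 = (e2 - e2.mean(axis=0)) / e2.std(axis=0)
    return float(np.mean(e1 * e2))

d_shared = 2
x1, x2, A1, A2 = make_views(d_shared=d_shared)

# Oracle encoders: invert the known mixing instead of learning it.
u1 = leaky_inv(x1) @ A1.T
u2 = leaky_inv(x2) @ A2.T

shared_corr = mean_cross_correlation(u1[:, :d_shared], u2[:, :d_shared])
private_corr = mean_cross_correlation(u1[:, d_shared:], u2[:, d_shared:])
print(f"shared: {shared_corr:.3f}, private: {private_corr:.3f}")
```

In a learned setting the oracle inverses would be replaced by trainable feature extractors, and `mean_cross_correlation` (or a similar quantity) would serve as the objective to maximize.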

research
09/19/2019

Neural Network-Assisted Nonlinear Multiview Component Analysis: Identifiability and Algorithm

Multiview analysis aims at extracting shared latent components from data...
research
07/23/2019

Shared Generative Latent Representation Learning for Multi-view Clustering

Clustering multi-view data has been a fundamental research topic in the ...
research
05/17/2021

Disentangled Variational Information Bottleneck for Multiview Representation Learning

Multiview data contain information from multiple modalities and have pot...
research
02/08/2017

Deep Generalized Canonical Correlation Analysis

We present Deep Generalized Canonical Correlation Analysis (DGCCA) -- a ...
research
12/30/2019

Multiview Representation Learning for a Union of Subspaces

Canonical correlation analysis (CCA) is a popular technique for learning...
research
05/20/2020

Adversarial Canonical Correlation Analysis

Canonical Correlation Analysis (CCA) is a statistical technique used to ...
research
07/09/2022

A Study on Self-Supervised Object Detection Pretraining

In this work, we study different approaches to self-supervised pretraini...
