Multi-modal Self-Supervision from Generalized Data Transformations

03/09/2020
by   Mandela Patrick, et al.

Self-supervised learning has advanced rapidly, with several results surpassing supervised models for pre-training feature representations. While the focus of most of these works has been new loss functions or tasks, little attention has been given to the data transformations that form the foundation of learning representations with desirable invariances. In this work, we introduce a framework for multi-modal data transformations that preserve semantics and induce the learning of high-level representations across modalities. We do this by combining two steps: inter-modality slicing and intra-modality augmentation. Using a contrastive loss as the training task, we show that choosing the right transformations is key, and that our method yields state-of-the-art results on downstream video and audio classification tasks such as HMDB51, UCF101 and DCASE2014 with Kinetics-400 pretraining.
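To illustrate the kind of contrastive training task the abstract refers to, the sketch below implements a symmetric InfoNCE loss between two batches of embeddings (e.g. video and audio clips cut from the same source). This is a minimal, generic sketch, not the paper's exact objective over generalized data transformations; the function names and the temperature value are illustrative assumptions.

```python
import numpy as np

def log_softmax(x, axis):
    # numerically stable log-softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE contrastive loss.

    z_a, z_b: (N, D) embeddings from two modalities; row i of z_a and
    row i of z_b form a positive pair, all other rows act as negatives.
    """
    # L2-normalize so dot products are cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # (N, N) similarity matrix
    # cross-entropy with the diagonal (the true pair) as the target,
    # averaged over both retrieval directions (a->b and b->a)
    loss_ab = -np.mean(np.diag(log_softmax(logits, axis=1)))
    loss_ba = -np.mean(np.diag(log_softmax(logits, axis=0)))
    return 0.5 * (loss_ab + loss_ba)
```

The loss is small when matching pairs are more similar than mismatched ones, which is what drives the learned representations to be invariant to the chosen transformations while remaining discriminative across clips.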

Related research

12/15/2022 — MAViL: Masked Audio-Video Learners
We present Masked Audio-Video Learners (MAViL) to train audio-visual rep...

05/22/2023 — Connecting Multi-modal Contrastive Representations
Multi-modal Contrastive Representation (MCR) learning aims to encode dif...

12/22/2021 — Fine-grained Multi-Modal Self-Supervised Learning
Multi-Modal Self-Supervised Learning from videos has been shown to impro...

03/11/2021 — Multi-Format Contrastive Learning of Audio Representations
Recent advances suggest the advantage of multi-modal training in compari...

03/25/2022 — Versatile Multi-Modal Pre-Training for Human-Centric Perception
Human-centric perception plays a vital role in vision and graphics. But ...

07/12/2023 — Unified Molecular Modeling via Modality Blending
Self-supervised molecular representation learning is critical for molecu...

12/01/2022 — FakeOut: Leveraging Out-of-domain Self-supervision for Multi-modal Video Deepfake Detection
Video synthesis methods rapidly improved in recent years, allowing easy ...
