Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

05/20/2022
by Ravid Shwartz-Ziv, et al.

Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task. But an initialization contains relatively little information about the source task. Instead, we show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. This simple modular approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks, serving as a drop-in replacement for standard pre-training strategies. These highly informative priors can also be saved for future use, similar to pre-trained weights, and stand in contrast to the zero-mean isotropic uninformative priors that are typically used in Bayesian deep learning.

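To make the idea concrete, here is a minimal, hypothetical sketch (not the paper's released implementation). It assumes the source-task posterior has been summarized by a per-parameter mean and variance, and re-uses those statistics as a diagonal Gaussian prior during fine-tuning, so the prior adds a quadratic penalty that reshapes the downstream loss surface rather than only supplying an initialization. The names `gaussian_prior_penalty`, `mu`, `sigma2`, and `prior_scale` are illustrative.

```python
# Hypothetical sketch: fine-tuning with an informative Gaussian prior
# learned on the source task. `mu` and `sigma2` are assumed to be dicts
# mapping parameter names to tensors (posterior mean and variance
# estimated on the source task).
import torch


def gaussian_prior_penalty(model, mu, sigma2, scale=1.0):
    """Negative log-density (up to a constant) of a diagonal Gaussian prior."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + ((p - mu[name]) ** 2 / (2.0 * sigma2[name])).sum()
    return scale * penalty


def fine_tune_step(model, batch, optimizer, mu, sigma2, prior_scale=1e-3):
    """One downstream training step: data fit plus the informative prior term."""
    x, y = batch
    optimizer.zero_grad()
    nll = torch.nn.functional.cross_entropy(model(x), y)  # downstream data fit
    loss = nll + gaussian_prior_penalty(model, mu, sigma2, prior_scale)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Setting `prior_scale` to zero in this sketch recovers standard fine-tuning from the pre-trained initialization, which is the baseline the informative prior is meant to improve on.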
Related research

07/19/2021 · Adaptive Transfer Learning on Graph Neural Networks
Graph neural networks (GNNs) are widely used to learn a powerful represen...

02/19/2020 · Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Universal feature extractors, such as BERT for natural language processi...

05/26/2022 · Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing
Self-supervised learning (SSL), especially contrastive methods, has rais...

06/01/2023 · Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior
Pre-trained machine learning (ML) models have shown great performance fo...

12/23/2022 · Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
With the ever-growing model size and the limited availability of labeled...

11/22/2021 · Benchmarking Detection Transfer Learning with Vision Transformers
Object detection is a central downstream task used to test if pre-traine...

05/02/2022 · Jack and Masters of All Trades: One-Pass Learning of a Set of Model Sets from Foundation AI Models
For deep learning, size is power. Massive neural nets trained on broad d...
