Scalable Transfer Learning with Expert Models

09/28/2020
by Joan Puigcerver, et al.

Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploiting existing label structures, and use cheap-to-compute performance proxies to select the relevant expert for each target task. This strategy scales the process of transferring to new tasks, since it does not revisit the pre-training data during transfer. Accordingly, it requires little extra compute per target task, and results in a speed-up of 2-3 orders of magnitude compared to competing approaches. Further, we provide an adapter-based architecture able to compress many experts into a single model. We evaluate our approach on two different data sources and demonstrate that it outperforms baselines on over 20 diverse vision tasks in both cases.
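
The selection step described above can be illustrated with a minimal sketch. The abstract does not say which performance proxy is used, so this example assumes leave-one-out k-nearest-neighbour accuracy on frozen expert features, which needs no training and no access to the pre-training data. The names `knn_proxy_accuracy` and `select_expert`, and the representation of an expert as a plain feature-extraction callable, are hypothetical and chosen only for illustration.

```python
import numpy as np


def knn_proxy_accuracy(features, labels, k=1):
    """Leave-one-out kNN accuracy on frozen features (assumed proxy)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all examples.
    sq = (features ** 2).sum(axis=1)
    dists = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Exclude each example from its own neighbour list.
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    correct = 0
    for i, idx in enumerate(neighbours):
        # Majority vote among the k nearest neighbours.
        values, counts = np.unique(labels[idx], return_counts=True)
        correct += int(values[np.argmax(counts)] == labels[i])
    return correct / len(labels)


def select_expert(experts, inputs, labels, k=1):
    """Score every expert with the proxy and return the best one.

    experts: dict mapping a name to a callable that turns `inputs`
    into an (n_examples, n_features) array. Hypothetical interface.
    """
    scores = {
        name: knn_proxy_accuracy(embed(inputs), labels, k)
        for name, embed in experts.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Only the selected expert would then be fine-tuned on the target task. Scoring requires a single forward pass per expert over the target data, which is consistent with the low per-task cost the abstract claims, although the actual proxy and its cost may differ from this sketch.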

research
06/19/2022

Scalable Neural Data Server: A Data Recommender for Transfer Learning

Absence of large-scale labeled data in the practitioner's target domain ...
research
03/14/2023

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm

Parameter-Efficient Transfer Learning (PETL) aims at efficiently adaptin...
research
12/24/2019

Large Scale Learning of General Visual Representations for Transfer

Transfer of pre-trained representations improves sample efficiency and s...
research
06/01/2023

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

Pre-trained machine learning (ML) models have shown great performance fo...
research
04/25/2023

Towards Compute-Optimal Transfer Learning

The field of transfer learning is undergoing a significant shift with th...
research
10/13/2020

Which Model to Transfer? Finding the Needle in the Growing Haystack

Transfer learning has been recently popularized as a data-efficient alte...
research
09/26/2022

Diversified Dynamic Routing for Vision Tasks

Deep learning models for vision tasks are trained on large datasets unde...
