Rethinking Hard-Parameter Sharing in Multi-Task Learning

07/23/2021
by   Moshe Y. Vardi, et al.

Hard parameter sharing in multi-task learning (MTL) allows tasks to share some of the model parameters, reducing storage cost and improving prediction accuracy. The common practice is to share the bottom layers of a deep neural network among tasks while using separate top layers for each task. In this work, we revisit this common practice through an empirical study on fine-grained image classification tasks and make two surprising observations. (1) Using separate bottom-layer parameters can achieve significantly better performance than the common practice, and this holds across different numbers of jointly trained tasks, different backbone architectures, and different quantities of task-specific parameters. (2) A multi-task model that makes only a small proportion of the bottom-layer parameters task-specific can achieve performance competitive with independent models trained on each task separately and can outperform a state-of-the-art MTL framework. Our observations suggest rethinking the current sharing paradigm and adopting the new strategy of separate bottom-layer parameters as a stronger baseline for model design in MTL.
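The architectural contrast described in the abstract can be sketched in a few lines of PyTorch (an assumption; the paper does not prescribe a framework). The class names, layer sizes, split point, and 32x32 inputs below are illustrative placeholders, not the authors' actual backbones: SharedBottomMTL shows the common practice (shared bottom, task-specific tops), while SeparateBottomMTL shows the alternative the paper studies (task-specific bottom layers, shared top).

```python
import torch
import torch.nn as nn

class SharedBottomMTL(nn.Module):
    """Common practice: shared bottom layers, separate top layers per task."""
    def __init__(self, num_tasks, num_classes, dim=64):
        super().__init__()
        self.bottom = nn.Sequential(                     # shared by all tasks
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.tops = nn.ModuleList([                      # task-specific top layers
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(dim, num_classes))
            for _ in range(num_tasks)
        ])

    def forward(self, x, task_id):
        return self.tops[task_id](self.bottom(x))


class SeparateBottomMTL(nn.Module):
    """Alternative studied in the paper: separate bottom layers, shared top."""
    def __init__(self, num_tasks, num_classes, dim=64):
        super().__init__()
        self.bottoms = nn.ModuleList([                   # task-specific bottom layers
            nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU())
            for _ in range(num_tasks)
        ])
        self.top = nn.Sequential(                        # shared by all tasks
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, num_classes),                 # assumes a common label space
        )

    def forward(self, x, task_id):
        return self.top(self.bottoms[task_id](x))


# Example: two tasks, a batch of four images routed through task 0
model = SeparateBottomMTL(num_tasks=2, num_classes=10)
logits = model(torch.randn(4, 3, 32, 32), task_id=0)     # -> shape (4, 10)
```

In both sketches the task-specific portion can be kept small relative to the shared portion; the abstract's second observation concerns exactly that regime, where only a small fraction of bottom-layer parameters are made task-specific.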


Related research

11/27/2019 · AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
Multi-task learning is an open and challenging problem in computer visio...

01/22/2021 · Network Clustering for Multi-task Learning
The Multi-Task Learning (MTL) technique has been widely studied by word-...

05/23/2017 · Sluice networks: Learning what to share between loosely related tasks
Multi-task learning is partly motivated by the observation that humans b...

08/23/2023 · OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes
In this work, we seek to predict camera poses across scenes with a multi...

03/23/2020 · Learned Weight Sharing for Deep Multi-Task Learning by Natural Evolution Strategy and Stochastic Gradient Descent
In deep multi-task learning, weights of task-specific networks are share...

07/27/2021 · Exceeding the Limits of Visual-Linguistic Multi-Task Learning
By leveraging large amounts of product data collected across hundreds of...

04/05/2019 · Branched Multi-Task Networks: Deciding What Layers To Share
In the context of deep learning, neural networks with multiple branches ...
