Eliciting Transferability in Multi-task Learning with Task-level Mixture-of-Experts

05/25/2022
by Qinyuan Ye, et al.

Recent work suggests that transformer models are capable of multi-task learning on diverse NLP tasks. However, the potential of these models may be limited, as they use the same set of parameters for all tasks. In contrast, humans tackle tasks in a more flexible way, making appropriate assumptions about which skills and knowledge are relevant and executing only the necessary computations. Inspired by this, we propose task-level mixture-of-experts models, which have a collection of transformer layers (i.e., experts) and a router component that chooses among these experts dynamically and flexibly. We show that the learned routing decisions and experts partially rediscover human categorization of NLP tasks: certain experts are strongly associated with extractive tasks, some with classification tasks, and some with tasks requiring world knowledge.
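To make the architecture concrete, here is a minimal PyTorch sketch of a task-level mixture-of-experts layer, not the authors' released implementation. It simplifies each expert to a feed-forward block rather than a full transformer layer, and the router scores experts from a learned task embedding so that all examples of a task share one routing decision. All class and variable names below are hypothetical.

    import torch
    import torch.nn as nn

    class TaskLevelMoELayer(nn.Module):
        """Sketch: experts plus a router conditioned on the task, not the token."""

        def __init__(self, d_model: int, d_ff: int, num_experts: int, num_tasks: int):
            super().__init__()
            # Each expert is a simplified feed-forward block (stand-in for a
            # transformer layer in the paper's formulation).
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
                for _ in range(num_experts)
            )
            # The router scores experts from a learned per-task embedding.
            self.task_embed = nn.Embedding(num_tasks, d_model)
            self.router = nn.Linear(d_model, num_experts)

        def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model); task_id: (batch,) integer task indices.
            logits = self.router(self.task_embed(task_id))   # (batch, num_experts)
            weights = torch.softmax(logits, dim=-1)
            # Soft routing: weighted mixture of expert outputs.
            expert_outs = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, seq, d)
            return torch.einsum("be,besd->bsd", weights, expert_outs)

    # Usage: two examples from tasks 0 and 3 take task-specific routes.
    layer = TaskLevelMoELayer(d_model=64, d_ff=256, num_experts=4, num_tasks=8)
    out = layer(torch.randn(2, 10, 64), torch.tensor([0, 3]))  # (2, 10, 64)

The softmax mixture shown keeps the sketch differentiable end-to-end; a hard top-1 router would instead select a single expert via argmax (trained with, e.g., a straight-through or Gumbel-softmax estimator), which is what makes the routing decisions interpretable as discrete task-to-expert assignments.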

Related Research

04/16/2022
Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners
Traditional multi-task learning (MTL) methods use dense networks that us...

12/20/2022
RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction
In subcellular biological research, fluorescence staining is a key techn...

12/15/2022
Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners
Optimization in multi-task learning (MTL) is more challenging than singl...

09/14/2021
The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders
Multi-task learning with transformer encoders (MTL) has emerged as a pow...

05/30/2023
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts
Computer vision researchers are embracing two promising paradigms: Visio...

10/26/2022
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Multi-task learning (MTL) encapsulates multiple learned tasks in a singl...

05/23/2023
When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP
Multi-task learning (MTL) aims at achieving a better model by leveraging...
