HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections

07/12/2020
by Yi Tay, et al.

Achieving state-of-the-art performance on natural language understanding tasks typically relies on fine-tuning a fresh model for every task. This approach incurs a higher overall parameter cost, along with higher technical maintenance for serving multiple models. Learning a single multi-task model that performs well across all tasks is a challenging yet attractive proposition. In this paper, we propose HyperGrid, a new approach for highly effective multi-task learning. The proposed approach is based on a decomposable hypernetwork that learns grid-wise projections, which help to specialize regions in weight matrices for different tasks. To construct the proposed hypernetwork, our method learns the interactions and composition between a global (task-agnostic) state and a local task-specific state. We apply HyperGrid to the current state-of-the-art T5 model, demonstrating strong performance across the GLUE and SuperGLUE benchmarks when using only a single multi-task model. Our method helps bridge the gap between fine-tuning and multi-task learning approaches.
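The abstract describes the mechanism only at a high level: a global (task-agnostic) state and a local task-specific state are composed into grid-wise gates that specialize regions of a weight matrix per task. The exact formulation appears only in the full paper, so the following is a minimal, hypothetical PyTorch sketch of that idea; the class name HyperGridGate, the outer-product composition, the sigmoid gating, and the block sizes are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class HyperGridGate(nn.Module):
        # Hypothetical sketch: a global (task-agnostic) row state and a local
        # task-specific column state are composed via an outer product into a
        # coarse grid, which is expanded to the shape of a weight matrix and
        # used as a multiplicative gate that specializes regions of that
        # matrix for each task.
        def __init__(self, d_in, d_out, grid_rows=8, grid_cols=8, num_tasks=8):
            super().__init__()
            assert d_in % grid_rows == 0 and d_out % grid_cols == 0
            self.block_r = d_in // grid_rows    # weight rows covered by one grid cell
            self.block_c = d_out // grid_cols   # weight columns covered by one grid cell
            self.global_rows = nn.Parameter(torch.randn(grid_rows))      # task-agnostic state
            self.local_cols = nn.Embedding(num_tasks, grid_cols)         # task-specific state
            self.weight = nn.Parameter(torch.randn(d_in, d_out) * 0.02)  # shared base projection

        def forward(self, x, task_id):
            # Compose global and local states into a (grid_rows, grid_cols) gate grid.
            grid = torch.sigmoid(torch.outer(self.global_rows, self.local_cols(task_id)))
            # Tile each grid cell over its block of the weight matrix.
            gate = grid.repeat_interleave(self.block_r, dim=0)
            gate = gate.repeat_interleave(self.block_c, dim=1)
            # Apply the gated projection: regions of the shared weight are softly
            # switched on or off depending on the task.
            return x @ (self.weight * gate)

    # Example usage (hypothetical): one shared layer serving several tasks.
    layer = HyperGridGate(d_in=512, d_out=2048, num_tasks=4)
    y = layer(torch.randn(2, 512), task_id=torch.tensor(1))  # -> shape (2, 2048)

In this sketch the grid is much smaller than the weight matrix, so the per-task overhead is a single small embedding row rather than a full copy of the weights, which is the parameter-efficiency argument the abstract makes for multi-task serving.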


