Attentive Task Interaction Network for Multi-Task Learning

01/25/2022
by   Dimitrios Sinodinos, et al.

Multi-task learning (MTL) has recently gained popularity as a learning paradigm that can improve per-task performance while using fewer per-task model parameters than single-task learning. One of the biggest challenges in MTL networks is how to share features across tasks. To address this challenge, we propose the Attentive Task Interaction Network (ATI-Net). ATI-Net employs knowledge distillation of the latent features for each task, then combines the feature maps to provide improved contextualized information to the decoder. This novel approach of introducing knowledge distillation into an attention-based multi-task network outperforms state-of-the-art MTL baselines such as the standalone MTAN and PAD-Net, with roughly the same number of model parameters.
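
The abstract describes per-task latent features that are refined through attention-based cross-task interaction before being passed to the decoders. Below is a minimal PyTorch sketch of that idea, assuming a two-task setup with a shared backbone feature map and identical channel counts. The module and layer names (TaskAttentionBlock, AttentiveTaskInteraction) are illustrative, not the authors' exact ATI-Net architecture, and the knowledge-distillation losses on the intermediate task features are omitted.

import torch
import torch.nn as nn


class TaskAttentionBlock(nn.Module):
    """Learned gate that selects which of another task's features to borrow."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Element-wise attention mask applied to the incoming task features.
        return self.gate(x) * x


class AttentiveTaskInteraction(nn.Module):
    """Sketch of attentive cross-task feature sharing (hypothetical design).

    Each task produces intermediate features from the shared backbone output;
    attention blocks decide what each task borrows from the others before the
    task-specific decoders run.
    """

    def __init__(self, channels, num_tasks=2):
        super().__init__()
        self.task_heads = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_tasks)]
        )
        # One attention block per (receiver, sender) pair of distinct tasks.
        self.attn = nn.ModuleDict({
            f"{i}_{j}": TaskAttentionBlock(channels)
            for i in range(num_tasks) for j in range(num_tasks) if i != j
        })

    def forward(self, shared_feats):
        # Per-task intermediate feature maps from the shared backbone features.
        task_feats = [head(shared_feats) for head in self.task_heads]
        refined = []
        for i, f_i in enumerate(task_feats):
            # Add attended features from every other task to this task's own map.
            borrowed = [self.attn[f"{i}_{j}"](f_j)
                        for j, f_j in enumerate(task_feats) if j != i]
            refined.append(f_i + sum(borrowed))
        return refined


if __name__ == "__main__":
    module = AttentiveTaskInteraction(channels=64, num_tasks=2)
    x = torch.randn(1, 64, 32, 32)   # shared backbone feature map
    outs = module(x)
    print([o.shape for o in outs])   # one refined feature map per task

In the full method, the intermediate per-task features would also carry auxiliary (distillation) supervision; this sketch only shows the attentive combination step that feeds the decoders.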

