Improving Multi-task Learning via Seeking Task-based Flat Regions

11/24/2022
by   Hoang Phan, et al.

Multi-Task Learning (MTL) is a widely used and powerful paradigm for training deep neural networks that learns multiple objectives with a single backbone. Compared with training tasks separately, MTL significantly reduces computational cost, improves data efficiency, and can enhance model performance by sharing knowledge across tasks. It has therefore been adopted in a wide range of applications, from computer vision to natural language processing and speech recognition. Among existing approaches, an emerging line of MTL work manipulates task gradients to derive a single update direction that benefits all tasks. Despite achieving impressive results on many benchmarks, applying these approaches without appropriate regularization can lead to suboptimal solutions on real-world problems. In particular, standard training that minimizes the empirical loss on the training data can easily overfit low-resource tasks or be spoiled by noisily labeled ones, causing negative transfer between tasks and an overall performance drop. To alleviate such problems, we propose to leverage a recently introduced training method, Sharpness-Aware Minimization, which enhances model generalization in single-task learning. Accordingly, we present a novel MTL training methodology that encourages the model to find task-based flat minima, coherently improving its generalization on all tasks. Finally, we conduct comprehensive experiments on a variety of applications to demonstrate the merit of our approach over existing gradient-based MTL methods, consistent with our theoretical analysis.
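The core idea described above, perturbing the shared weights toward each task's worst-case direction and taking the update from the perturbed point, can be sketched in a few lines. The snippet below is a simplified, hypothetical PyTorch illustration of combining a SAM-style perturbation with per-task gradient aggregation; names such as `shared_net`, `heads`, and `rho`, and the plain averaging of task gradients, are assumptions for illustration rather than the authors' exact algorithm.

```python
import torch

def task_based_sam_step(shared_net, heads, batches, loss_fns, optimizer, rho=0.05):
    """One hypothetical update seeking flat regions of every task's loss."""
    params = [p for p in shared_net.parameters() if p.requires_grad]
    task_grads = []

    for head, (x, y), loss_fn in zip(heads, batches, loss_fns):
        # 1) Gradient of this task's loss at the current shared weights.
        loss = loss_fn(head(shared_net(x)), y)
        grads = torch.autograd.grad(loss, params)

        # 2) Ascend to the approximate worst-case point inside an L2 ball of radius rho.
        grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
        eps = [rho * g / grad_norm for g in grads]
        with torch.no_grad():
            for p, e in zip(params, eps):
                p.add_(e)

        # 3) The gradient at the perturbed weights points toward flatter minima for this task.
        perturbed_loss = loss_fn(head(shared_net(x)), y)
        task_grads.append(torch.autograd.grad(perturbed_loss, params))

        # 4) Undo the perturbation before processing the next task.
        with torch.no_grad():
            for p, e in zip(params, eps):
                p.sub_(e)

    # 5) Aggregate per-task SAM gradients; plain averaging stands in here for
    #    gradient-manipulation rules such as PCGrad or CAGrad.
    with torch.no_grad():
        for i, p in enumerate(params):
            p.grad = torch.stack([g[i] for g in task_grads]).mean(dim=0)
    optimizer.step()
    optimizer.zero_grad()
```

In a full implementation, the averaging in the final step would typically be replaced by the chosen gradient-based MTL aggregation rule, which is the setting the paper targets.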

Related research

09/10/2021 | Efficiently Identifying Task Groupings for Multi-Task Learning
Multi-task learning can leverage information learned by one task to bene...

09/21/2023 | Multi-Task Cooperative Learning via Searching for Flat Minima
Multi-task learning (MTL) has shown great potential in medical image ana...

10/29/2020 | Measuring and Harnessing Transference in Multi-Task Learning
Multi-task learning can leverage information learned by one task to bene...

05/07/2021 | SpeechNet: A Universal Modularized Model for Speech Processing Tasks
There is a wide variety of speech processing tasks ranging from extracti...

02/18/2023 | MaxGNR: A Dynamic Weight Strategy via Maximizing Gradient-to-Noise Ratio for Multi-Task Learning
When modeling related tasks in computer vision, Multi-Task Learning (MTL...

10/06/2022 | Generalization Properties of Retrieval-based Models
Many modern high-performing machine learning models such as GPT-3 primar...

07/07/2023 | TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning
The AllInOne training paradigm squeezes a wide range of tasks into a uni...
