FAMO: Fast Adaptive Multitask Optimization

06/06/2023
by Bo Liu, et al.

One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more balanced loss decrease require storing and computing all task gradients (O(K) space and time where K is the number of tasks), limiting their use in large-scale scenarios. In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using O(1) space and time. We conduct an extensive set of experiments covering multi-task supervised and reinforcement learning problems. Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques while offering significant improvements in space and computational efficiency. Code is available at https://github.com/Cranial-XIX/FAMO.
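To make the O(1) claim concrete, below is a minimal PyTorch-style sketch of the dynamic weighting loop the abstract describes: task weights come from a softmax over per-task logits, the model takes a single backward pass on one weighted scalar loss, and the logits are then adjusted using the observed decrease in each task's log loss. This is a sketch written from the paper's description, not the official implementation (see the linked repository); the names FAMOWeighter, w_lr, and eps are ours.

import torch
import torch.nn.functional as F

class FAMOWeighter:
    # Sketch of FAMO-style dynamic task weighting. Illustrative only;
    # class and argument names are ours, not from the official repo.

    def __init__(self, n_tasks: int, w_lr: float = 0.025, eps: float = 1e-8):
        self.xi = torch.zeros(n_tasks, requires_grad=True)  # one logit per task
        self.opt = torch.optim.Adam([self.xi], lr=w_lr)     # optimizer for the logits
        self.eps = eps
        self.prev_losses = None

    def weighted_loss(self, losses):
        # Collapse the K task losses into one scalar so the model needs a
        # single backward pass per step, independent of K.
        self.prev_losses = losses.detach()
        z = F.softmax(self.xi, dim=-1).detach()
        # Normalizing by c makes the model gradient a convex combination
        # of the per-task log-loss gradients.
        c = (z / (losses.detach() + self.eps)).sum()
        return (z * (losses + self.eps).log()).sum() / c

    def update(self, curr_losses):
        # Observed per-task decrease in log loss after the model step.
        delta = ((self.prev_losses + self.eps).log()
                 - (curr_losses.detach() + self.eps).log())
        # Descend on <softmax(xi), delta>: tasks that improved the least
        # gain weight at the next step.
        self.opt.zero_grad()
        (F.softmax(self.xi, dim=-1) * delta).sum().backward()
        self.opt.step()

# Toy usage: one shared linear model, K = 2 regression tasks.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
famo = FAMOWeighter(n_tasks=2)
x, y = torch.randn(32, 4), torch.randn(32, 2)

for step in range(100):
    losses = ((model(x) - y) ** 2).mean(dim=0)   # per-task losses, shape [2]
    opt.zero_grad()
    famo.weighted_loss(losses).backward()        # one backward pass total
    opt.step()
    with torch.no_grad():
        new_losses = ((model(x) - y) ** 2).mean(dim=0)
    famo.update(new_losses)                      # adapt the task weights

Note the log-loss framing: measuring progress as the decrease in log loss judges "balanced" by each task's relative improvement rate, so tasks with large raw loss magnitudes do not automatically dominate the update.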

