AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

11/28/2022
by Enneng Yang, et al.

Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different tasks on each parameter still remains unclear. In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter. Specifically, we compute the total updates by the exponentially decaying Average of the squared Updates (AU) on a parameter from the corresponding task. Based on this novel metric, we observe that many parameters in existing MTL methods, especially those in the higher shared layers, are still dominated by one or several tasks. The dominance of AU is mainly due to the dominance of accumulative gradients from one or several tasks. Motivated by this, we propose a Task-wise Adaptive learning rate approach, AdaTask in short, to separate the accumulative gradients, and hence the learning rate, of each task for each parameter in adaptive learning rate approaches (e.g., AdaGrad, RMSProp, and Adam). Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in SOTA average task-wise performance. Analysis on both synthetic and real-world datasets shows that AdaTask balances parameters in every shared layer well.
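The core mechanism the abstract describes, separating each task's accumulative gradients so that one task cannot dominate the shared adaptive learning rate, is easiest to see in code. Below is a minimal NumPy sketch of a task-wise RMSProp-style update on a single shared parameter. The function name, hyperparameters, and the exact rule for combining the per-task steps (a plain sum here) are illustrative assumptions for exposition, not the paper's precise algorithm.

```python
import numpy as np

def taskwise_rmsprop_step(param, task_grads, task_accums,
                          lr=1e-3, beta=0.99, eps=1e-8):
    """One task-wise RMSProp-style step on a shared parameter (sketch).

    param:       shared parameter (np.ndarray), updated in place
    task_grads:  list of per-task gradients w.r.t. `param`
    task_accums: list of per-task accumulators of squared gradients,
                 one per task, same shape as `param`; updated in place
    """
    update = np.zeros_like(param)
    for k, g in enumerate(task_grads):
        # Keep a separate exponentially decayed accumulator per task,
        # so large gradients from one task cannot shrink the effective
        # step size of the others (as a shared accumulator would).
        task_accums[k] = beta * task_accums[k] + (1.0 - beta) * g * g
        # Per-task adaptive step, normalized by that task's own history.
        update += lr * g / (np.sqrt(task_accums[k]) + eps)
    param -= update
    return param
```

Under this sketch, the same per-task accumulators also give a natural handle on the AU metric mentioned above: tracking an exponentially decayed average of each task's squared updates on a parameter reveals which task dominates it.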

