Generalization in multitask deep neural classifiers: a statistical physics approach

10/30/2019
by   Tyler Lee, et al.
0

A proper understanding of the striking generalization abilities of deep neural networks presents an enduring puzzle. Recently, there has been a growing body of numerically-grounded theoretical work that has contributed important insights to the theory of learning in deep neural nets. There has also been a recent interest in extending these analyses to understanding how multitask learning can further improve the generalization capacity of deep neural nets. These studies deal almost exclusively with regression tasks which are amenable to existing analytical techniques. We develop an analytic theory of the nonlinear dynamics of generalization of deep neural networks trained to solve classification tasks using softmax outputs and cross-entropy loss, addressing both single task and multitask settings. We do so by adapting techniques from the statistical physics of disordered systems, accounting for both finite size datasets and correlated outputs induced by the training dynamics. We discuss the validity of our theoretical results in comparison to a comprehensive suite of numerical experiments. Our analysis provides theoretical support for the intuition that the performance of multitask learning is determined by the noisiness of the tasks and how well their input features align with each other. Highly related, clean tasks benefit each other, whereas unrelated, clean tasks can be detrimental to individual task performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/07/2017

GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks

Deep multitask networks, in which one neural network produces multiple p...
research
06/28/2016

Modeling Industrial ADMET Data with Multitask Networks

Deep learning methods such as multitask neural networks have recently be...
research
09/18/2018

On the Learning Dynamics of Deep Neural Networks

While a lot of progress has been made in recent years, the dynamics of l...
research
04/05/2016

Deep Cross Residual Learning for Multitask Visual Recognition

Residual learning has recently surfaced as an effective means of constru...
research
05/30/2018

The Dynamics of Learning: A Random Matrix Approach

Understanding the learning dynamics of neural networks is one of the key...
research
03/21/2019

A Principled Approach for Learning Task Similarity in Multitask Learning

Multitask learning aims at solving a set of related tasks simultaneously...
research
06/04/2021

Multitask Online Mirror Descent

We introduce and analyze MT-OMD, a multitask generalization of Online Mi...

Please sign up or login with your details

Forgot password? Click here to reset