On the relationship between multitask neural networks and multitask Gaussian Processes

by Karthikeyan K, et al.

Despite the effectiveness of multitask deep neural networks (MTDNNs), there is limited theoretical understanding of how information is shared across different tasks in an MTDNN. In this work, we establish a formal connection between MTDNNs with infinitely wide hidden layers and multitask Gaussian processes (GPs). We derive multitask GP kernels corresponding to both single-layer and deep multitask Bayesian neural networks (MTBNNs) and show that information among different tasks is shared primarily through correlation across the last-layer weights of the MTBNN and shared hyperparameters, contrary to the popular hypothesis that information is shared through shared intermediate-layer weights. Our construction enables using a multitask GP to perform efficient Bayesian inference for the equivalent MTDNN with infinitely wide hidden layers. Prior work on the connection between deep neural networks and GPs in single-task settings can be seen as a special case of our construction. We also present an adaptive multitask neural network architecture that corresponds to a multitask GP with more flexible kernels, such as Linear Model of Coregionalization (LMC) and Cross-Coregionalization (CC) kernels. We provide experimental results on synthetic and real datasets to further illustrate these ideas.
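To make the construction concrete, the following is a minimal sketch of a multitask GP of the kind described: the single-task input kernel is the standard arc-cosine (order-1) NNGP kernel of an infinitely wide ReLU layer, and task sharing enters through a task covariance matrix `B` multiplying it, as in an intrinsic-coregionalization-style kernel. The variance values `sw2`, `sb2`, the matrix `B`, and the noise level are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def relu_nngp_kernel(X1, X2, sw2=1.0, sb2=0.1):
    """NNGP kernel of a single infinitely wide ReLU hidden layer
    (arc-cosine kernel of order 1) with i.i.d. Gaussian weights of
    variance sw2/d and biases of variance sb2 (illustrative values)."""
    d = X1.shape[1]
    base = sb2 + sw2 * (X1 @ X2.T) / d
    v1 = sb2 + sw2 * np.sum(X1 ** 2, axis=1) / d
    v2 = sb2 + sw2 * np.sum(X2 ** 2, axis=1) / d
    norm = np.sqrt(np.outer(v1, v2))
    theta = np.arccos(np.clip(base / norm, -1.0, 1.0))
    # E[relu(u) relu(v)] for jointly Gaussian (u, v), plus output bias.
    return sb2 + (sw2 / (2 * np.pi)) * norm * (
        np.sin(theta) + (np.pi - theta) * np.cos(theta))

def icm_kernel(X1, t1, X2, t2, B):
    """Multitask kernel: K((x,t),(x',t')) = B[t,t'] * k(x,x').
    B plays the role of the correlation across last-layer weights."""
    return relu_nngp_kernel(X1, X2) * B[np.ix_(t1, t2)]

def mtgp_predict(X_tr, t_tr, y_tr, X_te, t_te, B, noise=1e-2):
    """Exact GP posterior mean under the multitask kernel above."""
    K = icm_kernel(X_tr, t_tr, X_tr, t_tr, B) + noise * np.eye(len(y_tr))
    Ks = icm_kernel(X_te, t_te, X_tr, t_tr, B)
    return Ks @ np.linalg.solve(K, y_tr)
```

Because the multitask kernel is a Hadamard product of two positive semidefinite matrices, it is itself positive semidefinite, so exact GP inference goes through unchanged; off-diagonal entries of `B` control how much the tasks inform each other.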




