Integrating Random Effects in Deep Neural Networks

06/07/2022
by Giora Simchoni, et al.

Modern approaches to supervised learning like deep neural networks (DNNs) typically assume, implicitly, that observed responses are statistically independent. In contrast, correlated data are prevalent in real-life large-scale applications, with typical sources of correlation including spatial, temporal and clustering structures. These correlations are either ignored by DNNs, or ad-hoc solutions are developed for specific use cases. We propose to use the mixed models framework to handle correlated data in DNNs. By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates and ultimately yield better predictive performance. The key to combining mixed models and DNNs is using the Gaussian negative log-likelihood (NLL) as a natural loss function that is minimized with DNN machinery, including stochastic gradient descent (SGD). Since the NLL does not decompose like standard DNN loss functions, using SGD with the NLL presents some theoretical and implementation challenges, which we address. Our approach, which we call LMMNN, is demonstrated to improve performance over natural competitors in various correlation scenarios on diverse simulated and real datasets. Our focus is on a regression setting and tabular datasets, but we also show some results for classification. Our code is available at https://github.com/gsimchoni/lmmnn.
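
To make the loss concrete, the sketch below writes the marginal Gaussian NLL for a single random-intercept (clustering) structure, y = f(X) + Zb + e with b ~ N(0, sig2b * I) and e ~ N(0, sig2e * I), and minimizes it end to end with gradient descent in PyTorch. This is an illustration under assumed names (f_net, log_sig2b, log_sig2e), not the authors' implementation; see the repository linked above for LMMNN itself. Note also that the sketch takes full-batch gradient steps, sidestepping the mini-batch decomposition issue that the paper itself addresses.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the authors' LMMNN code): one clustering random
# effect, y = f(X) + Z b + e, so marginally y ~ N(f(X), V) with
# V = sig2b * Z Z^T + sig2e * I.
torch.manual_seed(0)
n, p, q = 500, 10, 20                    # observations, features, clusters
X = torch.randn(n, p)
cluster = torch.randint(0, q, (n, 1))
Z = torch.zeros(n, q).scatter_(1, cluster, 1.0)  # one-hot random-effects design
y = X @ torch.randn(p) + Z @ (2.0 * torch.randn(q)) + 0.5 * torch.randn(n)

f_net = nn.Sequential(nn.Linear(p, 16), nn.ReLU(), nn.Linear(16, 1))
log_sig2b = torch.zeros(1, requires_grad=True)   # variance components,
log_sig2e = torch.zeros(1, requires_grad=True)   # kept positive via exp

opt = torch.optim.SGD(list(f_net.parameters()) + [log_sig2b, log_sig2e], lr=0.01)

for epoch in range(200):
    opt.zero_grad()
    resid = y - f_net(X).squeeze(-1)
    # V couples all observations that share a cluster, which is why the NLL
    # does not decompose into per-observation terms the way MSE does; this
    # sketch therefore uses full-batch gradient steps.
    V = torch.exp(log_sig2b) * (Z @ Z.T) + torch.exp(log_sig2e) * torch.eye(n)
    nll = 0.5 * (torch.logdet(V) + resid @ torch.linalg.solve(V, resid))
    nll.backward()
    opt.step()
```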

Related research

11/19/2021 · Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits
Stochastic gradient descent (SGD) and its variants have established them...

12/21/2020 · Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent
Deep neural networks (DNNs) have achieved remarkable success in computer...

04/13/2023 · Do deep neural networks have an inbuilt Occam's razor?
The remarkable performance of overparameterized deep neural networks (DN...

08/23/2023 · Multi-Objective Optimization for Sparse Deep Neural Network Training
Different conflicting optimization criteria arise naturally in various D...

07/22/2021 · Selective Pseudo-label Clustering
Deep neural networks (DNNs) offer a means of addressing the challenging ...

05/11/2022 · Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
Advanced deep neural networks (DNNs), designed by either human or AutoML...

11/12/2020 · Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Over the last decade, a single algorithm has changed many facets of our ...
