Robust Generalization of Quadratic Neural Networks via Function Identification

09/22/2021
by   Kan Xu, et al.

A key challenge facing deep learning is that neural networks are often not robust to shifts in the underlying data distribution. We study this problem from the perspective of the statistical concept of parameter identification. Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. In contrast, if we can identify the "true" parameters, then the model generalizes to arbitrary distribution shifts. However, neural networks are typically overparameterized, making parameter identification impossible. We show that for quadratic neural networks, we can identify the function represented by the model even though we cannot identify its parameters. Thus, we can obtain robust generalization bounds even in the overparameterized setting. We leverage this result to obtain new bounds for contextual bandits and transfer learning with quadratic neural networks. Overall, our results suggest that we can improve the robustness of neural networks by designing models that can represent the true data-generating process. In practice, the true data-generating process is often very complex; thus, we study how our framework might connect to neural module networks, which are designed to break down complex tasks into compositions of simpler ones. We prove robust generalization bounds when individual neural modules are identifiable.
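
The gap between identifying parameters and identifying the function can be illustrated with a small numerical sketch. It assumes, purely for illustration, a one-hidden-layer network with squared activations and unit output weights, f(x) = sum_j (w_j . x)^2 = x^T (W^T W) x; the abstract does not spell out the paper's exact model, and the names `quadratic_net`, `W1`, `W2`, and `Q` are hypothetical. Rotating the hidden layer by an orthogonal matrix changes the weights but leaves W^T W, and hence the function, unchanged, even at inputs far from any training distribution.

```python
import numpy as np

def quadratic_net(W, x):
    """Assumed model: f(x) = sum_j (w_j . x)^2, i.e. x^T (W^T W) x."""
    return float(np.sum((W @ x) ** 2))

rng = np.random.default_rng(0)
d, k = 3, 5  # input dimension and hidden width (k > d: overparameterized)

# Two different parameter settings related by an orthogonal rotation Q.
W1 = rng.normal(size=(k, d))
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))  # Q has orthonormal columns
W2 = Q @ W1

# The function depends on the weights only through A = W^T W.
A1 = W1.T @ W1
A2 = W2.T @ W2

# Evaluate at an input drawn far from a standard normal "training" range.
x_shifted = 10.0 * rng.normal(size=d) + 5.0

params_differ = not np.allclose(W1, W2)
same_function = bool(np.allclose(A1, A2))
same_output = bool(
    np.isclose(quadratic_net(W1, x_shifted), quadratic_net(W2, x_shifted))
)
print(params_differ, same_function, same_output)
```

In this toy setting the weights W are not identifiable, but the matrix W^T W is, which is the kind of function-level identification the abstract describes.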
