
Robust Generalization of Quadratic Neural Networks via Function Identification

by Kan Xu, et al.

A key challenge facing deep learning is that neural networks are often not robust to shifts in the underlying data distribution. We study this problem from the perspective of the statistical concept of parameter identification. Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. In contrast, if we can identify the "true" parameters, then the model generalizes to arbitrary distribution shifts. However, neural networks are typically overparameterized, making parameter identification impossible. We show that for quadratic neural networks, we can identify the function represented by the model even though we cannot identify its parameters. Thus, we can obtain robust generalization bounds even in the overparameterized setting. We leverage this result to obtain new bounds for contextual bandits and transfer learning with quadratic neural networks. Overall, our results suggest that we can improve robustness of neural networks by designing models that can represent the true data generating process. In practice, the true data generating process is often very complex; thus, we study how our framework might connect to neural module networks, which are designed to break down complex tasks into compositions of simpler ones. We prove robust generalization bounds when individual neural modules are identifiable.
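The core observation that the function can be identified even when the parameters cannot is easy to see concretely. The sketch below assumes a simple one-hidden-layer quadratic network of the form f(x) = Σ_j (w_jᵀx)² = ‖Wx‖² (an illustrative choice, not necessarily the paper's exact model): since f depends on W only through the Gram matrix WᵀW, left-multiplying W by any orthogonal matrix Q changes the parameters but leaves the function untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 3, 8  # input dimension, hidden width (overparameterized: m > d)

def f(W, x):
    """Quadratic network: f(x) = sum_j (w_j^T x)^2 = ||W x||^2."""
    return np.sum((W @ x) ** 2)

W = rng.standard_normal((m, d))

# Any orthogonal Q gives new parameters Q @ W representing the same
# function, because f depends on W only through W^T W, and
# (QW)^T (QW) = W^T Q^T Q W = W^T W.
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
W2 = Q @ W

x = rng.standard_normal(d)
print(np.allclose(f(W, x), f(W2, x)))  # same function value at every x
print(np.allclose(W, W2))              # but the parameters differ
```

In this toy setting, the identifiable object is WᵀW rather than W itself; recovering it pins down the function everywhere, which is why generalization guarantees can hold under arbitrary covariate shift rather than only near the training distribution.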



