Can neural networks extrapolate? Discussion of a theorem by Pedro Domingos

11/07/2022
by Adrien Courtois, et al.

Neural networks trained on large datasets by minimizing a loss have become the state-of-the-art approach for solving data science problems, particularly in computer vision, image processing and natural language processing. In spite of their striking results, our theoretical understanding of how neural networks operate is limited. In particular, what are the interpolation capabilities of trained neural networks? In this paper we discuss a theorem of Domingos stating that "every machine learned by continuous gradient descent is approximately a kernel machine". According to Domingos, this fact leads to the conclusion that all machines trained on data are mere kernel machines. We first extend Domingos' result to the discrete case and to networks with vector-valued outputs. We then study its relevance and significance on simple examples. We find that in simple cases, the "neural tangent kernel" arising in Domingos' theorem does provide insight into the networks' predictions. Furthermore, when the task given to the network grows in complexity, the interpolation capability of the network can be effectively explained by Domingos' theorem, and is therefore limited. We illustrate this fact on a classic perception theory problem: recovering a shape from its boundary.
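For concreteness, the result under discussion can be stated compactly as follows (notation ours, not the paper's). Under gradient flow $\dot w = -\nabla_w L(w)$ on a loss $L(w) = \sum_i \ell(y_i^*, f_w(x_i))$, the trained model satisfies

$$
f_{w(T)}(x) = f_{w(0)}(x) - \int_0^T \sum_i \frac{\partial \ell}{\partial f}\!\left(y_i^*, f_{w(t)}(x_i)\right) K_{w(t)}(x, x_i)\, dt,
$$

where $K_w(x, x') = \nabla_w f_w(x) \cdot \nabla_w f_w(x')$ is the neural tangent kernel at weights $w$. Averaging $K_{w(t)}$ along the optimization path and collecting the loss derivatives into coefficients $a_i$ yields Domingos' form $f(x) \approx \sum_i a_i K_p(x, x_i) + b$; the approximation arises because the $a_i$ actually depend on $x$.

The discrete case mentioned in the abstract can be checked numerically: after a single gradient-descent step with a small learning rate, the change in $f(x)$ at any test point matches the tangent-kernel expression to first order in the step size. Below is a minimal sketch of that check, assuming a toy one-hidden-layer tanh network and squared loss (both our own choices for illustration, not code from the paper):

```python
# Numerical check (our own sketch) of the single-step, discrete form of the
# theorem: one gradient-descent step changes f(x) by
#   -eta * sum_i l'_i * K(x, x_i) + O(eta^2),
# where K is the neural tangent kernel at the current weights.
import numpy as np

rng = np.random.default_rng(0)
h = 16                                    # hidden width
W, b, v = rng.normal(size=h), rng.normal(size=h), rng.normal(size=h)

def f(x, W, b, v):
    """Scalar-input, scalar-output one-hidden-layer tanh network."""
    return v @ np.tanh(W * x + b)

def grad(x, W, b, v):
    """Gradient of f(x) w.r.t. all parameters, flattened."""
    z = np.tanh(W * x + b)
    s = v * (1.0 - z**2)                  # derivative through tanh
    return np.concatenate([s * x, s, z])  # d/dW, d/db, d/dv

# Toy training set and squared loss l(y*, f) = 0.5 * (f - y*)^2.
xs = np.array([-1.0, 0.0, 1.0])
ys = np.array([0.5, -0.2, 0.3])
resid = np.array([f(x, W, b, v) for x in xs]) - ys  # l' at each sample

# One explicit gradient-descent step on the summed loss.
eta = 1e-3
step = eta * sum(r * grad(x, W, b, v) for r, x in zip(resid, xs))
W2, b2, v2 = W - step[:h], b - step[h:2*h], v - step[2*h:]

# Compare the actual change of f at a test point with the
# tangent-kernel prediction -eta * sum_i l'_i * K(x_test, x_i).
x_test = 0.7
K = lambda x, xp: grad(x, W, b, v) @ grad(xp, W, b, v)
actual = f(x_test, W2, b2, v2) - f(x_test, W, b, v)
predicted = -eta * sum(r * K(x_test, x) for r, x in zip(resid, xs))
print(actual, predicted)  # agree up to O(eta^2)
```

Iterating this one-step identity over an entire training run is what produces the path-averaged kernel $K_p$ in the statement above.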


Related research

- Every Model Learned by Gradient Descent Is Approximately a Kernel Machine (11/30/2020)
- Extending the Universal Approximation Theorem for a Broad Class of Hypercomplex-Valued Neural Networks (09/06/2022)
- An Exact Kernel Equivalence for Finite Classification Models (08/01/2023)
- Training Neural Networks Using Reproducing Kernel Space Interpolation and Model Reduction (08/31/2023)
- Kernel interpolation generalizes poorly (03/28/2023)
- A Generalized Representer Theorem for Hilbert Space-Valued Functions (09/19/2018)
- Automating the Design and Development of Gradient Descent Trained Expert System Networks (07/04/2022)
