Kernel Flows: from learning kernels from data into the abyss

08/13/2018
by   Houman Owhadi, et al.

Learning can be seen as approximating an unknown function by interpolating the training data. Kriging offers a solution to this problem based on the prior specification of a kernel. We explore a numerical-approximation approach to kernel selection/construction based on the simple premise that a kernel must be good if the number of interpolation points can be halved without significant loss in accuracy (measured using the intrinsic RKHS norm ‖·‖ associated with the kernel). We first test and motivate this idea on a simple problem: recovering the Green's function of an elliptic PDE (with inhomogeneous coefficients) from sparse observations of one of its solutions. Next, we consider the problem of learning non-parametric families of deep kernels of the form K_1(F_n(x), F_n(x')), with F_{n+1} = (I_d + ϵ G_{n+1}) ∘ F_n and G_{n+1} ∈ span{K_1(F_n(x_i), ·)}. With the proposed approach, constructing the kernel becomes equivalent to integrating a stochastic, data-driven dynamical system, which allows for the training of very deep (bottomless) networks and the exploration of their properties. These networks learn by constructing flow maps in the kernel and input spaces via incremental data-dependent deformations/perturbations (appearing as the cooperative counterpart of adversarial examples), and, at profound depths, they (1) can achieve accurate classification from only one data point per class; (2) appear to learn archetypes of each class; (3) expand distances between points in different classes and contract distances between points in the same class. For kernels parameterized by the weights of a Convolutional Neural Network, minimizing the approximation errors incurred by halving random subsets of interpolation points appears to outperform training the same CNN architecture with relative entropy and dropout.
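The halving criterion described above can be sketched as a scalar loss: interpolate the data with the full point set and with a random half, and measure the relative loss of RKHS energy. Below is a minimal NumPy sketch under stated assumptions: the kernel is a Gaussian RBF (the abstract's kernels are more general), the function and variable names are my own, and the small jitter term is added purely for numerical stability.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma):
    # Pairwise squared Euclidean distances, then the RBF kernel matrix.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * d2)

def kf_rho(X, y, gamma, rng):
    """Halving loss rho = 1 - ||interpolant on half||^2 / ||interpolant on all||^2,
    both squared norms taken in the RKHS of the kernel.  A small rho means the
    kernel loses little accuracy when half the interpolation points are dropped."""
    n = len(X)
    half = rng.choice(n, n // 2, replace=False)  # random half of the points
    K = gaussian_kernel(X, X, gamma)
    Ks = K[np.ix_(half, half)]
    # ||u||_K^2 = y^T K^{-1} y for the minimal-norm interpolant u of the data.
    full_energy = y @ np.linalg.solve(K + 1e-8 * np.eye(n), y)
    half_energy = y[half] @ np.linalg.solve(Ks + 1e-8 * np.eye(n // 2), y[half])
    return 1.0 - half_energy / full_energy
```

In practice one would average this loss over many random halvings and minimize it over the kernel's parameters (here, gamma). Since the minimal-norm interpolant of a subset of the data can never have larger RKHS norm than the interpolant of the full set, rho stays in [0, 1] up to numerical jitter.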


