Fast Neural Kernel Embeddings for General Activations

09/09/2022
by Insu Han, et al.

The infinite-width limit has shed light on generalization and optimization aspects of deep learning by establishing connections between neural networks and kernel methods. Despite their importance, the utility of these kernel methods has been limited in large-scale learning settings due to their (super-)quadratic runtime and memory complexities. Moreover, most prior works on neural kernels have focused on the ReLU activation, mainly due to its popularity but also due to the difficulty of computing such kernels for general activations. In this work, we overcome these difficulties by providing methods to work with general activations. First, we compile and expand the list of activation functions admitting exact dual activation expressions to compute neural kernels. When exact computation is unknown, we present methods to approximate them effectively. We propose a fast sketching method that approximates any multi-layered Neural Network Gaussian Process (NNGP) kernel and Neural Tangent Kernel (NTK) matrices for a wide range of activation functions, going beyond the commonly analyzed ReLU activation. This is done by showing how to approximate the neural kernels using the truncated Hermite expansion of any desired activation function. While most prior works require data points on the unit sphere, our methods do not suffer from such limitations and are applicable to any dataset of points in ℝ^d. Furthermore, we provide a subspace embedding for NNGP and NTK matrices with near input-sparsity runtime and near-optimal target dimension that applies to any homogeneous dual activation function with a rapidly convergent Taylor expansion. Empirically, with respect to exact convolutional NTK (CNTK) computation, our method achieves a 106× speedup for approximate CNTK of a 5-layer Myrtle network on the CIFAR-10 dataset.
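As a rough illustration of the Hermite-expansion idea described in the abstract (a minimal sketch, not the paper's implementation), the snippet below estimates the normalized Hermite coefficients of an activation by Gauss-Hermite quadrature and uses their squares to approximate the corresponding dual activation for unit-variance inputs. The helper names `hermite_coefficients` and `dual_activation` are hypothetical and introduced here only for exposition.

```python
# Sketch: approximate a dual activation from a truncated Hermite expansion.
# If sigma(x) = sum_i a_i * He_i(x) / sqrt(i!) under the standard Gaussian
# measure, then E[sigma(u) sigma(v)] = sum_i a_i^2 * rho^i for standard
# Gaussians (u, v) with correlation rho (a standard fact; assumes unit variance).
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_coefficients(activation, degree, num_nodes=100):
    """Estimate normalized Hermite coefficients a_0..a_degree of `activation`."""
    nodes, weights = hermegauss(num_nodes)    # probabilists' Gauss-Hermite rule
    weights = weights / np.sqrt(2.0 * np.pi)  # rescale to an N(0, 1) expectation
    coeffs = []
    for i in range(degree + 1):
        basis = np.zeros(i + 1)
        basis[i] = 1.0                        # select He_i
        h_i = hermeval(nodes, basis) / math.sqrt(math.factorial(i))
        coeffs.append(np.sum(weights * activation(nodes) * h_i))
    return np.array(coeffs)


def dual_activation(rho, coeffs):
    """Truncated dual activation: sum_i a_i^2 * rho^i for correlation(s) rho."""
    rho = np.asarray(rho, dtype=float)
    powers = np.stack([rho ** i for i in range(len(coeffs))])
    return np.tensordot(coeffs ** 2, powers, axes=1)


# Example: a degree-8 truncation of the GELU dual activation.
gelu = lambda x: 0.5 * x * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))
a = hermite_coefficients(gelu, degree=8)
print(dual_activation(np.linspace(-1.0, 1.0, 5), a))
```

The resulting low-degree polynomial in the correlation can then be composed layer by layer to approximate multi-layer NNGP/NTK kernels, which is the role the truncated expansion plays in the sketching approach outlined above.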


