
Provably Strict Generalisation Benefit for Invariance in Kernel Methods

by Bryn Elesedy, et al.

It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi arXiv:2102.10333 to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own right.
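To make the feature-averaging idea concrete, here is a minimal, illustrative sketch in NumPy. The specific choices — an RBF kernel, the two-element sign-flip group G = {x, −x}, and the toy invariant target cos(‖x‖) — are assumptions for illustration, not from the paper. Feature averaging induces the group-averaged kernel K̄(x, z) = (1/|G|²) Σ over g, g′ of k(g·x, g′·z), and kernel ridge regression with K̄ produces predictors that are invariant to the group action by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(X, Z, gamma=1.0):
    # Gaussian (RBF) kernel matrix between rows of X and rows of Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Illustrative compact group: sign flips, G = {identity, negation}.
group = [lambda X: X, lambda X: -X]

def averaged_kernel(X, Z, gamma=1.0):
    # Group-averaged kernel induced by feature averaging:
    # K̄(x, z) = (1/|G|^2) * sum over g, g' of k(g·x, g'·z).
    K = np.zeros((len(X), len(Z)))
    for g in group:
        for h in group:
            K = K + rbf(g(X), h(Z), gamma)
    return K / len(group) ** 2

# Toy target invariant to sign flips: f*(x) = cos(||x||).
X = rng.normal(size=(60, 2))
y = np.cos(np.linalg.norm(X, axis=1)) + 0.05 * rng.normal(size=60)
Xte = rng.normal(size=(200, 2))
yte = np.cos(np.linalg.norm(Xte, axis=1))

def krr_mse(kfn, lam=1e-2):
    # Kernel ridge regression: solve (K + lam*I) alpha = y, then predict.
    K = kfn(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    pred = kfn(Xte, X) @ alpha
    return np.mean((pred - yte) ** 2)

mse_plain = krr_mse(rbf)
mse_avg = krr_mse(averaged_kernel)
print(f"test MSE, plain kernel:    {mse_plain:.4f}")
print(f"test MSE, averaged kernel: {mse_avg:.4f}")
```

Because the averaged kernel satisfies K̄(g·x, z) = K̄(x, z) for every group element g, any function in its RKHS is automatically invariant; the paper's result quantifies how much this restriction can help when the target itself is invariant.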




Provably Strict Generalisation Benefit for Equivariant Models

It is widely believed that engineering a model to be invariant/equivaria...

Learning with Group Invariant Features: A Kernel Perspective

We analyze in this paper a random feature map based on a theory of invar...

Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations

In this paper, we study deep signal representations that are invariant t...

The Exact Sample Complexity Gain from Invariances for Kernel Regression on Manifolds

In practice, encoding invariances into models helps sample complexity. I...

Measuring dissimilarity with diffeomorphism invariance

Measures of similarity (or dissimilarity) are a key ingredient to many m...

Learning with invariances in random features and kernel models

A number of machine learning tasks entail a high degree of invariance: t...

On the Sample Complexity of Learning with Geometric Stability

Many supervised learning problems involve high-dimensional data such as ...