DeepAI AI Chat
Log In Sign Up

Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

by   Carles Domingo Enrich, et al.

Several works in implicit and explicit generative modeling empirically observed that feature-learning discriminators outperform fixed-kernel discriminators in terms of the sample quality of the models. We provide separation results between probability metrics with fixed-kernel and feature-learning discriminators using the function classes ℱ_2 and ℱ_1 respectively, which were developed to study overparametrized two-layer neural networks. In particular, we construct pairs of distributions over hyper-spheres that can not be discriminated by fixed kernel (ℱ_2) integral probability metric (IPM) and Stein discrepancy (SD) in high dimensions, but that can be discriminated by their feature learning (ℱ_1) counterparts. To further study the separation we provide links between the ℱ_1 and ℱ_2 IPMs with sliced Wasserstein distances. Our work suggests that fixed-kernel discriminators perform worse than their feature learning counterparts because their corresponding metrics are weaker.


page 1

page 2

page 3

page 4


A Theory of Feature Learning

Feature Learning aims to extract relevant information contained in data ...

Feature learning in neural networks and kernel machines that recursively learn features

Neural networks have achieved impressive results on many technological a...

The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes

For small training set sizes P, the generalization error of wide neural ...

Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks

We analyze the learning dynamics of infinitely wide neural networks with...

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks

We analyze feature learning in infinite width neural networks trained wi...

Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators

We construct pairs of distributions μ_d, ν_d on ℝ^d such that the quanti...

Unsupervised Learning of Spatiotemporally Coherent Metrics

Current state-of-the-art classification and detection algorithms rely on...