Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks

06/14/2022
by Andrea Montanari, et al.

Given a cloud of n data points in ℝ^d, consider all projections onto m-dimensional subspaces of ℝ^d and, for each such projection, the empirical distribution of the projected points. What does this collection of probability distributions look like when n, d grow large? We consider this question under the null model in which the points are i.i.d. standard Gaussian vectors, focusing on the asymptotic regime in which n, d → ∞ with n/d → α ∈ (0, ∞), while m is fixed. Denoting by ℱ_{m,α} the set of probability distributions in ℝ^m that arise as low-dimensional projections in this limit, we establish new inner and outer bounds on ℱ_{m,α}. In particular, we characterize the Wasserstein radius of ℱ_{m,α} up to logarithmic factors and determine it exactly for m = 1. We also prove sharp bounds in terms of Kullback-Leibler divergence and Rényi information dimension. This question has applications to unsupervised learning methods such as projection pursuit and independent component analysis. We introduce a version of the same problem that is relevant for supervised learning, and prove a sharp Wasserstein radius bound. As an application, we establish an upper bound on the interpolation threshold of two-layer neural networks with m hidden neurons.
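The phenomenon behind the abstract can be illustrated numerically: even when the data are pure Gaussian noise, a search over one-dimensional projections (m = 1) at a moderate aspect ratio α = n/d can find directions whose projected empirical distribution looks markedly non-Gaussian. The sketch below is not taken from the paper; it is a minimal Python illustration in which the dimensions, the number of random directions searched, and the Wasserstein-1 comparison against Gaussian quantiles are all hypothetical choices made for the demo.

```python
# Minimal sketch (illustrative only, not the paper's method): projection pursuit on
# a pure-noise cloud. With n/d = alpha moderate, maximizing the W1 distance of the
# projected empirical law from N(0, 1) over many random directions already yields a
# projection that looks far from Gaussian.
import numpy as np
from scipy.stats import wasserstein_distance, norm

rng = np.random.default_rng(0)
d, alpha = 200, 2.0                 # ambient dimension and aspect ratio n/d (hypothetical values)
n = int(alpha * d)                  # number of data points
X = rng.standard_normal((n, d))     # null model: i.i.d. standard Gaussian vectors

# Reference sample: standard Gaussian quantiles, i.e. what a "typical" projection looks like.
ref = norm.ppf((np.arange(1, n + 1) - 0.5) / n)

def projected_w1(u):
    """Wasserstein-1 distance between the empirical law of X @ (u/|u|) and N(0, 1)."""
    u = u / np.linalg.norm(u)
    return wasserstein_distance(X @ u, ref)

# Crude projection pursuit: keep the most non-Gaussian of many random directions.
best = max((rng.standard_normal(d) for _ in range(2000)), key=projected_w1)
print("typical random direction:", projected_w1(rng.standard_normal(d)))
print("pursued direction       :", projected_w1(best))
```

Under this toy search, the pursued direction has a noticeably larger Wasserstein distance from the standard Gaussian than a typical direction, which is exactly the kind of spurious structure that the Wasserstein-radius bounds on ℱ_{m,α} quantify.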


Related research

Wasserstein Projection Pursuit of Non-Gaussian Signals (02/24/2023)
We consider the general dimensionality reduction problem of locating in ...

Two-sample Test using Projected Wasserstein Distance: Breaking the Curse of Dimensionality (10/22/2020)
We develop a projected Wasserstein distance for the two-sample test, a f...

Inference for Projection-Based Wasserstein Distances on Finite Spaces (02/11/2022)
The Wasserstein distance is a distance between two probability distribut...

Rigorous Restricted Isometry Property of Low-Dimensional Subspaces (01/30/2018)
Dimensionality reduction is in demand to reduce the complexity of solvin...

Visual Diagnostics for Constrained Optimisation with Application to Guided Tours (04/08/2021)
A guided tour helps to visualise high-dimensional data by showing low-di...

Hole or grain? A Section Pursuit Index for Finding Hidden Structure in Multiple Dimensions (04/28/2020)
Multivariate data is often visualized using linear projections, produced...

Refining Invariant Coordinate Selection via Local Projection Pursuit (12/22/2021)
Invariant coordinate selection (ICS), introduced by Tyler et al. (2009,...
