Learning Polynomials of Few Relevant Dimensions

04/28/2020
by   Sitan Chen, et al.

Polynomial regression is a basic primitive in learning and statistics. In its most basic form the goal is to fit a degree d polynomial to a response variable y in terms of an n-dimensional input vector x. This is extremely well-studied with many applications and has sample and runtime complexity Θ(n^d). Can one achieve better runtime if the intrinsic dimension of the data is much smaller than the ambient dimension n? Concretely, we are given samples (x,y) where y is a degree at most d polynomial in an unknown r-dimensional projection (the relevant dimensions) of x. This can be seen both as a generalization of phase retrieval and as a special case of learning multi-index models where the link function is an unknown low-degree polynomial. Note that without distributional assumptions, this is at least as hard as junta learning. In this work we consider the important case where the covariates are Gaussian. We give an algorithm that learns the polynomial within accuracy ϵ with sample complexity that is roughly N = O_r,d(n log^2(1/ϵ) (log n)^d) and runtime O_r,d(N n^2). Prior to our work, no such results were known even for the case of r=1. We introduce a new filtered PCA approach to get a warm start for the true subspace and use geodesic SGD to boost to arbitrary accuracy; our techniques may be of independent interest, especially for problems dealing with subspace recovery or analyzing SGD on manifolds.
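To make the setup concrete, here is a minimal sketch of the data model: Gaussian covariates x, and a response y that is a low-degree polynomial of an r-dimensional projection of x. The subspace estimate in it is a plain second-moment (Stein-type) estimator, not the filtered PCA or geodesic SGD procedure the abstract refers to; it only works when the expected Hessian of the polynomial has full rank r. The function names, the specific quadratic, and the values of n, r and the sample count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_model(n, r, num_samples, rng):
    """Draw (x, y) pairs with y = p(U^T x), x ~ N(0, I_n). Hypothetical helper."""
    # Random orthonormal basis U (n x r) for the hidden relevant subspace.
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    X = rng.standard_normal((num_samples, n))  # Gaussian covariates
    Z = X @ U                                  # r-dimensional projection
    # Assumed example polynomial of degree 2 in the projected coordinates.
    y = Z[:, 0] ** 2 + 2 * Z[:, 1] ** 2 + Z[:, 0] * Z[:, 1]
    return X, y, U

def second_moment_subspace(X, y, r):
    """Estimate the subspace from the empirical version of E[y (x x^T - I)].

    By Stein's identity this matrix equals U E[Hessian of p] U^T, so its
    r eigenvectors of largest absolute eigenvalue span the relevant subspace
    whenever the expected Hessian is nondegenerate.
    """
    num_samples, n = X.shape
    M = (X * y[:, None]).T @ X / num_samples - y.mean() * np.eye(n)
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(eigvals))[-r:]
    return eigvecs[:, top]

def subspace_error(U, U_hat):
    """Spectral norm of the difference between the two orthogonal projectors."""
    return np.linalg.norm(U @ U.T - U_hat @ U_hat.T, 2)

rng = np.random.default_rng(0)
X, y, U = sample_model(n=20, r=2, num_samples=50_000, rng=rng)
U_hat = second_moment_subspace(X, y, r=2)
err = subspace_error(U, U_hat)
print(f"subspace error: {err:.3f}")
```

A polynomial whose expected Hessian vanishes (for example, a pure cubic in one direction) defeats this simple estimator, which is one reason a more careful warm start is needed in general.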


research
09/29/2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD

We study the problem of training a two-layer neural network (NN) of arbi...
research
05/18/2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

We focus on the task of learning a single index model σ(w^⋆· x) with res...
research
10/27/2022

Learning Single-Index Models with Shallow Neural Networks

Single-index models are a class of functions given by an unknown univari...
research
11/13/2022

Near-Linear Sample Complexity for L_p Polynomial Regression

We study L_p polynomial regression. Given query access to a function f:[...
research
07/13/2018

Non-Gaussian Component Analysis using Entropy Methods

Non-Gaussian component analysis (NGCA) is a problem in multidimensional ...
research
04/04/2017

Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods

The problem of Non-Gaussian Component Analysis (NGCA) is about finding a...
research
02/23/2020

Conditional regression for single-index models

The single-index model is a statistical model for intrinsic regression w...
