Gradient-Based Feature Learning under Structured Data

09/07/2023
by Alireza Mousavi Hosseini, et al.

Recent works have demonstrated that the sample complexity of gradient-based learning of single index models, i.e., functions that depend on a one-dimensional projection of the input data, is governed by their information exponent. However, these results are only concerned with isotropic data, while in practice the input often contains additional structure that can implicitly guide the algorithm. In this work, we investigate the effect of a spiked covariance structure and reveal several interesting phenomena. First, we show that in the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction, even when the spike is perfectly aligned with the target direction. Next, we show that appropriate weight normalization, reminiscent of batch normalization, can alleviate this issue. Further, by exploiting the alignment between the (spiked) input covariance and the target, we obtain improved sample complexity compared to the isotropic case. In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent while also outperforming lower bounds for rotationally invariant kernel methods.
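To make the setting concrete, here is a minimal NumPy sketch of data drawn from a spiked covariance model with a single index target. It is an illustration under stated assumptions, not the paper's construction: the covariance form `I + kappa * theta theta^T`, the quadratic link, the dimensions, and the function name `sample_spiked_single_index` are all hypothetical choices made here. The sketch only checks that, when the spike is aligned with the target direction, the top eigenvector of the empirical covariance already points along that direction, which is the kind of structure the abstract says can guide learning.

```python
import numpy as np


def sample_spiked_single_index(n, d, kappa, theta, u, link, rng):
    """Draw x ~ N(0, I + kappa * theta theta^T) and labels y = link(<u, x>).

    Assumed (hypothetical) parameterization; theta and u must be unit vectors.
    """
    z = rng.standard_normal((n, d))
    # (I + c * theta theta^T) z has covariance I + (2c + c^2) theta theta^T,
    # so c = sqrt(1 + kappa) - 1 gives a spike of size kappa.
    c = np.sqrt(1.0 + kappa) - 1.0
    x = z + c * (z @ theta)[:, None] * theta[None, :]
    y = link(x @ u)
    return x, y


rng = np.random.default_rng(0)
d, n, kappa = 50, 20000, 10.0

theta = np.zeros(d)
theta[0] = 1.0          # spike direction
u = theta.copy()        # target direction, perfectly aligned with the spike


def link(t):
    # Quadratic link chosen purely for illustration (an assumption).
    return t ** 2 - 1.0


x, y = sample_spiked_single_index(n, d, kappa, theta, u, link, rng)

# Top eigenvector of the empirical covariance (eigh sorts eigenvalues ascending).
emp_cov = x.T @ x / n
eigvals, eigvecs = np.linalg.eigh(emp_cov)
top = eigvecs[:, -1]
alignment = abs(top @ u)

print(f"top eigenvalue: {eigvals[-1]:.2f} (population value 1 + kappa = {1 + kappa})")
print(f"|<top eigenvector, u>| = {alignment:.3f}")
```

Setting `u` to a direction orthogonal to `theta` in the same script gives a quick way to see the misaligned case, where the covariance carries no information about the target.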

research
07/28/2023

On Single Index Models beyond Gaussian Data

Sparse high-dimensional functions have arisen as a rich framework to stu...
research
05/18/2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

We focus on the task of learning a single index model σ(w^⋆· x) with res...
research
03/22/2022

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle

Q-learning with function approximation could diverge in the off-policy s...
research
02/01/2023

Sample Complexity of Kernel-Based Q-Learning

Modern reinforcement learning (RL) often faces an enormous state-action ...
research
06/28/2013

Memory Limited, Streaming PCA

We consider streaming, one-pass principal component analysis (PCA), in t...
research
10/27/2022

Learning Single-Index Models with Shallow Neural Networks

Single-index models are a class of functions given by an unknown univari...
research
02/13/2019

Learning Ising Models with Independent Failures

We give the first efficient algorithm for learning the structure of an I...
