Online Lewis Weight Sampling
The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating d-dimensional subspaces of ℓ_p up to (1+ϵ) error. Several works have extended this important primitive to other settings, including the online coreset, sliding window, and adversarial streaming models. However, these results hold only for p∈{1,2}, and the results for p=1 require a suboptimal Õ(d^2/ϵ^2) samples. In this work, we design the first nearly optimal ℓ_p subspace embeddings for all p∈(0,∞) in the online coreset, sliding window, and adversarial streaming models. In all three models, our algorithms store Õ(d^{max(1,p/2)}/ϵ^2) rows. This answers a substantial generalization of the main open question of [BDMMUWZ2020], and gives the first results for all p∉{1,2}. Towards our result, we give the first analysis of "one-shot" Lewis weight sampling, in which rows are sampled proportionally to their Lewis weights, with sample complexity Õ(d^{p/2}/ϵ^2) for p>2. Previously, this scheme was only known to have sample complexity Õ(d^{p/2}/ϵ^5), whereas Õ(d^{p/2}/ϵ^2) is known if a more sophisticated recursive sampling scheme is used. The recursive sampling cannot be implemented online, thus necessitating an analysis of one-shot Lewis weight sampling. Our analysis uses a novel connection to online numerical linear algebra. As an application, we obtain the first one-pass streaming coreset algorithms for (1+ϵ) approximation of important generalized linear models, such as logistic regression and p-probit regression. Our upper bounds are parameterized by a complexity parameter μ introduced by [MSSW2018], and we give the first lower bounds showing that a linear dependence on μ is necessary.
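To make the "one-shot" scheme concrete, the following is a minimal offline sketch in Python, not the online algorithm of this work. It assumes the standard definition of ℓ_p Lewis weights as the fixed point w_i = (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2}, approximates them with the fixed-point iteration of Cohen and Peng (which is only known to converge for p<4), and then keeps each row independently with probability proportional to its weight. The function names `lewis_weights` and `one_shot_sample`, the oversampling parameter `m`, and the iteration count are illustrative assumptions rather than anything specified in the abstract.

```python
import numpy as np

def lewis_weights(A, p, iters=100):
    """Approximate l_p Lewis weights of the rows of A.

    Assumes 0 < p < 4, full column rank, and no zero rows.
    Uses the fixed-point iteration w_i <- (a_i^T M^{-1} a_i)^{p/2},
    where M = A^T W^{1-2/p} A.
    """
    n, d = A.shape
    w = np.full(n, d / n)  # start from uniform weights summing to d
    for _ in range(iters):
        # M = A^T W^{1 - 2/p} A
        M = A.T @ (A * w[:, None] ** (1.0 - 2.0 / p))
        # tau_i = a_i^T M^{-1} a_i for every row i
        tau = np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)
        w = tau ** (p / 2.0)
    return w

def one_shot_sample(A, p, m, rng):
    """Keep row i independently with probability q_i = min(1, m * w_i / sum(w)).

    Each kept row is rescaled by q_i^{-1/p}, so the expected value of
    ||S x||_p^p equals ||A x||_p^p for every x. The parameter m is a
    hypothetical oversampling budget (expected number of kept rows is at most m).
    """
    w = lewis_weights(A, p)
    q = np.minimum(1.0, m * w / w.sum())
    keep = rng.random(A.shape[0]) < q
    return A[keep] * (q[keep] ** (-1.0 / p))[:, None]
```

For p=2 the weights reduce to ordinary leverage scores, and at the fixed point the weights sum to d for any p, which gives a quick sanity check on the iteration. The sample sizes needed for a (1+ϵ) guarantee are the subject of the analysis above and are not enforced by this sketch.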