A Deterministic Streaming Sketch for Ridge Regression
We provide a deterministic space-efficient algorithm for estimating ridge regression. For n data points with d features and a large enough regularization parameter, we provide a solution within ε L_2 error using only O(d/ε) space. This is the first o(d^2) space algorithm for this classic problem. The algorithm sketches the covariance matrix by variants of Frequent Directions, which implies it can operate in insertion-only streams and a variety of distributed data settings. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.
READ FULL TEXT