Dimensionality Reduction for Tukey Regression

05/14/2019
by Kenneth L. Clarkson, et al.

We give the first dimensionality reduction methods for the overconstrained Tukey regression problem. The Tukey loss function ‖y‖_M = ∑_i M(y_i) has M(y_i) ≈ |y_i|^p for residual errors y_i smaller in magnitude than a prescribed threshold τ, but M(y_i) becomes constant for errors |y_i| > τ. Our results depend on a new structural result, proven constructively, showing that for any d-dimensional subspace L ⊂ R^n, there is a fixed bounded-size subset of coordinates containing, for every y ∈ L, all the large coordinates of y with respect to the Tukey loss function. Our methods reduce a given Tukey regression problem to a smaller weighted version, whose solution is a provably good approximate solution to the original problem. Our reductions are fast, simple, and easy to implement, and we give empirical results demonstrating their practicality, using existing heuristic solvers for the small versions. We also give exponential-time algorithms that produce provably good solutions, and hardness results suggesting that a significant speedup in the worst case is unlikely.
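As a rough illustration of the loss described above, the following Python sketch evaluates a capped power loss on regression residuals. It assumes the simplest form consistent with the abstract, M(t) = |t|^p for |t| ≤ τ and M(t) = τ^p otherwise; the abstract only says M(y_i) ≈ |y_i|^p below the threshold, so the exact definition in the paper may differ. The names `tukey_m`, `tukey_loss`, and `residuals`, and the default values of `tau` and `p`, are hypothetical and chosen only for this example.

```python
def tukey_m(t, tau=1.0, p=2.0):
    """Assumed per-coordinate loss: |t|^p up to the threshold tau, constant tau^p beyond it."""
    return min(abs(t), tau) ** p


def tukey_loss(y, tau=1.0, p=2.0):
    """Sum of M(y_i) over all residuals y_i."""
    return sum(tukey_m(t, tau, p) for t in y)


def residuals(A, x, b):
    """Residual vector Ax - b for a dense matrix A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]


# Overconstrained toy problem: four equations, one unknown, one gross outlier.
A = [[1.0], [1.0], [1.0], [1.0]]
b = [1.0, 1.1, 0.9, 100.0]
x = [1.0]

r = residuals(A, x, b)
capped = tukey_loss(r, tau=1.0, p=2.0)   # the outlier contributes only tau^p
squared = sum(t * t for t in r)          # the outlier dominates the plain squared loss
print(capped, squared)
```

In this toy instance the outlier's contribution is capped at τ^p, so x = 1 keeps a small capped loss even though the ordinary squared loss is dominated by the single bad row. This shows only the loss itself, not the dimensionality reduction methods the abstract refers to.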

10/18/2021

Dimensionality Reduction for Wasserstein Barycenter

The Wasserstein barycenter is a geometric construct which captures the n...
04/09/2020

TensorProjection Layer: A Tensor-Based Dimensionality Reduction Method in CNN

In this paper, we propose a dimensionality reduction method applied to t...
05/01/2019

Coordinatizing Data With Lens Spaces and Persistent Cohomology

We introduce here a framework to construct coordinates in finite Lens sp...
06/18/2020

Precise expressions for random projections: Low-rank approximation and randomized Newton

It is often desirable to reduce the dimensionality of a large dataset by...
09/25/2018

Sparse Circular Coordinates via Principal Z-Bundles

We present in this paper an application of the theory of principal bundl...
06/09/2020

Faster PAC Learning and Smaller Coresets via Smoothed Analysis

PAC-learning usually aims to compute a small subset (ε-sample/net) from ...
10/10/2019

Efficient Sketching Algorithm for Sparse Binary Data

Recent advancement of the WWW, IOT, social network, e-commerce, etc. hav...