Robust Coreset for Continuous-and-Bounded Learning (with Outliers)

06/30/2021
by Zixiu Wang, et al.

In the big data era, many machine learning tasks must handle large-scale data. A common approach is to build a small summary, e.g., a coreset, that efficiently represents the original input. However, real-world datasets usually contain outliers, and most existing coreset construction methods are not resilient to them (in particular, an adversarial attacker can place the outliers arbitrarily in the space). In this paper, we propose a novel robust coreset method for the continuous-and-bounded learning problem (with outliers), which covers a broad range of popular optimization objectives in machine learning, such as logistic regression and k-means clustering. Moreover, our robust coreset can be efficiently maintained in a fully dynamic environment. To the best of our knowledge, this is the first robust and fully dynamic coreset construction method for these optimization problems. We also conduct experiments to evaluate the effectiveness of our robust coreset in practice.
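To make the notion of a coreset for an objective with outliers concrete, the following is a minimal sketch and not the construction proposed in the paper. It assumes a k-means-style cost with a single fixed center, an outlier budget `z` (total weight of the farthest points that may be discarded), and plain uniform sampling as a stand-in summary. The names `trimmed_cost` and `uniform_coreset` are hypothetical and chosen only for illustration.

```python
import random


def trimmed_cost(points, weights, center, z):
    """Weighted sum of squared distances to `center`, after discarding
    the farthest points up to a total weight of `z` (the outlier budget)."""
    # Pair each squared distance with its weight, farthest first,
    # so that the budget is spent on the most distant points.
    items = sorted(
        (
            (sum((p - c) ** 2 for p, c in zip(pt, center)), w)
            for pt, w in zip(points, weights)
        ),
        reverse=True,
    )
    budget = z
    total = 0.0
    for dist2, w in items:
        dropped = min(w, budget)  # a weighted point may be only partly discarded
        budget -= dropped
        total += (w - dropped) * dist2
    return total


def uniform_coreset(points, m, rng):
    """Sample m points uniformly without replacement; each sampled point
    carries weight n/m so the total weight still equals n."""
    n = len(points)
    sample = rng.sample(points, m)
    return sample, [n / m] * m


# Synthetic data (an assumption for this demo): 950 inliers near the origin
# and 50 far-away outliers.
rng = random.Random(0)
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(950)]
outliers = [(rng.uniform(50, 100), rng.uniform(50, 100)) for _ in range(50)]
data = inliers + outliers

center = (0.0, 0.0)
full = trimmed_cost(data, [1.0] * len(data), center, z=50)
core, core_weights = uniform_coreset(data, 100, rng)
approx = trimmed_cost(core, core_weights, center, z=50)
print(f"full trimmed cost: {full:.1f}, coreset estimate: {approx:.1f}")
```

Uniform sampling is used here only because it is simple. If the sample happens to contain more outlier weight than the budget `z`, the estimate can be far from the true trimmed cost, which is the kind of fragility that motivates a dedicated robust coreset construction.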


