Robust Regression via Online Feature Selection under Adversarial Data Corruption

02/05/2019
by   Xuchao Zhang, et al.

The presence of data corruption in user-generated streaming data, such as social media, motivates a new fundamental problem: learning reliable regression coefficients when the features are not all accessible at one time. Until now, several important challenges could not be handled concurrently: 1) estimating corrupted data when only partial features are accessible; 2) online feature selection when the data contains adversarial corruption; and 3) scaling to massive datasets. This paper proposes a novel RObust regression algorithm via Online Feature Selection (RoOFS) that concurrently addresses all of the above challenges. Specifically, the algorithm iteratively updates the regression coefficients and the uncorrupted set via a robust online feature substitution method. We also prove that our algorithm has a restricted error bound compared to the optimal solution. Extensive experiments on both synthetic and real-world datasets demonstrate that our method outperforms existing methods in the recovery of both the selected features and the regression coefficients, with very competitive efficiency.
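The alternation described in the abstract, between refitting the coefficients and re-estimating the uncorrupted set, can be illustrated with a generic hard-thresholding scheme. The sketch below is not the RoOFS algorithm: it assumes all features are available at once, it does not model online feature selection or feature substitution, and it assumes the number of corrupted samples is known. The function name `robust_least_squares` and its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np

def robust_least_squares(X, y, n_corrupted, max_iter=50):
    """Generic alternating hard-thresholding sketch (NOT the RoOFS algorithm).

    Assumptions made for illustration only: every feature is available
    up front, and n_corrupted (an upper bound on the number of corrupted
    samples) is given.
    """
    n = X.shape[0]
    keep = n - n_corrupted            # size of the estimated uncorrupted set
    clean = np.arange(n)              # start by trusting every sample
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        # Step 1: refit coefficients on the current uncorrupted-set estimate.
        beta, *_ = np.linalg.lstsq(X[clean], y[clean], rcond=None)
        # Step 2: re-estimate the uncorrupted set as the samples with the
        # smallest absolute residuals under the new coefficients.
        residuals = np.abs(y - X @ beta)
        new_clean = np.sort(np.argsort(residuals)[:keep])
        if np.array_equal(new_clean, clean):
            break                     # the set stopped changing
        clean = new_clean
    return beta, clean
```

Calling `robust_least_squares(X, y, n_corrupted=k)` returns the fitted coefficients together with the sorted indices of the samples treated as uncorrupted. Setting `n_corrupted` too low leaves corrupted samples in the fit, while setting it too high discards clean samples.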


research
10/02/2017

Online and Distributed Robust Regressions under Adversarial Data Corruption

In today's era of big data, robust least-squares regression becomes a mo...
research
06/17/2015

Feature Selection for Ridge Regression with Provable Guarantees

We introduce single-set spectral sparsification as a deterministic sampl...
research
05/30/2009

A Minimum Description Length Approach to Multitask Feature Selection

Many regression problems involve not one but several response variables ...
research
10/19/2012

A Distance-Based Branch and Bound Feature Selection Algorithm

There is no known efficient method for selecting k Gaussian features fro...
research
04/28/2023

Online Platt Scaling with Calibeating

We present an online post-hoc calibration method, called Online Platt Sc...
research
12/15/2021

Online Feature Selection for Efficient Learning in Networked Systems

Current AI/ML methods for data-driven engineering use models that are mo...
research
10/02/2019

Geometric Online Adaptation: Graph-Based OSFS for Streaming Samples

Feature selection seeks a curated subset of available features such that...
