Systematic Testing of the Data-Poisoning Robustness of KNN

07/17/2023
by Yannan Li, et al.

Data poisoning aims to compromise a machine-learning-based software component by contaminating its training set so that its prediction results for test inputs change. Existing methods for deciding data-poisoning robustness are either inaccurate or slow and, more importantly, they can only certify some of the truly robust cases; when certification fails, they remain inconclusive. In other words, they cannot falsify the truly non-robust cases. To overcome this limitation, we propose a systematic-testing-based method that can falsify as well as certify data-poisoning robustness for a widely used supervised-learning technique, k-nearest neighbors (KNN). Our method is faster and more accurate than the baseline enumeration method because it combines a novel over-approximate analysis in the abstract domain, which quickly narrows down the search space, with systematic testing in the concrete domain, which finds the actual violations. We have evaluated our method on a set of supervised-learning datasets. Our results show that the method significantly outperforms state-of-the-art techniques and can decide the data-poisoning robustness of KNN prediction results for most of the test inputs.
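
To make the decision problem concrete, the sketch below illustrates what a brute-force enumeration baseline might look like. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the attacker is assumed to have inserted at most n elements, so a prediction counts as robust if it stays the same after removing any subset of up to n training elements; k is fixed rather than tuned; distance is squared Euclidean; and label ties go to the smallest label. The names `knn_predict` and `decide_robustness` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def knn_predict(train, x, k):
    """Return the majority label among the k training points nearest to x."""
    # Each training element is a (features, label) pair; squared Euclidean
    # distance is an assumption made for this sketch.
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    counts = Counter(label for _, label in nearest)
    top = max(counts.values())
    # Assumed tie-break: the smallest label among the most frequent ones.
    return min(label for label, c in counts.items() if c == top)


def decide_robustness(train, x, k, n):
    """Enumerate every removal of 1..n training elements (assumed threat model).

    Returns (True, None) if the prediction for x never changes, which
    certifies robustness, or (False, removed_indices) for the first removal
    set that flips the prediction, which falsifies it.
    """
    baseline = knn_predict(train, x, k)
    for m in range(1, n + 1):
        for removed in combinations(range(len(train)), m):
            drop = set(removed)
            reduced = [p for i, p in enumerate(train) if i not in drop]
            if knn_predict(reduced, x, k) != baseline:
                return False, removed
    return True, None


# Toy one-dimensional training set, invented purely for illustration.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "b"), ((3.0,), "b"), ((4.0,), "b")]
robust_n1 = decide_robustness(train, (4.0,), k=3, n=1)
robust_n2 = decide_robustness(train, (4.0,), k=3, n=2)
```

Because the number of candidate subsets grows combinatorially with n and the training-set size, exhaustive enumeration of this kind quickly becomes impractical, which is the cost the abstract's combination of abstract-domain pruning and concrete-domain testing is meant to avoid.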

research
07/17/2023

Certifying the Fairness of KNN in the Presence of Dataset Bias

We propose a method for certifying the fairness of the classification re...
research
06/21/2022

The Integration of Machine Learning into Automated Test Generation: A Systematic Literature Review

Context: Machine learning (ML) may enable effective automated test gener...
research
02/20/2018

Using Semi-Supervised Learning for Predicting Metamorphic Relations

Software testing is difficult to automate, especially in programs which ...
research
05/29/2022

To test, or not to test: A proactive approach for deciding complete performance test initiation

Software performance testing requires a set of inputs that exercise diff...
research
09/30/2020

First-order Optimization for Superquantile-based Supervised Learning

Classical supervised learning via empirical risk (or negative log-likeli...
research
03/02/2023

Reasoning-Based Software Testing

With software systems becoming increasingly pervasive and autonomous, ou...
research
10/20/2022

LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness

Recent studies show that training deep neural networks (DNNs) with Lipsc...
