MultiGuard: Provably Robust Multi-label Classification against Adversarial Examples

10/03/2022
by Jinyuan Jia, et al.

Multi-label classification, which predicts a set of labels for an input, has many applications. However, multiple recent studies have shown that multi-label classification is vulnerable to adversarial examples: an attacker can manipulate the labels predicted by a multi-label classifier for an input by adding a carefully crafted, human-imperceptible perturbation to it. Existing provable defenses for multi-class classification achieve sub-optimal provable robustness guarantees when generalized to multi-label classification. In this work, we propose MultiGuard, the first provably robust defense against adversarial examples for multi-label classification. MultiGuard leverages randomized smoothing, the state-of-the-art technique for building provably robust classifiers. Specifically, given an arbitrary multi-label classifier, MultiGuard builds a smoothed multi-label classifier by adding random noise to the input; we consider isotropic Gaussian noise in this work. Our major theoretical contribution is to show that a certain number of ground-truth labels of an input are provably contained in the set of labels predicted by MultiGuard when the ℓ_2-norm of the adversarial perturbation added to the input is bounded. Moreover, we design an algorithm to compute our provable robustness guarantees. Empirically, we evaluate MultiGuard on the VOC 2007, MS-COCO, and NUS-WIDE benchmark datasets. Our code is available at: <https://github.com/quwenjie/MultiGuard>
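
For intuition, below is a minimal PyTorch sketch of the smoothed prediction step described in the abstract: each Gaussian-noised copy of the input votes for the base classifier's top-k' labels, and the k most frequently voted labels form the smoothed prediction. The function name, the hyperparameters (k, k_prime, sigma, num_samples), and the top-k' voting rule are illustrative assumptions, and the sketch omits the Monte Carlo certification algorithm that computes the provable robustness guarantee in the paper.

```python
import torch

def multiguard_predict(base_classifier, x, k=3, k_prime=5, sigma=0.5, num_samples=1000):
    """Smoothed multi-label prediction via randomized smoothing (illustrative sketch only).

    Each noisy copy of the input votes for the base classifier's top-k' labels;
    the k labels with the most votes across all copies form the smoothed prediction.
    No certified radius is computed here.
    """
    base_classifier.eval()
    votes = None
    with torch.no_grad():
        for _ in range(num_samples):
            noisy = x + sigma * torch.randn_like(x)        # isotropic Gaussian noise
            scores = base_classifier(noisy.unsqueeze(0)).squeeze(0)
            top = torch.topk(scores, k_prime).indices      # base prediction on this noisy copy
            ballot = torch.zeros_like(scores)
            ballot[top] = 1.0
            votes = ballot if votes is None else votes + ballot
    return torch.topk(votes, k).indices                    # smoothed top-k label set
```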
