Log In Sign Up

Differentially Private Adversarial Robustness Through Randomized Perturbations

by   Nan Xu, et al.

Deep Neural Networks, despite their great success in diverse domains, are provably sensitive to small perturbations on correctly classified examples and lead to erroneous predictions. Recently, it was proposed that this behavior can be combatted by optimizing the worst case loss function over all possible substitutions of training examples. However, this can be prone to weighing unlikely substitutions higher, limiting the accuracy gain. In this paper, we study adversarial robustness through randomized perturbations, which has two immediate advantages: (1) by ensuring that substitution likelihood is weighted by the proximity to the original word, we circumvent optimizing the worst case guarantees and achieve performance gains; and (2) the calibrated randomness imparts differentially-private model training, which additionally improves robustness against adversarial attacks on the model outputs. Our approach uses a novel density-based mechanism based on truncated Gumbel noise, which ensures training on substitutions of both rare and dense words in the vocabulary while maintaining semantic similarity for model robustness.


page 1

page 2

page 3

page 4


Smoothed Analysis of Online and Differentially Private Learning

Practical and pervasive needs for robustness and privacy in algorithms h...

Differentially Private Deep Learning with Direct Feedback Alignment

Standard methods for differentially private training of deep neural netw...

Colored Noise Mechanism for Differentially Private Clustering

The goal of this paper is to propose and analyze a differentially privat...

RAB: Provable Robustness Against Backdoor Attacks

Recent studies have shown that deep neural networks (DNNs) are vulnerabl...

On Frank-Wolfe Optimization for Adversarial Robustness and Interpretability

Deep neural networks are easily fooled by small perturbations known as a...

Certifiable Distributional Robustness with Principled Adversarial Training

Neural networks are vulnerable to adversarial examples and researchers h...