Evading Deepfake-Image Detectors with White- and Black-Box Attacks

04/01/2020
by   Nicholas Carlini, et al.

It is now possible to synthesize highly realistic images of people who don't exist. Such content has, for example, been implicated in the creation of fraudulent social-media profiles responsible for disinformation campaigns. Significant efforts are therefore being deployed to detect synthetically generated content. One popular forensic approach trains a neural network to distinguish real from synthetic content. We show that such forensic classifiers are vulnerable to a range of attacks that reduce the classifier to near-0% accuracy. We develop five attack case studies on a state-of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators, when only trained on one generator. With full access to the classifier, we can flip the lowest bit of each pixel in an image to reduce the classifier's AUC to 0.0005; perturb 1% of the image area to reduce the classifier's AUC to 0.08; or add a single noise pattern in the synthesizer's latent space to reduce the classifier's AUC to 0.17. We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22. These attacks reveal significant vulnerabilities of certain image-forensic classifiers.
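The first white-box attack mentioned above, flipping only the least significant bit of each pixel, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the forensic classifier is replaced by a toy linear model (`toy_score`, `toy_gradient` are invented stand-ins), and each pixel's LSB is simply set to whichever value the gradient says lowers the "synthetic" score.

```python
# Hedged sketch of an LSB-flipping white-box evasion attack, assuming
# gradient access to the classifier. The "classifier" here is a toy
# linear model; toy_score/toy_gradient are illustrative names only.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))  # weights of the toy linear "classifier"

def toy_score(img):
    """Higher score = more likely synthetic (toy stand-in)."""
    return float(np.sum(W * (img.astype(np.float64) / 255.0)))

def toy_gradient(img):
    """Gradient of toy_score with respect to the raw pixel values."""
    return W / 255.0

def lsb_attack(img):
    """Modify only the least significant bit of each uint8 pixel so as
    to reduce the classifier's 'synthetic' score."""
    g = toy_gradient(img)
    base = img & 0xFE                 # clear the LSB of every pixel
    # Positive gradient: a smaller pixel value lowers the score, so
    # pick LSB = 0; negative gradient: pick LSB = 1.
    lsb = (g < 0).astype(np.uint8)
    return base | lsb

img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
adv = lsb_attack(img)

# Each pixel moved by at most 1 intensity level, i.e. imperceptibly,
# yet the score can only move toward the "real" side.
assert np.max(np.abs(adv.astype(int) - img.astype(int))) <= 1
assert toy_score(adv) <= toy_score(img)
```

For a linear model this greedy per-pixel choice is exactly optimal within the one-bit budget; against a real deep classifier one would iterate with fresh gradients, but the perturbation budget (one intensity level per pixel) stays the same.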


Related research

- 04/23/2018: Black-box Adversarial Attacks with Limited Queries and Information. "Current neural network-based classifiers are susceptible to adversarial ..."
- 07/28/2022: Exploiting and Defending Against the Approximate Linearity of Apple's NeuralHash. "Perceptual hashes map images with identical semantic content to the same..."
- 04/05/2022: FaceSigns: Semi-Fragile Neural Watermarks for Media Authentication and Countering Deepfakes. "Deepfakes and manipulated media are becoming a prominent threat due to t..."
- 07/11/2020: ManiGen: A Manifold Aided Black-box Generator of Adversarial Examples. "Machine learning models, especially neural network (NN) classifiers, hav..."
- 04/15/2020: Advanced Evasion Attacks and Mitigations on Practical ML-Based Phishing Website Classifiers. "Machine learning (ML) based approaches have been the mainstream solution..."
- 06/13/2020: Defensive Approximation: Enhancing CNNs Security through Approximate Computing. "In the past few years, an increasing number of machine-learning and deep..."
- 06/30/2019: Fooling a Real Car with Adversarial Traffic Signs. "The attacks on the neural-network-based classifiers using adversarial im..."
