Disrupting Deepfakes with an Adversarial Attack that Survives Training

06/17/2020
by Eran Segalis, et al.

The rapid progress in generative models and autoencoders has given rise to effective video tampering techniques, used for generating deepfakes. Mitigation research is mostly focused on post-factum deepfake detection and not prevention. We complement these efforts by proposing a prevention technique against face-swapping autoencoders. Our technique consists of a novel training-resistant adversarial attack that can be applied to a video to disrupt face-swapping manipulations. Our attack introduces spatial-temporal distortions to the output of the face-swapping autoencoders, and it holds whether or not our adversarial images have been included in the training set of said autoencoders. To implement the attack, we construct a bilevel optimization problem, where we train a generator and a face-swapping model instance against each other. Specifically, we pair each input image with a target distortion, and feed them into a generator that produces an adversarial image. This image will exhibit the distortion when a face-swapping autoencoder is applied to it. We solve the optimization problem by training the generator and the face-swapping model simultaneously using an iterative process of alternating optimization. Finally, we validate our attack using a popular implementation of FaceSwap, and show that our attack transfers across different models and target faces. More broadly, these results demonstrate the existence of training-resistant adversarial attacks, potentially applicable to a wide range of domains.
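The alternating optimization described above can be illustrated with a deliberately tiny, linear stand-in for the real system. In this sketch (not the authors' implementation; all names and dimensions are illustrative), the "generator" is a matrix `G` producing an adversarial image `x_adv = x + G @ x`, the "face-swapping model" is a linear map `A` trained to reconstruct `x_adv`, and the generator is trained in alternation so that the model's output `A @ x_adv` exhibits a target distortion `t` while the perturbation `G @ x` stays small:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                  # toy "image" dimension (flattened)
lam = 0.1                              # weight on perturbation size
lr = 0.05                              # step size for both players

x = rng.normal(size=n)                 # clean input image (toy)
t = 0.5 * rng.normal(size=n)           # target distortion pattern
G = 0.01 * rng.normal(size=(n, n))     # generator: x_adv = x + G @ x
A = 0.01 * rng.normal(size=(n, n))     # face-swap model stand-in: f(z) = A @ z

def gen_loss(G, A):
    """Generator objective: distortion appears in the model output,
    while the adversarial perturbation itself stays small."""
    z = x + G @ x                      # adversarial image
    r = A @ z - (z + t)                # want f(x_adv) = x_adv + distortion
    return r @ r + lam * (G @ x) @ (G @ x)

losses = []
for step in range(400):
    z = x + G @ x
    # inner step: train the face-swap model to reconstruct x_adv
    r_ae = A @ z - z
    A -= lr * 2 * np.outer(r_ae, z)
    # outer step: train the generator to embed the target distortion
    r = A @ z - (z + t)
    dz = 2 * (A - np.eye(n)).T @ r     # gradient of ||r||^2 w.r.t. z
    G -= lr * (np.outer(dz, x) + 2 * lam * np.outer(G @ x, x))
    losses.append(gen_loss(G, A))
```

Because the two objectives conflict (the model wants faithful reconstruction, the generator wants a distorted output), the generator loss does not reach zero; the point of the sketch is only the alternation structure, in which each player takes a gradient step against the other's current parameters. The paper's actual attack replaces these linear maps with a neural generator and a FaceSwap autoencoder.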

Related research:

12/01/2016 · Adversarial Images for Variational Autoencoders
We investigate adversarial attacks for autoencoders. We propose a proced...

05/31/2018 · Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
Adversarial attacks involve adding small, often imperceptible, perturba...

04/09/2020 · Adversarial Latent Autoencoders
Autoencoder networks are unsupervised approaches aiming at combining gen...

12/22/2017 · Using LIP to Gloss Over Faces in Single-Stage Face Detection Networks
This work shows that it is possible to fool/attack recent state-of-the-a...

10/06/2020 · BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models
The tremendous progress of autoencoders and generative adversarial netwo...

03/04/2020 · Double Backpropagation for Training Autoencoders against Adversarial Attack
Deep learning, as widely known, is vulnerable to adversarial samples. Th...

12/08/2017 · CycleGAN: a Master of Steganography
CycleGAN is one of the latest successful approaches to learn a correspon...
