Adversarial Examples for Models of Code

10/15/2019
by Noam Yefet, et al.

We introduce a novel approach for attacking trained models of code with adversarial examples. The main idea is to force a given trained model to make a prediction of the adversary's choice by introducing small perturbations that do not change the program's semantics. We find these perturbations by deriving the desired prediction with respect to the model's inputs while holding the model weights constant, and following the gradients to slightly modify the input. To defend a model against such attacks, we propose placing a defensive model in front of the downstream model. The defensive model detects unlikely mutations and masks them before feeding the input to the downstream model. We show that our attack succeeds in changing a prediction to the adversary's choice ("targeted attack") up to 89% of the time, and in changing a given prediction to any incorrect prediction ("non-targeted attack") 94% of the time. With our proposed defense, the success rate of the attack drops drastically for both targeted and non-targeted attacks, at the cost of a minor penalty of about 2% in accuracy on inputs that are not under attack.
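The attack sketched in the abstract is gradient-based: hold the trained weights fixed, differentiate a loss toward the adversary's chosen label with respect to the input, and nudge the input along the gradients while keeping it a semantically equivalent program (for example, by renaming a local variable). The PyTorch sketch below illustrates that idea under stated assumptions, not the authors' exact procedure: `model` is a hypothetical network that takes token embeddings and returns class logits, `embed` is its token embedding layer, and the step size and nearest-name projection are illustrative choices.

```python
# Minimal sketch of a gradient-guided, targeted rename attack on a code model.
# `model`, `embed`, `var_position`, and the 0.5 step size are hypothetical
# stand-ins for illustration; this is not the paper's implementation.
import torch
import torch.nn.functional as F

def targeted_rename_attack(model, embed, token_ids, var_position, target_label, steps=5):
    """Rename the identifier at `var_position` so the frozen `model` is pushed
    toward `target_label`, leaving program semantics unchanged."""
    token_ids = token_ids.clone()
    for _ in range(steps):
        # Work in embedding space so the input becomes differentiable.
        inputs = embed(token_ids).detach().requires_grad_(True)
        logits = model(inputs)                        # model weights stay frozen
        loss = F.cross_entropy(logits, target_label)  # loss toward the adversary's label
        loss.backward()

        # Step down the loss surface in embedding space ...
        perturbed = inputs[var_position] - 0.5 * inputs.grad[var_position]
        # ... then project back to a concrete identifier: the vocabulary entry
        # whose embedding is closest to the perturbed vector becomes the new name.
        # (A real attack would rename every occurrence of the variable consistently.)
        distances = torch.cdist(perturbed.unsqueeze(0), embed.weight)
        token_ids[var_position] = distances.argmin().item()

        with torch.no_grad():
            if model(embed(token_ids)).argmax(dim=-1).item() == target_label.item():
                break  # prediction flipped to the adversary's choice
    return token_ids
```

In the same spirit, the proposed defense can be thought of as a lightweight model placed in front of the downstream model that scores how plausible each identifier is in its context and masks unlikely (likely adversarial) names before the input is passed on.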
