Leveraging Extracted Model Adversaries for Improved Black Box Attacks

10/30/2020
by Naveen Jafer Nizar, et al.

We present a method for adversarial input generation against black box models for reading comprehension based question answering. Our approach is composed of two steps. First, we approximate a victim black box model via model extraction (Krishna et al., 2020). Second, we use our own white box method to generate input perturbations that cause the approximate model to fail. These perturbed inputs are used against the victim. In experiments we find that our method improves on the efficacy of AddAny, a white box attack, performed on the approximate model by 25% F1, and on the AddAny attack performed against the black box victim by 11% F1.
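To make the two-step pipeline concrete, here is a minimal, hypothetical sketch. It substitutes a toy sentiment classifier for the QA models and a greedy distractor-appending loop for the paper's AddAny variant; the names (`victim_predict`, `attack`), the surrogate setup, and the candidate vocabulary are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the extract-then-attack pipeline. A toy text
# classifier stands in for the QA model; the greedy word-appending loop
# stands in for AddAny. All names here are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# --- The victim: a black box we may only query for predictions. ---
_vec = TfidfVectorizer()
_train_x = ["great movie", "awful movie", "loved it", "hated it",
            "a great film", "an awful film", "truly loved this", "really hated this"]
_train_y = [1, 0, 1, 0, 1, 0, 1, 0]
_victim = LogisticRegression().fit(_vec.fit_transform(_train_x), _train_y)

def victim_predict(texts):
    """Black box access: labels only, no gradients or parameters."""
    return _victim.predict(_vec.transform(texts))

# --- Step 1: model extraction (cf. Krishna et al., 2020). ---
# Query the victim on unlabeled inputs and fit a surrogate to its answers.
queries = ["great", "awful", "loved", "hated", "great film", "awful film",
           "film I loved", "film I hated", "a movie", "the movie"]
surrogate_vec = TfidfVectorizer()
surrogate = LogisticRegression().fit(
    surrogate_vec.fit_transform(queries), victim_predict(queries))

# --- Step 2: white box attack on the surrogate. ---
# Greedily append the vocabulary word that most lowers the surrogate's
# confidence in its current prediction (an AddAny-style distractor search).
def attack(text, vocab, steps=3):
    for _ in range(steps):
        pred = surrogate.predict(surrogate_vec.transform([text]))[0]
        cands = [text + " " + w for w in vocab]
        probs = surrogate.predict_proba(surrogate_vec.transform(cands))[:, pred]
        text = cands[int(np.argmin(probs))]  # keep the most damaging candidate
    return text

# --- Transfer: replay the perturbed input against the victim. ---
clean = "loved this great movie"
adv = attack(clean, vocab=["awful", "hated", "great", "loved"])
print("clean:", clean, "->", victim_predict([clean])[0])
print("adv  :", adv, "->", victim_predict([adv])[0])
```

The key design point the sketch illustrates is that the expensive search runs entirely against the surrogate, where full white box access is available, and only the finished perturbed inputs are ever sent to the victim.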
