MEGA: Model Stealing via Collaborative Generator-Substitute Networks

01/31/2022
by Chi Hong, et al.

Deep machine learning models are increasingly deployed in the wild to provide services to users. Adversaries may steal the knowledge of these valuable models by training substitute models according to the inference results of the targeted deployed models. Recent data-free model stealing methods have been shown to be effective at extracting the knowledge of the target model without using real query examples, but they assume rich inference information, e.g., class probabilities and logits. Moreover, they are all based on competing generator-substitute networks and hence encounter training instability. In this paper we propose a data-free model stealing framework, MEGA, which is based on collaborative generator-substitute networks and requires only label predictions from the target model for synthetic query examples. The core of our method is a model stealing optimization consisting of two collaborative models: (i) the substitute model, which imitates the target model through the synthetic query examples and their inferred labels, and (ii) the generator, which synthesizes images such that the confidence of the substitute model over each query example is maximized. We propose a novel coordinate descent training procedure and analyze its convergence. We also empirically evaluate the trained substitute model on three datasets and its application to black-box adversarial attacks. Our results show that the accuracy of our trained substitute model and the adversarial attack success rate over it can be up to 33% higher than state-of-the-art data-free black-box attacks.
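
To make the collaborative setup and the coordinate descent schedule concrete, here is a minimal PyTorch sketch of one plausible instantiation. The architectures, hyperparameters, the name `query_target` for the hard-label black-box oracle, and the entropy-minimization form of the generator's confidence objective are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a MEGA-style collaborative generator-substitute loop.
# All architectures and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NUM_CLASSES = 100, 10

class Generator(nn.Module):
    """Maps latent noise to synthetic 1x28x28 query images."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(-1, 1, 28, 28)

class Substitute(nn.Module):
    """Classifier that imitates the target from its hard-label answers."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, NUM_CLASSES),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def query_target(target_model, x):
    """Black-box oracle: only predicted labels are returned, no logits."""
    return target_model(x).argmax(dim=1)

def train_mega(target_model, steps=1000, batch_size=64):
    G, S = Generator(), Substitute()
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
    opt_s = torch.optim.Adam(S.parameters(), lr=1e-3)
    for _ in range(steps):
        # Coordinate 1: fix G, update S so it reproduces the target's
        # labels on the current synthetic queries.
        x = G(torch.randn(batch_size, LATENT_DIM)).detach()
        y = query_target(target_model, x)
        loss_s = F.cross_entropy(S(x), y)
        opt_s.zero_grad()
        loss_s.backward()
        opt_s.step()

        # Coordinate 2: fix S, update G so the substitute is maximally
        # confident on each synthetic image; realized here (an assumption)
        # as minimizing the entropy of the substitute's predictions.
        probs = F.softmax(S(G(torch.randn(batch_size, LATENT_DIM))), dim=1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
        opt_g.zero_grad()
        entropy.backward()
        opt_g.step()
    return S
```

Note that both updates push in compatible directions (the generator helps the substitute rather than fooling it), which is what distinguishes this collaborative scheme from the competing generator-substitute training it replaces. Once trained, the substitute serves as a white-box stand-in for the black-box target: adversarial examples crafted against it with standard gradient-based methods can be transferred to the target, which is the attack application evaluated in the paper.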


Related research

04/10/2023 · Certifiable Black-Box Attack: Ensuring Provably Successful Attack for Adversarial Examples
Black-box adversarial attacks have shown strong potential to subvert mac...

04/23/2022 · Towards Data-Free Model Stealing in a Hard Label Setting
Machine learning models deployed as a service (MLaaS) are susceptible to...

07/13/2020 · Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes
We focus on the problem of black-box adversarial attacks, where the aim ...

03/28/2020 · DaST: Data-free Substitute Training for Adversarial Attacks
Machine learning models are vulnerable to adversarial examples. For the ...

05/06/2020 · MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation
Model Stealing (MS) attacks allow an adversary with black-box access to ...

11/20/2017 · Model Extraction Warning in MLaaS Paradigm
Cloud vendors are increasingly offering machine learning services as par...

05/31/2022 · Concept-level Debugging of Part-Prototype Networks
Part-prototype Networks (ProtoPNets) are concept-based classifiers desig...
