Extracting Cloud-based Model with Prior Knowledge

06/07/2023
by   Shiqian Zhao, et al.

Machine-Learning-as-a-Service, a pay-as-you-go business model, is widely adopted by third-party users and developers. However, its open inference APIs can be exploited by malicious customers to mount model extraction attacks: an attacker can replicate a cloud-based black-box model merely by querying it with crafted examples. Existing model extraction attacks rely mainly on posterior knowledge from the oracle, i.e., the target model's predictions on query samples. As a result, they either require a high query overhead to approximate the decision boundary, or suffer from generalization error and overfitting under a limited query budget. To mitigate these problems, this work proposes, for the first time, an efficient model extraction attack based on prior knowledge. The insight is that prior knowledge of an unlabeled proxy dataset aids the search for the decision boundary (e.g., by identifying informative samples). Specifically, we leverage self-supervised learning, including autoencoders and contrastive learning, to pre-compile the prior knowledge of the proxy dataset into the feature extractor of the substitute model. We then adopt entropy to measure and select the most informative examples for querying the target model. Our design exploits both prior and posterior knowledge to extract the model, and thus avoids the generalization error and overfitting problems. We conduct extensive experiments on open APIs, such as Traffic Recognition, Flower Recognition, Moderation Recognition, and NSFW Recognition, from the real-world platforms Azure and Clarifai. The experimental results demonstrate the effectiveness and efficiency of our attack. For example, our attack achieves 95.1% accuracy with merely 1.8K queries (costing $2.16) on the NSFW Recognition API. Moreover, adversarial examples generated with our substitute model transfer better than those from other approaches, which indicates that our scheme is more conducive to downstream attacks.
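The entropy-based query selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`entropy_scores`, `select_informative`) and the NumPy formulation are assumptions; the idea is simply to rank unlabeled proxy samples by the Shannon entropy of the substitute model's softmax outputs and to spend the query budget on the most uncertain ones.

```python
import numpy as np

def entropy_scores(probs, eps=1e-12):
    """Shannon entropy of each row of softmax outputs (higher = more uncertain)."""
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=1)

def select_informative(probs, budget):
    """Return indices of the `budget` proxy samples whose substitute-model
    predictions are most uncertain; these are the ones worth querying the
    target API with."""
    scores = entropy_scores(probs)
    return np.argsort(scores)[::-1][:budget]

# Toy example: three proxy samples, binary substitute model, budget of 2.
probs = np.array([[0.5, 0.5],    # maximally uncertain
                  [0.99, 0.01],  # confident -> skip
                  [0.6, 0.4]])   # fairly uncertain
chosen = select_informative(probs, budget=2)
```

In the toy example, the confident sample (row 1) is skipped, so the budget is spent on rows 0 and 2, the two samples closest to the decision boundary.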

Related Research

Local Black-box Adversarial Attacks: A Query Efficient Approach (01/04/2021)
Adversarial attacks have threatened the application of deep neural netwo...

Unrestricted Black-box Adversarial Attack Using GAN with Limited Queries (08/24/2022)
Adversarial examples are inputs intentionally generated for fooling a de...

Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data (02/16/2023)
We study black-box model stealing attacks where the attacker can query a...

Increasing the Cost of Model Extraction with Calibrated Proof of Work (01/23/2022)
In model extraction attacks, adversaries can steal a machine learning mo...

Query-Efficient Adversarial Attack Based on Latin Hypercube Sampling (07/05/2022)
In order to be applicable in real-world scenario, Boundary Attacks (BAs)...

DualCF: Efficient Model Extraction Attack from Counterfactual Explanations (05/13/2022)
Cloud service providers have launched Machine-Learning-as-a-Service (MLa...

Model Leeching: An Extraction Attack Targeting LLMs (09/19/2023)
Model Leeching is a novel extraction attack targeting Large Language Mod...
