Model Reconstruction from Model Explanations

07/13/2018
by Smitha Milli, et al.

We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself. Our results speak to a tension between the desire to keep a proprietary model secret and the ability to offer model explanations. On the theoretical side, we give an algorithm that provably learns a two-layer ReLU network in a setting where the algorithm may query the gradient of the model with respect to chosen inputs. The number of queries is independent of the dimension and nearly optimal in its dependence on the model size. Of interest not only from a learning-theoretic perspective, this result highlights the power of gradients rather than labels as a learning primitive. Complementing our theory, we give effective heuristics for reconstructing models from gradient explanations that are orders of magnitude more query-efficient than reconstruction attacks relying on prediction interfaces.
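To make the gap between gradient queries and prediction queries concrete, here is a minimal sketch for the simplest possible case: a linear model f(x) = w . x + b. It is not the algorithm from the paper, which targets two-layer ReLU networks; it only illustrates why an input gradient can be a stronger primitive than a label. The function names (`make_linear_model`, `reconstruct_from_gradient`, `reconstruct_from_predictions`) and the query-counting convention are hypothetical and chosen for illustration.

```python
import random


def make_linear_model(weights, bias):
    """Return (predict, input_gradient) closures for f(x) = w . x + b."""
    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias

    def input_gradient(x):
        # For a linear model, the gradient with respect to the input
        # equals the weight vector at every point x.
        return list(weights)

    return predict, input_gradient


def reconstruct_from_gradient(predict, input_gradient, dim):
    """Recover (w, b) with one gradient query and one prediction query."""
    origin = [0.0] * dim
    w_hat = input_gradient(origin)  # a single gradient reveals all of w
    b_hat = predict(origin)         # f(0) = b
    return w_hat, b_hat, 2          # total queries used


def reconstruct_from_predictions(predict, dim):
    """Recover (w, b) from predictions alone, using dim + 1 queries."""
    origin = [0.0] * dim
    b_hat = predict(origin)
    w_hat = []
    for i in range(dim):
        basis = [0.0] * dim
        basis[i] = 1.0
        # f(e_i) - f(0) = w_i
        w_hat.append(predict(basis) - b_hat)
    return w_hat, b_hat, dim + 1


# Hypothetical secret model with randomly drawn parameters.
rng = random.Random(0)
dim = 50
true_w = [rng.uniform(-1, 1) for _ in range(dim)]
true_b = rng.uniform(-1, 1)
predict, input_gradient = make_linear_model(true_w, true_b)

w_grad, b_grad, queries_grad = reconstruct_from_gradient(predict, input_gradient, dim)
w_pred, b_pred, queries_pred = reconstruct_from_predictions(predict, dim)

print("queries with gradient access:", queries_grad)
print("queries with predictions only:", queries_pred)
```

In this toy setting the number of queries needed with gradient access does not grow with the input dimension, while the prediction-only approach needs one query per coordinate plus one for the bias.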


research
06/29/2022

Private Graph Extraction via Feature Explanations

Privacy and interpretability are two of the important ingredients for ac...
research
03/16/2023

Finding Minimum-Cost Explanations for Predictions made by Tree Ensembles

The ability to explain why a machine learning model arrives at a particu...
research
06/09/2022

A Learning-Theoretic Framework for Certified Auditing of Machine Learning Models

Responsible use of machine learning requires that models be audited for ...
research
11/01/2021

Provably efficient, succinct, and precise explanations

We consider the problem of explaining the predictions of an arbitrary bl...
research
05/22/2022

Visual Explanations from Deep Networks via Riemann-Stieltjes Integrated Gradient-based Localization

Neural networks are becoming increasingly better at tasks that involve c...
research
10/01/2021

The Cognitive Science of Extremist Ideologies Online

Extremist ideologies are finding new homes in online forums. These serve...
research
07/15/2021

FastSHAP: Real-Time Shapley Value Estimation

Shapley values are widely used to explain black-box models, but they are...
