Distribution inference risks: Identifying and mitigating sources of leakage

09/18/2022
by Valentin Hartmann et al.

A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks has been gaining attention. In this attack, the adversary's goal is to infer distributional information about the training data. So far, research on distribution inference has focused on demonstrating successful attacks, with little attention given to identifying the potential causes of the leakage or to proposing mitigations. To bridge this gap, as our main contribution, we theoretically and empirically analyze the sources of information leakage that allow an adversary to mount distribution inference attacks. We identify three sources of leakage: (1) memorization of specific information about 𝔼[Y|X] (the expected label given the feature values) of interest to the adversary, (2) a wrong inductive bias of the model, and (3) the finiteness of the training data. Next, based on our analysis, we propose principled mitigation techniques against distribution inference attacks. Specifically, we demonstrate that causal learning techniques are more resilient than associative learning methods to a particular type of distribution inference risk, termed distributional membership inference. Lastly, we present a formalization of distribution inference that allows for reasoning about more general adversaries than was previously possible.
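
To make the threat model concrete, here is a minimal sketch of the shadow-model attack pattern commonly studied in the distribution inference literature (an illustrative assumption, not the paper's own method; the toy data, the targeted "female_ratio" property, and all function names are hypothetical). The adversary trains shadow models on datasets drawn from two candidate training distributions, then fits a meta-classifier that predicts, from a victim model's parameters alone, which distribution the victim's training data came from.

```python
# Illustrative sketch of a distribution (property) inference attack via
# shadow models and a meta-classifier. All names and the toy setup are
# hypothetical; they are not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_dataset(female_ratio, n=500):
    """Toy data whose first feature has a marginal (the 'ratio') that is
    the distributional property the adversary wants to infer."""
    sensitive = (rng.random(n) < female_ratio).astype(float)
    other = rng.normal(size=n)
    x = np.column_stack([sensitive, other])
    # Noisy labels: P(Y|X) is identical under both candidate distributions.
    y = (0.5 * sensitive + other + rng.normal(scale=0.5, size=n) > 0.3)
    return x, y.astype(int)

def model_features(model):
    # Represent a trained model by its flattened parameters.
    return np.concatenate([model.coef_.ravel(), model.intercept_])

# The two candidate training distributions the adversary distinguishes.
ratios = {0: 0.3, 1: 0.7}

# Step 1: train shadow models on data drawn from each candidate distribution.
shadow_x, shadow_y = [], []
for label, ratio in ratios.items():
    for _ in range(50):
        model = LogisticRegression(max_iter=1000).fit(*sample_dataset(ratio))
        shadow_x.append(model_features(model))
        shadow_y.append(label)

# Step 2: meta-classifier maps model parameters -> training distribution.
meta = LogisticRegression(max_iter=1000).fit(np.array(shadow_x), shadow_y)

# Step 3: attack a victim model trained on the ratio-0.7 distribution.
victim = LogisticRegression(max_iter=1000).fit(*sample_dataset(ratios[1]))
guess = int(meta.predict([model_features(victim)])[0])
print(f"adversary's guess for the training ratio: {ratios[guess]}")
```

In terms of the paper's analysis, a sketch like this succeeds only insofar as the victim model's parameters encode distributional information, e.g., through memorization of information about 𝔼[Y|X], a wrong inductive bias, or the finiteness of the training sample.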
