Masked LARk: Masked Learning, Aggregation and Reporting worKflow

10/27/2021
by Joseph J. Pfeiffer III, et al.

Today, many web advertising data flows involve passive cross-site tracking of users. Enabling such tracking through third-party tracking cookies (3PC) exposes sensitive user data to a large number of parties, with little oversight of how that data is used. As a result, most browsers are moving toward removing 3PC in upcoming releases. To substantially improve end-user privacy while still allowing sites to sustain their business through ad funding, new privacy-preserving primitives need to be introduced. In this paper, we discuss a new proposal, called Masked LARk, for aggregating user engagement measurements and training models in a way that prevents cross-site tracking, while remaining (a) flexible, for engineering development and maintenance, (b) secure, in the sense that cross-site tracking and tracing are blocked, and (c) open to continued model development and training, allowing advertisers to serve relevant ads to interested users. We introduce a secure multi-party computation (MPC) protocol that uses "helper" parties to train models, so that once data leaves the browser, no single downstream system can construct a complete picture of the user's activity. For training, our key innovation is masking, the obfuscation of the true labels, which still allows a gradient to be computed accurately in aggregate over a batch of data. Our protocol uses only light cryptography, at a level where an interested yet inexperienced reader can understand the core algorithm. We develop helper endpoints that implement this system and give example usage for training in PyTorch.
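
The masking idea can be illustrated with a small sketch. It is not the paper's protocol or its helper API: it assumes two helpers, a logistic-regression model, and a 0/1 mask that the browser splits into two additive shares, and every function name here is hypothetical. Each helper sees both candidate labels for every example, so it cannot tell which one is true, yet the sum of the helpers' outputs equals the ordinary batch gradient.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_grad(w, x, y):
    """Gradient of the log loss for one example (x, y) with respect to w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def browser_make_shares(x, true_label, rng):
    """Hypothetical client step: emit BOTH candidate labels per example.

    The mask is 1 for the true label and 0 for the other one, and is split
    into two additive shares. Assumption: float shares keep the sketch short;
    a real deployment would share over a finite field instead.
    """
    for_helper_a, for_helper_b = [], []
    for candidate in (0, 1):
        mask = 1.0 if candidate == true_label else 0.0
        share_a = rng.uniform(-100.0, 100.0)
        share_b = mask - share_a  # share_a + share_b == mask
        for_helper_a.append((x, candidate, share_a))
        for_helper_b.append((x, candidate, share_b))
    return for_helper_a, for_helper_b


def helper_partial_gradient(w, records):
    """Hypothetical helper step: sum of share-weighted gradients."""
    total = [0.0] * len(w)
    for x, label, share in records:
        for i, g in enumerate(logistic_grad(w, x, label)):
            total[i] += share * g
    return total


def masked_batch_gradient(w, batch, seed=0):
    """Combine the two helpers' partial gradients into the batch gradient."""
    rng = random.Random(seed)
    records_a, records_b = [], []
    for x, y in batch:
        a, b = browser_make_shares(x, y, rng)
        records_a.extend(a)
        records_b.extend(b)
    grad_a = helper_partial_gradient(w, records_a)
    grad_b = helper_partial_gradient(w, records_b)
    return [ga + gb for ga, gb in zip(grad_a, grad_b)]


def plain_batch_gradient(w, batch):
    """Reference: the same gradient computed directly from the true labels."""
    total = [0.0] * len(w)
    for x, y in batch:
        for i, g in enumerate(logistic_grad(w, x, y)):
            total[i] += g
    return total
```

Because the two shares of each mask sum to 1 for the true label and 0 for the false one, the share-weighted contribution of the false label cancels when the helpers' results are added, and `masked_batch_gradient` agrees with `plain_batch_gradient` up to floating-point rounding.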
