Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

04/26/2023
by Xinyi Zheng, et al.

To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is needed to group users with similar characteristics. In this paper, we propose a scalable K-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of (p-powered) consistent weighted sampling (CWS) and hierarchical clustering, so that K-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of >70M users and ads campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods, including sign random projections (SignRP), minwise hashing (MinHash), and vanilla CWS.
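The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each level of the hierarchy hashes users with a fresh CWS sample (using Ioffe's 2010 sampling scheme), that a cohort is split by hash value only into groups of at least K users, and that "p-powered" means raising the weights to the power p before hashing. The names `cws_hash`, `build_cohorts`, `k_min`, and `max_depth` are hypothetical, and the rule for folding undersized buckets back together is an illustrative choice.

```python
from collections import defaultdict
import numpy as np


def cws_hash(weights, seed, p=1.0):
    """One consistent weighted sample (Ioffe, 2010) of a nonnegative vector.

    Returns (i_star, t_star). Vectors hashed with the same seed share the
    same random draws, which is what makes the sample consistent.
    """
    w = np.asarray(weights, dtype=float) ** p  # assumed form of "p-powered"
    rng = np.random.default_rng(seed)
    d = w.shape[0]
    r = rng.gamma(2.0, 1.0, size=d)
    c = rng.gamma(2.0, 1.0, size=d)
    beta = rng.uniform(0.0, 1.0, size=d)
    idx = np.flatnonzero(w > 0)
    if idx.size == 0:
        return (-1, 0)  # sentinel for an all-zero vector
    t = np.floor(np.log(w[idx]) / r[idx] + beta[idx])
    a = c[idx] / np.exp(r[idx] * (t - beta[idx] + 1.0))
    best = int(np.argmin(a))
    return (int(idx[best]), int(t[best]))


def build_cohorts(X, k_min, max_depth, seed=0, p=1.0):
    """Split users level by level with successive CWS hashes.

    Hypothetical splitting rule: a bucket becomes a child cohort only if it
    holds at least k_min users; undersized buckets are pooled, and the pool
    is folded into the smallest child if it is itself below k_min.
    """
    X = np.asarray(X, dtype=float)
    assert len(X) >= k_min, "need at least k_min users overall"
    final = []
    stack = [(list(range(len(X))), 0)]
    while stack:
        members, depth = stack.pop()
        # Too small to split into two valid cohorts, or depth exhausted.
        if depth >= max_depth or len(members) < 2 * k_min:
            final.append(members)
            continue
        buckets = defaultdict(list)
        for u in members:
            buckets[cws_hash(X[u], seed + depth, p)].append(u)
        big = [b for b in buckets.values() if len(b) >= k_min]
        small = [u for b in buckets.values() if len(b) < k_min for u in b]
        if not big:
            final.append(members)  # no bucket is large enough: stop here
            continue
        if small and len(small) < k_min:
            min(big, key=len).extend(small)  # keep the leftover K-anonymous
            small = []
        for b in big:
            stack.append((b, depth + 1))
        if small:
            final.append(small)  # pooled leftover already has >= k_min users
    return final
```

Calling `build_cohorts(X, k_min=10, max_depth=4)` on a nonnegative user-by-feature matrix `X` returns a list of user-index lists. Under the rule above, every user lands in exactly one cohort and no cohort has fewer than `k_min` members, which is the lower-bound property the abstract attributes to CCWS.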

research
06/13/2023

Differentially Private One Permutation Hashing and Bin-wise Consistent Weighted Sampling

Minwise hashing (MinHash) is a standard algorithm widely used in the ind...
research
07/16/2021

DxHash: A Scalable Consistent Hash Based on the Pseudo-Random Sequence

Consistent hashing has played a fundamental role as a data router and a l...
research
01/07/2022

GCWSNet: Generalized Consistent Weighted Sampling for Scalable and Accurate Training of Neural Networks

We develop the "generalized consistent weighted sampling" (GCWS) for has...
research
08/16/2019

Toward an Attribute-Based Digital Identity Modeling for Privacy Preservation

Digital identity is a multidimensional, multidisciplinary, and a complex...
research
08/03/2020

Framework for a DLT Based COVID-19 Passport

Uniquely identifying individuals across the various networks they intera...
research
01/04/2019

Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

We evaluate the impact of probabilistically-constructed digital identity...
