On the Privacy of dK-Random Graphs
Real social network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. Previous research shows that many graph anonymization techniques fail against existing graph de-anonymization attacks. However, the specific reasons for the success of such de-anonymization attacks are yet to be understood. This paper systematically studies the structural properties of real graphs that make them more vulnerable to machine learning-based de-anonymization techniques. More precisely, we study the boundaries of anonymity based on the structural properties of real graph datasets, in terms of how their dK-based anonymized versions resist (or fail to resist) various types of attacks. Our experimental results lead to three contributions. First, we identify the strength of an attacker based on the graph characteristics of the subset of nodes from which it starts the de-anonymization attack. Second, we quantify the relative effectiveness of the dK-series for graph anonymization. Third, we identify the properties of the original graph that make it more vulnerable to de-anonymization.
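For readers unfamiliar with the dK-series, the statistics that a dK-random graph preserves can be illustrated on a toy example. In the standard definition, the 1K-distribution is the node degree distribution and the 2K-distribution is the joint degree distribution over edges. The sketch below is only an illustration under that definition, not the paper's implementation; the function names `degrees`, `dk1`, and `dk2` and the four-node graph are hypothetical.

```python
from collections import Counter

def degrees(edges):
    """Map each node to its degree in an undirected simple graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def dk1(edges):
    """1K-distribution: number of nodes having each degree."""
    return Counter(degrees(edges).values())

def dk2(edges):
    """2K-distribution: number of edges joining nodes of degrees (k1, k2), with k1 <= k2."""
    deg = degrees(edges)
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
one_k = dk1(edges)  # degree -> node count
two_k = dk2(edges)  # (k1, k2) -> edge count
```

A 2K-random graph would be any graph with the same `two_k` counts as the original, which is why higher values of d keep more structure (and potentially more re-identifiable signal) than lower ones.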