PoliGraph: Automated Privacy Policy Analysis using Knowledge Graphs

10/13/2022
by Hao Cui, et al.

Privacy policies disclose how an organization collects and handles personal information. Recent work has made progress in leveraging natural language processing (NLP) to automate privacy policy analysis and extract collection statements from individual sentences, each considered in isolation. In this paper, we view and analyze, for the first time, the entire text of a privacy policy in an integrated way. In terms of methodology: (1) we define PoliGraph, a type of knowledge graph that captures relations between different parts of the text in a privacy policy; and (2) we develop an NLP-based tool, PoliGraph-er, to automatically extract PoliGraph from the text. In addition, (3) we revisit the notion of ontologies, previously defined in heuristic ways, to capture subsumption relations between terms. We make a clear distinction between local and global ontologies to capture the context of individual privacy policies, application domains, and privacy laws. Using a public dataset for evaluation, we show that PoliGraph-er identifies 61% more collection statements than prior state-of-the-art, with over 90% precision. In terms of applications, PoliGraph enables automated analysis of a corpus of privacy policies and allows us to: (1) reveal common patterns in the texts across different privacy policies, and (2) assess the correctness of the terms as defined within a privacy policy. We also apply PoliGraph to: (3) detect contradictions in a privacy policy, where we show false positives by prior work, and (4) analyze the consistency of privacy policies and network traffic, where we identify significantly more clear disclosures than prior work.
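To make the idea of combining collection statements with subsumption relations concrete, here is a minimal Python sketch. It is not PoliGraph-er's implementation: the class name `PolicyGraph`, its methods, and the example terms are all hypothetical, chosen only to illustrate how a graph over the whole policy can resolve a generic term (stated in one sentence) into the specific terms it covers (defined in another).

```python
class PolicyGraph:
    """Toy graph with two illustrative edge types (assumed, not the paper's schema):
    'collect' (entity -> data term) and 'subsume' (general term -> specific term)."""

    def __init__(self):
        self.collects = {}   # entity -> set of data terms it is stated to collect
        self.subsumes = {}   # general term -> set of more specific terms

    def add_collect(self, entity, term):
        self.collects.setdefault(entity, set()).add(term)

    def add_subsume(self, general, specific):
        self.subsumes.setdefault(general, set()).add(specific)

    def expand(self, term):
        """Return the term plus every term reachable through subsume edges."""
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # guards against cycles in the term definitions
            seen.add(current)
            stack.extend(self.subsumes.get(current, ()))
        return seen

    def collected_terms(self, entity):
        """All terms the entity collects, directly or via subsumption."""
        result = set()
        for term in self.collects.get(entity, ()):
            result |= self.expand(term)
        return result


# Hypothetical policy: one sentence states the collection, others define the terms.
graph = PolicyGraph()
graph.add_collect("we", "personal information")
graph.add_subsume("personal information", "contact information")
graph.add_subsume("contact information", "email address")

print(sorted(graph.collected_terms("we")))
```

A sentence-by-sentence extractor would see only "we collect personal information"; following the subsume edges is what links that statement to "email address" defined elsewhere in the same policy.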

