Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling

10/01/2020
by   Yan Shvartzshnaider, et al.

This paper formulates a new task: extracting privacy parameters from a privacy policy through the lens of Contextual Integrity, an established social theory framework for reasoning about privacy norms. Privacy policies, written by lawyers, are lengthy and often contain incomplete and vague statements. We show that traditional NLP tasks, including recently proposed Question-Answering based solutions, are insufficient for the privacy parameter extraction problem and yield poor precision and recall. We describe four types of conventional methods that can be partially adapted to the parameter extraction task, with varying degrees of success: Hidden Markov Models, BERT fine-tuned models, Dependency Type Parsing (DP), and Semantic Role Labeling (SRL). Based on a detailed evaluation across 36 real-world privacy policies of major enterprises, we demonstrate that a solution combining syntactic DP with type-specific SRL tasks provides the highest accuracy for retrieving contextual privacy parameters from privacy statements. We also observe that incorporating domain-specific knowledge is critical to achieving high precision and recall, a finding that motivates new NLP research on this important problem in the privacy domain.
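To make the extraction task concrete, the following is a minimal sketch of how semantic role labels for one privacy statement might be mapped onto Contextual Integrity parameters. It is not the authors' pipeline. The five parameter names follow the usual description of Contextual Integrity (sender, recipient, subject, attribute, transmission principle); the `CIFlow` class, the `ROLE_MAP` table, the `roles_to_flow` function, and the hand-written role labels are all hypothetical and stand in for the output of a real SRL model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CIFlow:
    """One information flow, described by the five Contextual Integrity parameters."""
    sender: Optional[str] = None
    recipient: Optional[str] = None
    subject: Optional[str] = None
    attribute: Optional[str] = None
    transmission_principle: Optional[str] = None


# Hypothetical, predicate-specific mapping from PropBank-style role labels
# to CI parameters. A real system would need domain knowledge to build
# and validate a table like this for each verb of interest.
ROLE_MAP: Dict[str, Dict[str, str]] = {
    "share": {
        "ARG0": "sender",
        "ARG1": "attribute",
        "ARG2": "recipient",
        "ARGM-PRP": "transmission_principle",
    },
}


def roles_to_flow(predicate: str, roles: Dict[str, str]) -> CIFlow:
    """Fill a CIFlow from role-labeled spans; unmapped roles are ignored."""
    mapping = ROLE_MAP.get(predicate, {})
    flow = CIFlow()
    for label, span in roles.items():
        param = mapping.get(label)
        if param is not None:
            setattr(flow, param, span)
    return flow


# Hand-labeled roles for the illustrative statement:
# "We share your email address with advertising partners for marketing purposes."
example_roles = {
    "ARG0": "We",
    "ARG1": "your email address",
    "ARG2": "advertising partners",
    "ARGM-PRP": "for marketing purposes",
}

flow = roles_to_flow("share", example_roles)
print(flow)
```

In this sketch the `subject` parameter stays empty because the statement never names whose data is shared as a separate role, which is one way the incompleteness of privacy statements can surface during extraction.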


