Cookiescanner: An Automated Tool for Detecting and Evaluating GDPR Consent Notices on Websites

09/12/2023
by Ralf Gundelach, et al.

The enforcement of the GDPR led to the widespread adoption of consent notices, colloquially known as cookie banners. Studies have shown that many website operators do not comply with the law: they track users before any interaction with the consent notice, or attempt to trick users into giving consent through dark patterns. Previous research has relied either on manually curated filter lists or on automated detection methods that work only for a subset of websites, which makes research on the GDPR compliance of consent notices tedious or limited in scope. We present cookiescanner, an automated scanning tool that detects and extracts consent notices via various methods and checks whether they offer a decline option or use color diversion. We evaluated cookiescanner on a random sample of the top 10,000 websites listed by Tranco. We found that manually curated filter lists have the highest precision but recall fewer consent notices than our keyword-based methods. Our BERT model achieves high precision for English notices, which is in line with previous work, but suffers from low recall due to insufficient candidate extraction. While the automated detection of decline options proved challenging due to the dynamic nature of many sites, detecting differently colored buttons was successful in most cases. Besides systematically evaluating our various detection techniques, we have manually annotated 1,000 websites to provide a ground-truth baseline, which did not exist previously. Furthermore, we release our code and the annotated dataset in the interest of reproducibility and repeatability.
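
The abstract does not say how cookiescanner decides that two buttons differ in color, so the following is only a minimal sketch of one plausible check, not the authors' implementation. It assumes the background colors of the accept and decline buttons have already been extracted as hex strings, and it flags color diversion when their Euclidean distance in RGB space exceeds a threshold. The function names and the threshold of 50 are hypothetical choices made for illustration.

```python
import math
from typing import Tuple


def parse_hex(color: str) -> Tuple[int, int, int]:
    """Convert '#rgb' or '#rrggbb' into an (r, g, b) tuple of ints."""
    value = color.lstrip("#")
    if len(value) == 3:
        # Expand shorthand such as 'fff' to 'ffffff'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError(f"unsupported color format: {color!r}")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def color_distance(a: str, b: str) -> float:
    """Euclidean distance between two colors in RGB space."""
    return math.dist(parse_hex(a), parse_hex(b))


def uses_color_diversion(accept_bg: str, decline_bg: str,
                         threshold: float = 50.0) -> bool:
    """Hypothetical rule: buttons count as visually different when their
    background colors are further apart than `threshold`."""
    return color_distance(accept_bg, decline_bg) > threshold


if __name__ == "__main__":
    # A green accept button next to a white decline button.
    print(uses_color_diversion("#00aa00", "#ffffff"))
    # Two buttons with identical styling.
    print(uses_color_diversion("#00aa00", "#00aa00"))
```

Plain RGB distance is a crude proxy for perceived difference; a perceptual color space or a contrast-ratio measure would be a reasonable alternative under the same interface.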
