Blocked or Broken? Automatically Detecting When Privacy Interventions Break Websites

03/07/2022
by Michael Smith, et al.

A core problem in the development and maintenance of crowd-sourced filter lists is that their maintainers cannot confidently predict whether (and where) a new filter list rule will break websites. This is a result of the enormity of the Web, which prevents filter list authors from broadly understanding the impact of a new blocking rule before they ship it to millions of users. The inability of filter list authors to evaluate the Web compatibility impact of a new rule before shipping it severely reduces the benefits of filter-list-based content blocking: filter lists are both overly conservative (i.e., rules are tailored narrowly to reduce the risk of breaking things) and error-prone (i.e., blocking tools still break large numbers of sites). To scale to the size and scope of the Web, filter list authors need an automated system that detects when a new filter rule breaks websites before that breakage reaches end users. In this work, we design and implement the first automated system for predicting when a filter list rule breaks a website. We build a classifier, trained on a dataset generated from a combination of compatibility data from the EasyList project and novel browser instrumentation, and find it is accurate to practical levels (AUC 0.88). Our open source system requires no human interaction when assessing the compatibility risk of a proposed privacy intervention. We also present the 40 page behaviors that most predict breakage in observed websites.
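To make the reported metric concrete: AUC is the probability that a classifier assigns a higher score to a randomly chosen positive example (here, a page that breaks) than to a randomly chosen negative one (a page that does not). The sketch below computes it directly from that definition. The function name and the scores are hypothetical illustrations, not data or code from the paper.

```python
from itertools import product


def auc(scores_broken, scores_ok):
    """Fraction of (broken, ok) pairs where the broken page scores higher.

    Ties count as half a win, matching the usual rank-based AUC definition.
    """
    wins = 0.0
    for b, o in product(scores_broken, scores_ok):
        if b > o:
            wins += 1.0
        elif b == o:
            wins += 0.5
    return wins / (len(scores_broken) * len(scores_ok))


# Hypothetical breakage scores from some classifier (not from the paper).
broken = [0.9, 0.8, 0.4]     # pages that a rule actually broke
ok = [0.7, 0.3, 0.2, 0.1]    # pages that kept working

print(round(auc(broken, ok), 3))
```

Under this reading, an AUC of 0.88 means that in roughly 88% of such broken/working pairs the classifier ranks the broken page as the riskier one.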
