Understanding and Benchmarking the Impact of GDPR on Database Systems

10/02/2019
by   Supreeth Shastri, et al.
0

The General Data Protection Regulation (GDPR) was introduced in Europe to offer new rights and protections to people concerning their personal data. We investigate GDPR from a database systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata needs to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify the new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. To gauge how ready the modern database systems are for GDPR, we modify Redis and PostgreSQL to be GDPR compliant. Our evaluations show that this modification degrades their performance by up to 5x. Our results also demonstrate that the current database systems are two to four orders of magnitude worse in supporting GDPR workloads compared to traditional workloads (such as YCSB), and also do not scale as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset