Scalable Discovery and Continuous Inventory of Personal Data at Rest in Cloud Native Systems

09/09/2022
by Elias Grünewald, et al.

Cloud native systems process large amounts of personal data through numerous and possibly multi-paradigmatic data stores (e.g., relational and non-relational databases). From a privacy engineering perspective, a core challenge is to keep track of all exact locations where personal data is stored, as required by regulatory frameworks such as the European General Data Protection Regulation. In this paper, we present Teiresias, comprising i) a workflow pattern for scalable discovery of personal data at rest, and ii) a cloud native system architecture and open source prototype implementation of said workflow pattern. With these, we enable a continuous inventory of personal data featuring transparency and accountability following DevOps/DevPrivOps practices. In particular, we scope version-controlled Infrastructure as Code definitions and cloud-based storage, and show how to integrate the process into CI/CD pipelines. Finally, we provide iii) a comparative performance evaluation demonstrating both appropriate execution times for real-world settings and a promising personal data detection accuracy that outperforms existing proprietary tools in public clouds.
