Large-Scale-Exploit of GitHub Repository Metadata and Preventive Measures

by David Knothe, et al.

When working with Git, a popular version-control system, email addresses are part of the metadata of every commit. When those commits are pushed to remote hosting services like GitHub, the email addresses become visible not only to fellow developers but also to malicious actors aiming to exploit them. As part of our research, we created a tool that leverages the publicly available GitHub API to collect user data. Analysis of this data not only yields millions of email addresses in very little time, but the data is also rich enough to craft targeted phishing attacks, posing a great threat to all GitHub users and their private, potentially sensitive data. Even worse, existing countermeasures fail to protect effectively against such exploits. As a consequence, and as the main conclusion of this paper, we suggest multiple preventive measures that should be implemented as soon as possible. We also consider it the duty of both companies like GitHub and well-informed software engineers to inform fellow developers about the risk of exposing private email addresses in publicly published Git commits.
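To make the exposure concrete from the defender's side, the following is a minimal sketch of how a developer might audit their own local repository for author addresses that would become public on push. It is not the tool described in the paper. The helper names (`is_github_noreply`, `exposed_emails`, `author_emails_from_repo`) are hypothetical, and the pattern assumes GitHub's `username@users.noreply.github.com` and `ID+username@users.noreply.github.com` noreply address forms.

```python
import re
import subprocess

# Assumed shape of GitHub noreply addresses:
#   username@users.noreply.github.com
#   12345+username@users.noreply.github.com
NOREPLY_PATTERN = re.compile(
    r"^(\d+\+)?[^@\s+]+@users\.noreply\.github\.com$", re.IGNORECASE
)


def is_github_noreply(email: str) -> bool:
    """Return True if the address looks like a GitHub noreply address."""
    return NOREPLY_PATTERN.match(email.strip()) is not None


def exposed_emails(emails):
    """Return the sorted, de-duplicated addresses that are NOT noreply.

    These are the addresses a third party could read from commit metadata.
    """
    return sorted(
        {e.strip().lower() for e in emails if e.strip() and not is_github_noreply(e)}
    )


def author_emails_from_repo(repo_path="."):
    """Hypothetical helper: list the author email of every commit in a local repo.

    Uses `git log --format=%ae`, where %ae is the author email placeholder.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ae"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

Calling `exposed_emails(author_emails_from_repo("."))` inside a clone would list the distinct non-noreply author addresses recorded in that repository's history. An empty list suggests that every commit already uses a noreply address.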




