PicoDomain: A Compact High-Fidelity Cybersecurity Dataset

08/20/2020
by   Craig Laprade, et al.
0

Analysis of cyber relevant data has become an area of increasing focus. As larger percentages of businesses and governments begin to understand the implications of cyberattacks, the impetus for better cybersecurity solutions has increased. Unfortunately, current cybersecurity datasets either offer no ground truth or do so with anonymized data. The former leads to a quandary when verifying results and the latter can remove valuable information. Additionally, most existing datasets are large enough to make them unwieldy during prototype development. In this paper we have developed the PicoDomain dataset, a compact high-fidelity collection of Zeek logs from a realistic intrusion using relevant Tools, Techniques, and Procedures. While simulated on a small-scale network, this dataset consists of traffic typical of an enterprise network, which can be utilized for rapid validation and iterative development of analytics platforms. We have validated this dataset using traditional statistical analysis and off-the-shelf Machine Learning techniques.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset