Improving Zero-Day Malware Testing Methodology Using Statistically Significant Time-Lagged Test Samples

08/02/2016
by Konstantin Berlin, et al.

Enterprise networks are in constant danger of being breached by cyber-attackers, but deciding which security tools to deploy to mitigate this risk requires carefully designed evaluation of security products. One of the most important metrics for a protection product is how well it stops malware, specifically "zero-day" malware that has not been seen by the security community before. However, evaluating zero-day performance is difficult, because of the large number of previously unseen samples needed to properly measure the true and false positive rates, and because of the challenges involved in accurately labeling these samples. This paper addresses these issues from a statistical and practical perspective. Our contributions include first showing that the number of benign files needed for proper evaluation is on the order of millions, and the number of malware samples needed is on the order of tens of thousands. We then propose and justify a time-delay method for easily collecting a large number of previously unseen, but labeled, samples. This enables cheap and accurate evaluation of zero-day true and false positive rates. Finally, we propose a more fine-grained labeling of the malware/benignware in order to better model the heterogeneous distribution of files on various networks.
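
As a rough illustration of why the required sample counts land at these orders of magnitude, the sketch below applies the textbook normal-approximation sample-size formula for a binomial proportion. It is not the paper's own derivation: the function name `samples_needed`, the target rates, and the interval half-widths are all assumptions chosen for illustration.

```python
import math


def samples_needed(rate: float, half_width: float, z: float = 1.96) -> int:
    """Approximate number of samples needed to estimate a binomial rate.

    Uses the normal approximation: n = z^2 * p * (1 - p) / h^2, where p is
    the rate expected in advance and h is the desired half-width of the
    confidence interval (z = 1.96 corresponds to roughly 95% confidence).
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    return math.ceil(z ** 2 * rate * (1.0 - rate) / half_width ** 2)


# Assumed scenario (not from the paper): a false positive rate near 1e-4,
# measured to within +/- 2e-5.
benign_needed = samples_needed(rate=1e-4, half_width=2e-5)

# Assumed scenario (not from the paper): a true positive rate near 0.95,
# measured to within +/- 0.003.
malware_needed = samples_needed(rate=0.95, half_width=0.003)

print(f"benign files needed:    {benign_needed:,}")
print(f"malware samples needed: {malware_needed:,}")
```

Under these assumed targets, the benign count comes out near one million and the malware count in the tens of thousands. The asymmetry arises because the false positive rate of interest is very small, so a tight interval around it needs far more benign files. The normal approximation is itself a simplification for very small rates, so an exact binomial interval would be a reasonable substitute.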


research
11/21/2019

The Performance of Machine and Deep Learning Classifiers in Detecting Zero-Day Vulnerabilities

The detection of zero-day attacks and vulnerabilities is a challenging p...
research
09/27/2022

A Benchmark Comparison of Python Malware Detection Approaches

While attackers often distribute malware to victims via open-source, com...
research
12/16/2020

Beyond the Hype: A Real-World Evaluation of the Impact and Cost of Machine Learning–Based Malware Detection

There is a lack of scientific testing of commercially available malware ...
research
05/30/2019

An Efficient Detection of Malware by Naive Bayes Classifier Using GPGPU

Due to continuous increase in the number of malware (according to AV-Tes...
research
04/13/2022

Stealing Malware Classifiers and AVs at Low False Positive Conditions

Model stealing attacks have been successfully used in many machine learn...
research
09/06/2020

Automatic Yara Rule Generation Using Biclustering

Yara rules are a ubiquitous tool among cybersecurity practitioners and a...
research
08/28/2023

AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors

This work presents an evaluation of six prominent commercial endpoint ma...
