TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time

07/20/2018
by   Feargus Pendlebury, et al.

Academic research on machine learning-based malware classification appears to leave very little room for improvement, boasting F_1 performance figures of up to 0.99. Is the problem solved? In this paper, we argue that there is an endemic issue of inflated results due to two pervasive sources of experimental bias: spatial bias is caused by distributions of training and testing data that are not representative of a real-world deployment; temporal bias is caused by incorrect splits of training and testing sets (e.g., in cross-validation) that lead to impossible configurations. To overcome this issue, we propose a set of space and time constraints for experiment design. Furthermore, we introduce a new metric that summarizes the performance of a classifier over time, i.e., its expected robustness in a real-world setting. Finally, we present an algorithm to tune the performance of a given classifier. We have implemented our solutions in TESSERACT, an open-source evaluation framework that allows a fair comparison of malware classifiers in a realistic setting. We used TESSERACT to evaluate two well-known malware classifiers from the literature on a dataset of 129K applications, demonstrating the distortion of results due to experimental bias and showcasing significant improvements from tuning.
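
To make the two sources of bias concrete, the following is a minimal Python sketch of how an experiment might guard against them. It is not TESSERACT's implementation or API: the `Sample` type, the function names `temporal_split` and `downsample_malware`, and the `malware_ratio` parameter are all hypothetical, and the abstract does not state the exact constraints. The sketch assumes that a time constraint means every training sample predates every test sample, and that a space constraint means capping the share of malware in a set at a ratio the experimenter considers realistic.

```python
from dataclasses import dataclass
from datetime import date
import random


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one application in the dataset."""
    name: str
    timestamp: date
    is_malware: bool


def temporal_split(samples, split_date):
    """Split so that all training samples are older than all test samples.

    A random split (as in ordinary cross-validation) can place future
    samples in the training set; splitting on a date avoids that.
    """
    train = [s for s in samples if s.timestamp < split_date]
    test = [s for s in samples if s.timestamp >= split_date]
    return train, test


def downsample_malware(samples, malware_ratio, seed=0):
    """Randomly drop malware until its share is at most malware_ratio.

    malware_ratio is an assumption supplied by the experimenter; it must
    lie strictly between 0 and 1.
    """
    if not 0 < malware_ratio < 1:
        raise ValueError("malware_ratio must be in (0, 1)")
    goodware = [s for s in samples if not s.is_malware]
    malware = [s for s in samples if s.is_malware]
    # m / (m + g) <= r  is equivalent to  m <= r * g / (1 - r)
    max_malware = int(malware_ratio * len(goodware) / (1 - malware_ratio))
    rng = random.Random(seed)
    kept = rng.sample(malware, min(len(malware), max_malware))
    return sorted(goodware + kept, key=lambda s: s.timestamp)
```

In use, one would call `temporal_split` first and then apply `downsample_malware` to the test portion, so that reported performance reflects both a realistic class balance and a chronologically possible train/test configuration.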

research
05/25/2023

A Robust Classifier Under Missing-Not-At-Random Sample Selection Bias

The shift between the training and testing distributions is commonly due...
research
05/31/2022

Dataset Bias in Android Malware Detection

Researchers have proposed kinds of malware detection methods to solve th...
research
09/02/2022

Explainable AI for Android Malware Detection: Towards Understanding Why the Models Perform So Well?

Machine learning (ML)-based Android malware detection has been one of th...
research
05/02/2023

CNS-Net: Conservative Novelty Synthesizing Network for Malware Recognition in an Open-set Scenario

We study the challenging task of malware recognition on both known and n...
research
10/08/2020

Transcending Transcend: Revisiting Malware Classification with Conformal Evaluation

Machine learning for malware classification shows encouraging results, b...
research
07/15/2020

Experimental Design for Bathymetry Editing

We describe an application of machine learning to a real-world computer ...
research
05/09/2023

The Day-After-Tomorrow: On the Performance of Radio Fingerprinting over Time

The performance of Radio Frequency (RF) fingerprinting techniques is neg...
