A Benchmark Comparison of Python Malware Detection Approaches

09/27/2022
by Duc-Ly Vu, et al.

While attackers often distribute malware to victims via open-source, community-driven package repositories, these repositories do not currently run automated malware detection systems. In this work, we explore the security goals of the repository administrators and the requirements for deployments of such malware scanners via a case study of the Python ecosystem and PyPI repository, which includes interviews with administrators and maintainers. Further, we evaluate existing malware detection techniques for deployment in this setting by creating a benchmark dataset and comparing several existing tools, including the malware checks implemented in PyPI, Bandit4Mal, and OSSGadget's OSS Detect Backdoor. We find that repository administrators have exacting technical demands for such malware detection tools. Specifically, they consider a false positive rate of even 0.01% too high, given the number of releases that might trigger false alerts. Measured tools have false positive rates of at least 15%, and reducing this rate renders the true positive rate useless. In some cases, these checks emitted alerts more often for benign packages than malicious ones. However, we also find a successful socio-technical malware detection system: external security researchers perform repository malware scans and report the results to repository administrators. These parties face different incentives and constraints on their time and tooling. We conclude with recommendations for improving detection capabilities and strengthening the collaboration between security researchers and software repository administrators.
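To make the scale argument concrete, the sketch below works through how a false positive rate translates into alert volume when almost all scanned releases are benign. The release count, the fraction of malicious releases, and the true positive rate are hypothetical values chosen only for illustration; they are not figures from the paper. Only the two false positive rates (0.01% and 15%) echo the abstract.

```python
def expected_alerts(n_releases, malicious_fraction, tpr, fpr):
    """Return (true_alerts, false_alerts, precision) for a scanner.

    tpr and fpr are fractions in [0, 1], not percentages.
    """
    n_malicious = n_releases * malicious_fraction
    n_benign = n_releases - n_malicious
    true_alerts = n_malicious * tpr    # malicious releases correctly flagged
    false_alerts = n_benign * fpr      # benign releases wrongly flagged
    total = true_alerts + false_alerts
    precision = true_alerts / total if total else 0.0
    return true_alerts, false_alerts, precision


# Hypothetical workload (assumptions, not measurements from the paper).
N_RELEASES = 1_000_000       # releases scanned
MALICIOUS_FRACTION = 0.001   # share of releases that are malicious
TPR = 0.9                    # scanner's true positive rate

# 0.01% and 15% expressed as fractions.
for fpr in (0.0001, 0.15):
    tp, fp, precision = expected_alerts(N_RELEASES, MALICIOUS_FRACTION, TPR, fpr)
    print(f"FPR {fpr:.2%}: {tp:.0f} true alerts, "
          f"{fp:.0f} false alerts, precision {precision:.1%}")
```

Under these assumed numbers, the false alerts at the higher rate far outnumber the true ones, which is one way to read the administrators' concern about reviewer workload.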
