Practical Automated Detection of Malicious npm Packages

02/28/2022
by Adriana Sejfia, et al.

The npm registry is one of the pillars of the JavaScript and TypeScript ecosystems, hosting over 1.7 million packages ranging from simple utility libraries to complex frameworks and entire applications. Its overwhelming popularity has made npm a prime target for malicious actors, who publish new packages or compromise existing ones to introduce malware that tampers with or exfiltrates sensitive data from users who install either these packages or any package that (transitively) depends on them. Defending against such attacks is essential to maintaining the integrity of the software supply chain, but the sheer volume of package updates makes comprehensive manual review infeasible. We present Amalfi, a machine-learning-based approach for automatically detecting potentially malicious packages that combines three complementary techniques. We start with classifiers trained on known examples of malicious and benign packages. If a classifier flags a package as malicious, we then check whether the package includes metadata about its source repository and, if so, whether the package can be reproduced from its source code. Packages that are reproducible from source are not usually malicious, so this step allows us to weed out false positives. Finally, we employ a simple textual clone-detection technique to identify copies of malicious packages that the classifiers may have missed, reducing the number of false negatives. Amalfi improves on the state of the art in that it is lightweight, requiring only a few seconds per package to extract features and run the classifiers, and gives good results in practice: running it on 96,287 package versions published over the course of one week, we identified 95 previously unknown malware samples, with a manageable number of false positives.
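The three stages described in the abstract can be pictured as a small triage function. The sketch below is illustrative only and is not Amalfi's implementation: the names `content_hash`, `is_reproducible`, and `triage` are hypothetical, the classifier is passed in as an opaque callable, and exact hash equality stands in for both the reproducibility check and the textual clone detection, whose real details the abstract does not give.

```python
import hashlib
from typing import Callable, Dict, Optional, Set

# Assumption: a package is modelled as a mapping from relative path to file bytes.
Package = Dict[str, bytes]


def content_hash(pkg: Package) -> str:
    """Hash all file paths and contents in a stable (sorted) order."""
    h = hashlib.sha256()
    for path in sorted(pkg):
        name = path.encode("utf-8")
        data = pkg[path]
        # Length prefixes keep the path/content boundaries unambiguous.
        h.update(len(name).to_bytes(8, "big"))
        h.update(name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def is_reproducible(published: Package, rebuilt: Optional[Package]) -> bool:
    """True if a rebuild from the source repository matches the published files.

    `rebuilt` is None when the package has no usable repository metadata.
    """
    return rebuilt is not None and content_hash(published) == content_hash(rebuilt)


def triage(
    published: Package,
    rebuilt: Optional[Package],
    classifier: Callable[[Package], bool],
    known_malicious_hashes: Set[str],
) -> str:
    """Combine the three stages into a single verdict string."""
    if classifier(published):
        # Stage 2: reproducibility from source weeds out false positives.
        if is_reproducible(published, rebuilt):
            return "likely-false-positive"
        return "flagged"
    # Stage 3: catch copies of known malware that the classifier missed.
    if content_hash(published) in known_malicious_hashes:
        return "clone-of-known-malware"
    return "not-flagged"
```

For example, calling `triage` with a classifier that always returns `False` still reports a package whose hash is already in `known_malicious_hashes`, which mirrors how clone detection is meant to reduce false negatives.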


