Continuous Learning for Android Malware Detection

02/08/2023
by   Yizheng Chen, et al.

Machine learning methods can detect Android malware with very high accuracy. However, these classifiers have an Achilles heel, concept drift: they rapidly become out of date and ineffective as malware apps and benign apps evolve. Our research finds that, after training an Android malware classifier on one year's worth of data, the F1 score quickly dropped from 0.99 to 0.76 after 6 months of deployment on new test samples. In this paper, we propose new methods to combat the concept drift problem of Android malware classifiers. Since machine learning techniques need to be continuously deployed, we use active learning: we select new samples for analysts to label, and then add the labeled samples to the training set to retrain the classifier. Our key idea is that similarity-based uncertainty is more robust against concept drift. Therefore, we combine contrastive learning with active learning. We propose a new hierarchical contrastive learning scheme and a new sample selection technique to continuously train the Android malware classifier. Our evaluation shows that this leads to significant improvements compared to previously published methods for active learning. Our approach reduces the false negative rate from 16% [...] maintaining the same false positive rate (0.6%) [...] more consistent performance across a seven-year time period than past methods.
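
The active learning loop described in the abstract (score new samples, send the most uncertain ones to an analyst, add the labels to the training set) can be illustrated with a minimal sketch. This is not the paper's method: it assumes each app has already been mapped to an embedding vector by some encoder, and the uncertainty score here is a toy stand-in that combines distance to the nearest labeled samples with label disagreement among them. All function names (`uncertainty`, `select_for_labeling`, `active_learning_round`) and the `analyst` callback are hypothetical.

```python
import math


def nearest_neighbors(x, labeled, k):
    """Return the k labeled (vector, label) pairs closest to x."""
    return sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]


def uncertainty(x, labeled, k=3):
    """Toy similarity-based uncertainty (an assumption, not the paper's score).

    The score is high when x is far from its nearest labeled neighbors
    or when those neighbors disagree on the label (0 = benign, 1 = malware).
    """
    neighbors = nearest_neighbors(x, labeled, k)
    mean_dist = sum(math.dist(x, v) for v, _ in neighbors) / len(neighbors)
    malware_frac = sum(label for _, label in neighbors) / len(neighbors)
    disagreement = 1.0 - abs(2.0 * malware_frac - 1.0)  # 0 = unanimous, 1 = even split
    return mean_dist + disagreement


def select_for_labeling(unlabeled, labeled, budget, k=3):
    """Pick the `budget` most uncertain unlabeled samples."""
    ranked = sorted(unlabeled, key=lambda x: uncertainty(x, labeled, k), reverse=True)
    return ranked[:budget]


def active_learning_round(unlabeled, labeled, analyst, budget, k=3):
    """One round: select samples, ask the analyst for labels, extend the training set."""
    chosen = select_for_labeling(unlabeled, labeled, budget, k)
    return labeled + [(x, analyst(x)) for x in chosen]
```

In a deployment like the one the abstract describes, the returned training set would then be used to retrain the classifier before the next batch of samples arrives. The labeling budget per round is the main knob, since analyst time is the scarce resource.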

research
08/03/2018

Stimulation and Detection of Android Repackaged Malware with Active Learning

Repackaging is a technique that has been increasingly adopted by authors...
research
09/18/2023

Efficient Concept Drift Handling for Batch Android Malware Detection Models

The rapidly evolving nature of Android apps poses a significant challeng...
research
04/19/2017

Semi-supervised classification for dynamic Android malware detection

A growing number of threats to Android phones creates challenges for mal...
research
01/20/2022

Android Malware Detection using Feature Ranking of Permissions

We investigate the use of Android permissions as the vehicle to allow fo...
research
08/09/2022

Robust Machine Learning for Malware Detection over Time

The presence and persistence of Android malware is an on-going threat th...
research
11/06/2017

Computer activity learning from system call time series

Using a previously introduced similarity function for the stream of syst...
research
08/21/2023

Neural Networks Optimizations Against Concept and Data Drift in Malware Detection

Despite the promising results of machine learning models in malware dete...
