Efficient Concept Drift Handling for Batch Android Malware Detection Models

09/18/2023
by Molina-Coronado B., et al.

The rapidly evolving nature of Android apps poses a significant challenge to the static batch machine learning algorithms employed in malware detection systems, which quickly become obsolete. Despite this, the existing literature pays limited attention to the issue, and many advanced Android malware detection approaches, such as Drebin, DroidDet and MaMaDroid, rely on static models. In this work, we show how retraining techniques can maintain detector capabilities over time. In particular, we analyze the effect of two aspects on the efficiency and performance of the detectors: 1) the frequency with which the models are retrained, and 2) the data used for retraining. In the first experiment, we compare periodic retraining with a more advanced concept drift detection method that triggers retraining only when necessary. In the second experiment, we analyze sampling methods that reduce the amount of data used to retrain models. Specifically, we compare fixed-size windows of recent data with state-of-the-art active learning methods that select the apps that help keep the training dataset small but diverse. Our experiments show that concept drift detection and sample selection mechanisms yield very efficient retraining strategies that can be successfully used to maintain the performance of state-of-the-art static Android malware detectors in changing environments.
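To make the idea of drift-triggered retraining over a fixed-size window more concrete, the following is a minimal sketch in Python. It is not the detector or the sampling method evaluated in the paper: it assumes a Page-Hinkley test on per-period error rates as a stand-in drift detector, and the names `PageHinkley`, `drift_triggered_retraining`, `delta`, `threshold` and `window_size` are hypothetical choices made for illustration only.

```python
from collections import deque


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    Assumption: used here only as an illustrative drift detector.
    """

    def __init__(self, delta=0.005, threshold=2.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0     # running mean of the observations
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def drift_triggered_retraining(error_stream, window_size=3, **ph_kwargs):
    """Return (retrain_steps, window) for a stream of per-period error rates."""
    detector = PageHinkley(**ph_kwargs)
    window = deque(maxlen=window_size)  # fixed-size window of recent periods
    retrain_steps = []
    for step, err in enumerate(error_stream):
        window.append(step)  # stand-in for storing that period's labelled apps
        if detector.update(err):
            # A real system would refit the model on the window contents here.
            retrain_steps.append(step)
            detector.reset()
    return retrain_steps, list(window)


# Synthetic error rates: stable at 5%, then a jump to 60% at step 20.
errors = [0.05] * 20 + [0.60] * 5
steps, recent = drift_triggered_retraining(errors, window_size=3, threshold=1.0)
print(steps, recent)
```

On this synthetic stream, no retraining is requested during the stable period, and a single alarm fires shortly after the jump. A periodic strategy would instead retrain at every step regardless of whether the error rate had changed, which is the efficiency gap the first experiment examines. The `deque` with `maxlen` silently discards the oldest periods, so the training set never grows beyond `window_size` periods.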
