Explainable AI for Android Malware Detection: Towards Understanding Why the Models Perform So Well?

09/02/2022
by Yue Liu, et al.

Machine learning (ML)-based Android malware detection has been one of the most popular research topics in the mobile security community. An increasing number of research studies have demonstrated that machine learning is an effective and promising approach for malware detection, and some works have even claimed that their proposed models could achieve 99% detection accuracy, leaving little room for further improvement. However, numerous prior studies have suggested that unrealistic experimental designs bring substantial biases, resulting in over-optimistic performance in malware detection. Unlike previous research that examined the detection performance of ML classifiers to locate the causes, this study employs Explainable AI (XAI) approaches to explore what ML-based models learned during the training process, inspecting and interpreting why ML-based malware classifiers perform so well under unrealistic experimental settings. We discover that temporal sample inconsistency in the training dataset brings over-optimistic classification performance (up to 99% F1 score and accuracy). Importantly, our results indicate that ML models classify malware based on temporal differences between malware and benign samples, rather than on the actual malicious behaviors. Our evaluation also confirms that unrealistic experimental designs lead not only to unrealistic detection performance but also to poor reliability, posing a significant obstacle to real-world applications. These findings suggest that XAI approaches should be used to help practitioners and researchers better understand how AI/ML models (e.g., for malware detection) work, rather than focusing only on accuracy improvement.
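To make the notion of temporal sample inconsistency concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each app carries a first-seen date and a binary label; the toy records, `median_year_gap`, and `temporal_split` are hypothetical names introduced only for illustration. The sketch measures how far apart in time the two classes sit, then applies a time-aware split in which training apps strictly precede test apps.

```python
from datetime import date
from statistics import median

# Hypothetical toy records: (app_id, first_seen_date, label).
# label 1 = malware, label 0 = benign. These are illustrative, not real data.
samples = [
    ("m1", date(2012, 3, 1), 1),
    ("m2", date(2012, 9, 15), 1),
    ("m3", date(2013, 1, 20), 1),
    ("b1", date(2018, 5, 2), 0),
    ("b2", date(2019, 2, 11), 0),
    ("b3", date(2019, 8, 30), 0),
]

def median_year_gap(records):
    """Absolute gap, in years, between the median first-seen dates of the two classes."""
    mal = [d.toordinal() for _, d, y in records if y == 1]
    ben = [d.toordinal() for _, d, y in records if y == 0]
    return abs(median(mal) - median(ben)) / 365.25

def temporal_split(records, cutoff):
    """Time-aware split: train on apps first seen before `cutoff`, test on the rest."""
    train = [r for r in records if r[1] < cutoff]
    test = [r for r in records if r[1] >= cutoff]
    return train, test

gap = median_year_gap(samples)
train, test = temporal_split(samples, date(2015, 1, 1))

print(f"median gap between classes: {gap:.1f} years")
print("train labels:", sorted({y for _, _, y in train}))
print("test labels:", sorted({y for _, _, y in test}))
```

In this toy dataset the time-aware split leaves only malware in the training set and only benign apps in the test set, which is one simple way such an inconsistency would surface. A random split over the same records would mix the two periods, so any feature that correlates with release date could separate the classes without reflecting malicious behavior.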


