Do Gradient-based Explanations Tell Anything About Adversarial Robustness to Android Malware?

05/04/2020
by Marco Melis, et al.

Machine-learning algorithms trained on features extracted from static code analysis can successfully detect Android malware. However, these approaches can be evaded by sparse evasion attacks that produce adversarial malware samples in which only a few features are modified. This can be achieved, e.g., by injecting a small set of fake permissions and system calls into the malicious application, without compromising its intrusive functionality. To improve adversarial robustness against such sparse attacks, learning algorithms should avoid providing decisions that rely upon only a small subset of discriminant features; otherwise, manipulating even a few of them may suffice to evade detection. Previous work has shown that classifiers which avoid overemphasizing a few discriminant features tend to be more robust against sparse attacks, and has developed simple metrics to help identify and select more robust algorithms. In this work, we investigate whether gradient-based attribution methods, used to explain classifiers' decisions by identifying the most relevant features, can also serve this purpose. Our intuition is that a classifier providing more uniform, evenly distributed attributions should rely upon a larger set of features, instead of overemphasizing a few of them, and thus be more robust against sparse attacks. We empirically investigate the connection between gradient-based explanations and adversarial robustness in a case study on Android malware detection, and show that, in some cases, there is a strong correlation between the distribution of such explanations and adversarial robustness. We conclude by discussing how our findings may enable the development of more efficient mechanisms both to evaluate and to improve adversarial robustness.
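To make this intuition concrete, the following minimal sketch (our illustration, not the paper's reference implementation) trains a logistic-regression detector on synthetic binary features, computes gradient*input attributions, summarizes their spread with a Gini-based evenness score, and counts how many greedy feature flips a sparse attacker needs to evade detection. The synthetic data, the evenness definition, and the greedy attack are all simplifying assumptions made for illustration; the paper's exact metrics and attack algorithms may differ.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for a static-analysis feature matrix: 200 apps, 10 binary
# features (e.g., requested permissions / system calls). Entirely synthetic.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10)).astype(float)
w_true = np.zeros(10)
w_true[:2] = 3.0                       # only two truly discriminant features
y = (X @ w_true + rng.normal(0, 0.5, size=200) > 3.0).astype(int)

clf = LogisticRegression().fit(X, y)

def gradient_input(clf, x):
    """Gradient*input attribution; for a linear model this is w_j * x_j."""
    return clf.coef_.ravel() * x

def evenness(attr):
    """1 minus the Gini coefficient of |attributions|: close to 1 when
    relevance is spread over many features, lower when concentrated."""
    a = np.sort(np.abs(attr))
    if a.sum() == 0:
        return 1.0
    cum = np.cumsum(a) / a.sum()
    n = a.size
    return 1.0 - (n + 1 - 2 * cum.sum()) / n

def flips_to_evade(clf, x, budget=10):
    """Greedy sparse-attack sketch on a sample classified as malicious (1):
    flip the binary features that most decrease the score until evasion."""
    w = clf.coef_.ravel()
    x_adv = x.copy()
    # Score reduction from flipping feature j: w_j if x_j == 1, else -w_j.
    order = np.argsort(-np.where(x_adv == 1, w, -w))
    for k, j in enumerate(order[:budget], start=1):
        x_adv[j] = 1.0 - x_adv[j]
        if clf.predict(x_adv[None, :])[0] == 0:
            return k
    return None  # not evaded within the flip budget

x = X[clf.predict(X) == 1][0]          # pick one detected "malware" sample
print("evenness:", round(evenness(gradient_input(clf, x)), 3))
print("flips needed to evade:", flips_to_evade(clf, x))

In this toy setup, relevance concentrates on the two discriminant features, so the evenness score comes out low and only a handful of flips suffice to evade; a detector with more evenly distributed attributions would force the attacker to modify many more features, which is exactly the connection the abstract hypothesizes.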

Related research

Explaining Black-box Android Malware Detection (03/09/2018)
Machine-learning models have been recently used for detecting malicious ...

Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning Approach (04/20/2019)
The widespread adoption of smartphones dramatically increases the risk o...

Improving Robustness of Malware Classifiers using Adversarial Strings Generated from Perturbed Latent Representations (10/22/2021)
In malware behavioral analysis, the list of accessed and created files v...

Adversarial Patterns: Building Robust Android Malware Classifiers (03/04/2022)
Deep learning-based classifiers have substantially improved recognition ...

GANG-MAM: GAN based enGine for Modifying Android Malware (09/27/2021)
Malware detectors based on machine learning are vulnerable to adversaria...

DroidRL: Reinforcement Learning Driven Feature Selection for Android Malware Detection (03/05/2022)
Due to the completely open-source nature of Android, the exploitable vul...

On Training Robust PDF Malware Classifiers (04/06/2019)
Although state-of-the-art PDF malware classifiers can be trained with al...
