On labeling Android malware signatures using minhashing and further classification with Structural Equation Models
Multi-scanner Antivirus systems provide insightful information on the nature of a suspect application; however there is often a lack of consensus and consistency between different Anti-Virus engines. In this article, we analyze more than 250 thousand malware signatures generated by 61 different Anti-Virus engines after analyzing 82 thousand different Android malware applications. We identify 41 different malware classes grouped into three major categories, namely Adware, Harmful Threats and Unknown or Generic signatures. We further investigate the relationships between such 41 classes using community detection algorithms from graph theory to identify similarities between them; and we finally propose a Structure Equation Model to identify which Anti-Virus engines are more powerful at detecting each macro-category. As an application, we show how such models can help in identifying whether Unknown malware applications are more likely to be of Harmful or Adware type.
READ FULL TEXT