A3Ident: A Two-phased Approach to Identify the Leading Authors of Android Apps

08/31/2020
by Wei Wang, et al.

Authorship identification is the process of identifying and classifying authors from given code. It can be used in a wide range of software domains, e.g., resolving code authorship disputes, detecting plagiarism, and exposing attackers' identities. Beyond the inherent challenges of legacy software development, the framework-based programming and crowdsourcing model of Android make authorship identification significantly harder. More specifically, widespread third-party libraries and inherited components (e.g., classes, methods, and variables) dilute the primary code within the entire Android app and blur the boundaries between code written by different authors. Prior research has not addressed these challenges well. To this end, we design a two-phased approach to attribute the primary code of an Android app to a specific developer. In the first phase, we put forward three types of strategies to identify the relationships between Java packages in an app: context, semantic, and structural relationships. A package aggregation algorithm is developed to cluster all packages that were, with high probability, written by the same authors. In the second phase, we develop three types of features to capture authors' coding habits and code stylometry. Based on these features, we generate fingerprints for an author from the Android apps they developed and employ several machine learning algorithms for authorship classification. We evaluate our approach on three datasets that contain 15,666 apps from 257 distinct developers and achieve an accuracy of 92.5%. Additionally, we test it on 2,900 obfuscated apps, which our approach classifies with an accuracy rate of 80.4%.
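The first-phase idea of clustering packages that are likely written by the same authors can be illustrated with a minimal sketch. This is not the paper's package aggregation algorithm: it assumes the pairwise relationship scores have already been computed by some other means, and the function name `aggregate_packages`, the `relations` input format, and the `threshold` value are all hypothetical choices made for illustration. The sketch simply merges packages whose score meets the threshold, using a union-find structure.

```python
from collections import defaultdict


def aggregate_packages(packages, relations, threshold=0.5):
    """Group packages linked by a sufficiently strong relationship.

    packages:  iterable of package names.
    relations: dict mapping (pkg_a, pkg_b) -> score in [0, 1].
               Hypothetical input; every name used here must also
               appear in `packages`.
    threshold: hypothetical cut-off above which two packages are merged.
    """
    # Each package starts as the root of its own cluster.
    parent = {p: p for p in packages}

    def find(p):
        # Walk up to the cluster root, shortening the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Merge the clusters of any two packages with a strong enough link.
    for (a, b), score in relations.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each cluster under its root.
    clusters = defaultdict(set)
    for p in parent:
        clusters[find(p)].add(p)

    # Largest cluster first; under this sketch's assumptions it is the
    # natural candidate for the app's primary code.
    return sorted(clusters.values(), key=len, reverse=True)


# Made-up example data: three app packages and two library packages.
example_packages = [
    "com.example.app",
    "com.example.app.ui",
    "com.example.util",
    "com.google.ads",
    "com.google.ads.internal",
]
example_relations = {
    ("com.example.app", "com.example.app.ui"): 0.9,
    ("com.example.app", "com.example.util"): 0.7,
    ("com.google.ads", "com.google.ads.internal"): 0.8,
    ("com.example.util", "com.google.ads"): 0.2,  # weak link, not merged
}
example_clusters = aggregate_packages(example_packages, example_relations)
```

With the made-up data above, the three `com.example` packages end up in one cluster and the two `com.google.ads` packages in another, because the only link between the two groups falls below the threshold. Transitive merging is a deliberate simplification here; a real system would likely need a more careful criterion to avoid chaining unrelated packages together.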

