Scalable Online Vetting of Android Apps for Measuring Declared SDK Versions and Their Consistency with API Calls

12/30/2019, by Daoyuan Wu, et al., Singapore Management University and The Chinese University of Hong Kong

Android has been the most popular smartphone system, with multiple platform versions active in the market. To manage an application's compatibility with one or more platform versions, Android allows apps to declare the supported platform SDK versions in their manifest files. In this paper, we conduct a systematic study of this modern software mechanism. Our objective is to measure the current practice of declared SDK versions (which we term as DSDK versions afterwards) in real apps, and the (in)consistency between DSDK versions and their host apps' API calls. To successfully analyze a modern dataset of 22,687 popular apps (with an average app size of 25MB), we design a scalable approach that operates on the Android bytecode level and employs a lightweight bytecode search for app analysis. This approach achieves a good performance suitable for online vetting in app markets, requiring only around 5 seconds to process an app on average. Besides shedding light on the characteristics of DSDK in the wild, our study quantitatively measures two side effects of inappropriate DSDK versions: (i) around 50% apps under-set the minimum DSDK versions and could incur runtime crashes, but fortunately, only 11.3% apps could crash on Android 6.0 and above; (ii) around 2% apps, due to under-claiming the targeted DSDK versions, are potentially exploitable by remote code execution, and a half of them invoke the vulnerable API via embedded third-party libraries. These results indicate the importance and difficulty of declaring correct DSDK, and our work can help developers fulfill this goal.


1 Introduction

Recent years have witnessed the extraordinary success of Android, a smartphone operating system owned by Google. At the end of 2013, Android became the best-selling phone and tablet OS. By 2015, Android had evolved into the operating system with the largest installed base. Over these years, Android has kept leading the global smartphone market with a share of over 80% 24. Along with the fast evolution of Android, its fragmentation problem has become more and more serious. Although new devices ship with recent Android versions, there is still a huge number of existing devices running old Android versions Android .

To better manage an application’s compatibility across multiple platform versions, Android allows apps to declare the supported platform SDK versions in their manifest files. We term these declared SDK versions as DSDK versions. The DSDK mechanism is a modern software mechanism that, to the best of our knowledge, few systems other than Android are equipped with. Nevertheless, it has so far received little attention, and little is known about its effectiveness.

In this paper, we aim to conduct a systematic study on the Android DSDK mechanism. Specifically, our objective is to measure the current practice of DSDK versions in real apps, and the (in)consistency between DSDK versions and their host apps’ API calls. To make our measurement results representative, we select popular apps that have at least one million installs each on Google Play as the dataset. More specifically, we have collected a large-scale dataset of 22,687 popular apps (570.8GB in total, with an average app size of 25MB), which covers 90.2% of all such apps (both free and paid ones) available on Google Play. Furthermore, our study utilizes the latest Android API evolution and covers all 28 versions of Android SDKs or API levels (the latest Android version at the time of our writing is Android 9, i.e., API level 28).

After selecting the dataset and building the API-SDK mapping, we perform a systematic DSDK and API call analysis of each individual app. Our approach is designed to be scalable and robust so that it can be readily deployed by online app markets (e.g., Google Play) to timely notify developers of the DSDK inconsistency in their apps. Given this objective, dataflow-based analysis is not suitable because existing Android dataflow analyses (notably FlowDroid Arzt et al. (2014) and Amandroid Wei et al. (2014)) are expensive even when analyzing medium-sized apps, e.g., requiring 4 minutes for the 8MB Nextcloud app (https://f-droid.org/en/packages/com.nextcloud.client/) He et al. (2018). Moreover, they need to first transform or decompile Android app bytecode into an intermediate representation (e.g., Soot Jimple or Java bytecode), a process that is not fully accurate Octeau et al. (2012) and often leaves some apps unanalyzable in many previous studies Yang et al. (2015) Avdiienko et al. (2015) Mariconti et al. (2017) Pan et al. (2017).

In our approach, we thus operate on the original Android bytecode level and employ a lightweight bytecode search for app analysis. Specifically, we retrieve DSDK versions and API calls directly from each app without decoding the manifest file and without transforming app bytecode, which enables robust processing of all 22,687 popular apps. We also handle multidex 18, a special Android bytecode mechanism that is often skipped by prior works but common in modern apps: 5,008 apps in our dataset split their bytecodes into multiple files. With the correctly extracted app bytecodes, we then search these bytecode texts to obtain valid API calls that are not guarded by VERSION.SDK_INT checking (developers can use such if statements to invoke an API only on certain Android platforms) and also not in uninvoked third-party libraries. In this way, our approach preserves scalability and is suitable for online vetting: the median and average time for analyzing an app in our dataset are only 4.75s and 5.39s, respectively.

Theoretically, our lightweight approach is less accurate than dataflow-based approaches. This is because we do not perform (the expensive) flow tracking, so false positives certainly appear. Fortunately, this limitation does not affect the real usage of our approach, since, per our objective, the approach is used by online app markets to check apps uploaded by developers. In other words, we can ask developers to manually check the inconsistency warnings in their apps. Moreover, the manual effort required for such checking is limited: around 80% apps have fewer than 10 potentially inconsistent API calls each. This indicates that the number of inconsistency warnings per app reported by our bytecode search is well manageable for developers to perform a one-time manual check. It is worth noting that this paper is not about bug detection; instead, we aim for a comprehensive study of the current DSDK practice and its potential impacts. By employing a lightweight yet conservative approach, we can maximize the coverage of valid code and thus minimize false negatives (dataflow tracking is sometimes too tight and could fail to process complex implicit flows; e.g., as many as 13 different kinds of implicit flows are missed by FlowDroid according to a recent systematic assessment Bonett et al. (2018)).

In a nutshell, our study sheds light on the current DSDK practice of app developers and quantitatively measures two side effects caused by the inconsistency between DSDK versions (configured by app developers in the manifest file) and API calls (made by the app during its execution). Specifically, the compatibility effect occurs when a minimum DSDK version is set so low that certain APIs do not even exist on the corresponding lower versions of Android platforms. The consequence of such a compatibility effect is runtime crashes. Additionally, the security effect happens when a target DSDK version is outdated (i.e., a lower version is declared even though the device actually runs a later version of Android), causing a vulnerable API behavior to still be rendered by the underlying system when the app runs on higher versions of Android. Next, we present our three sets of measurement results on DSDK versions and their inconsistency with API calls. Note that due to the conservative nature of our approach, the measurement results reported in this paper represent an upper bound of all potential DSDK problems (under the condition that common analysis difficulties, such as native code, are not considered).

Firstly, our measurement reveals some interesting characteristics of declared SDK versions in the wild. Specifically, nearly all apps define the minSdkVersion attribute, but 4.76% apps still do not claim the targetSdkVersion attribute (in our dataset obtained in 2018). Fortunately, this percentage has significantly dropped from 16.54% in 2015, which indicates that DSDK attributes nowadays are more widely adopted in modern apps. We further find that the minimal platform version most apps support nowadays is Android 4.1, whereas the most popular targeted platform version is Android 8.0. The median version difference between targetSdkVersion and minSdkVersion also increases from 8 in our last analysis in 2015 to 9 currently in the 2018 dataset.

Secondly, in terms of compatibility inconsistency, we first find that around 50% apps under-set the minSdkVersion value, causing them to crash when running on lower versions of Android platforms. Fortunately, only 11.3% apps could crash on Android 6.0 and above. We also show that by employing bytecode search for SDK_INT checking, our approach can reduce 17.3% false positives of compatibility inconsistency results. A detailed analysis of Android APIs that incur compatibility inconsistency further reveals that some API classes, such as view, webkit, and system manager related classes, are commonly misused.

Thirdly, our analysis of security inconsistency shows that around 2% apps still set an outdated targetSdkVersion attribute when a common WebView API is vulnerable, making them exploitable by remote code execution. In particular, around a half of these vulnerable apps invoke the vulnerable addJavascriptInterface() API call because of their embedded third-party libraries. Moreover, our bytecode search of the addJavascriptInterface() invocation also helps reduce 12.2% false positives.

To summarize, we highlight the contributions of this paper as follows:

  • (New problem) To the best of our knowledge, we are the first to conduct a systematic study on DSDK, a modern software mechanism that allows apps to declare the supported platform SDK versions. We also give the first demystification of the DSDK mechanism and its two side effects on compatibility and security. In particular, our preliminary conference version of this work Wu et al. (2017b) has motivated several recent follow-up works Li et al. (2018) He et al. (2018) on bug detection.

  • (Novel approach) We propose a robust and scalable approach that operates directly on the original bytecode level and leverages lightweight bytecode search to timely notify developers of the DSDK inconsistency in their apps. The evaluation using 22,687 popular apps (with an average app size as large as 25MB) shows that our approach achieves a good performance suitable for online app vetting, requiring only around 5 seconds to process an app on average.

  • (New findings) Our measurement study obtains three major new findings, including (i) 4.76% apps still do not claim the targetSdkVersion attribute, although this percentage has significantly dropped from 2015 to 2018, (ii) around 50% apps under-set the minimum DSDK versions and could incur runtime crashes, but fortunately, only 11.3% apps could crash on Android 6.0 and above, and (iii) around 2% apps, due to under-claiming the targeted DSDK versions, are potentially exploitable by remote code execution, and a half of them actually invoke the vulnerable API via embedded third-party libraries.

In this journal article, we extend our preliminary conference version Wu et al. (2017b) from the following perspectives: (1) We integrate a lightweight bytecode search into our approach so that it can be deployed by online app markets to timely notify developers of the DSDK inconsistency in their apps. We also add the support for multidex-based apps and enhance the detection of uninvoked third-party libraries. (2) We evolve our dataset from an old set of 23,125 random apps in 2015 to a recent set of 22,687 popular apps in 2018. We also find a lightweight way to build the latest API-SDK mapping. (3) By running experiments using the improved approach and dataset, we obtain more representative results and compare some of our new findings with the previous ones.

2 Demystifying Declared SDK Versions and Their Two Side Effects

In this section, we first demystify declared platform SDK versions in Android apps, and then explain their two side effects if inappropriate DSDK versions are used. Note that DSDK is different from the typical compilation SDK, which is only for compiling apps while DSDK is mainly for interpreting run-time API behaviors.

2.1 Declared SDK Versions in Android Apps

Listing 1 illustrates how to declare supported platform SDK versions in Android apps by defining the <uses-sdk> element 45 in apps’ manifest files (i.e., AndroidManifest.xml 44). These DSDK versions are for the runtime Android system to check apps’ compatibility, which is different from the compiling-time SDK for compiling source codes. The value of each DSDK version is an integer, which represents the API level of the corresponding SDK. For example, if a developer wants to declare Android SDK version 5.0, she can set its value to 21. Since each API level has a precise mapping of the corresponding SDK version Android , we do not use another term, declared API level, to represent the same meaning of DSDK throughout this paper.

<uses-sdk android:minSdkVersion="integer"
          android:targetSdkVersion="integer"
          android:maxSdkVersion="integer" />
Listing 1: The syntax for declaring platform SDK versions in Android apps.

We explain the three DSDK attributes as follows:

  • The minSdkVersion integer specifies the minimum platform API level required for an app to run. The Android system refuses to install an app if its minSdkVersion value is greater than the system’s API level. Note that if an app does not declare this attribute, the system by default assigns the value of “1”, which means that the app can be installed in all versions of Android.

  • The targetSdkVersion integer designates the platform API level that an app targets. An important implication of this attribute is that Android adopts the backward-compatible API behaviors of the declared target SDK version, even when the app is running on a higher version of the Android platform. Android makes this design compromise because it aims to guarantee the same app behaviors as expected by developers, even when apps run on newer platforms. It is worth noting that if this attribute is not set, the value of minSdkVersion is used by default (a minimal code sketch of these rules follows this list).

  • The maxSdkVersion integer specifies the maximum platform API level on which an app can run. However, this attribute is not recommended and has been deprecated since Android 2.1 (API level 7). Modern Android no longer checks or enforces this attribute during app installation or re-validation. The only effect is that Google Play continues to use this attribute as a filter when it presents users a list of applications available for downloading. Note that if this attribute is not set, it implies no restriction on the maximum platform API level.
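Below is a minimal sketch (our illustration, not the actual Android platform code) of how these attributes are interpreted: the installer rejects an app whose minSdkVersion exceeds the device's API level, while the framework consults targetSdkVersion at run time to decide whether to keep a backward-compatible behavior.

// A minimal sketch (not the actual platform code) of how the DSDK attributes
// are interpreted at install and run time, following the rules described above.
import android.content.pm.ApplicationInfo;
import android.os.Build;

public class DsdkRules {
    // Install-time check: the system refuses an app whose minSdkVersion exceeds
    // the device's API level; maxSdkVersion is no longer enforced here.
    static boolean canInstall(int minSdkVersion) {
        return minSdkVersion <= Build.VERSION.SDK_INT;
    }

    // Run-time check used to decide whether a backward-compatible (possibly
    // vulnerable) behavior should be kept for this app.
    static boolean useLegacyBehavior(ApplicationInfo app, int behaviorChangedAtLevel) {
        return app.targetSdkVersion < behaviorChangedAtLevel;
    }
}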

2.2 Two Side Effects of Inappropriate DSDK Versions

Fig. 1 illustrates two side effects of inappropriate DSDK versions. We first explain the symbols used, and then describe the two side effects. As shown in Fig. 1, we can obtain minSdkVersion, targetSdkVersion, and maxSdkVersion from an app manifest file. Based on the API calls of an app, we can calculate the minimum and maximum API levels it requires, i.e., minLevel and maxLevel. Eventually, the app will be deployed to a range of Android platforms between minSdkVersion and maxSdkVersion (or the latest Android version if maxSdkVersion is not set).


Figure 1: Illustrating the two side effects of inappropriate DSDK versions.

2.2.1 Side Effect I: Causing Runtime Crashes

The blue part of Fig. 1 shows two scenarios in which inappropriate DSDK versions could cause compatibility-related inconsistency. The first scenario is minLevel > minSdkVersion, which means a new API is introduced after the declared minSdkVersion. Consequently, when an app runs on Android platforms between minSdkVersion and minLevel (marked as block 1 in Fig. 1), it will crash. We verified this case by using VpnService class’s addDisallowedApplication() API, which was introduced on Android 5.0 at API level 21. We invoked this API in the MopEye app Wu et al. (2017a) and ran it on an Android 4.4 device. When the app executed the addDisallowedApplication() API call, it crashed with the java.lang.NoSuchMethodError exception.
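The following is a minimal sketch (assumed code, not MopEye's actual implementation) of the crashing pattern just described: invoking an API-level-21 method without any SDK_INT guard.

// A minimal sketch of the crashing pattern: addDisallowedApplication() was added
// at API level 21, so invoking it without a VERSION.SDK_INT guard throws
// java.lang.NoSuchMethodError on Android 4.4 and below.
import android.content.pm.PackageManager;
import android.net.VpnService;

public class UnguardedVpnService extends VpnService {
    void configureTunnel() {
        VpnService.Builder builder = new VpnService.Builder();
        try {
            // Crashes the process with NoSuchMethodError on pre-5.0 devices.
            builder.addDisallowedApplication("com.example.excluded"); // hypothetical package name
        } catch (PackageManager.NameNotFoundException e) {
            // Only reachable on API 21+, when the named package is not installed.
        }
    }
}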

The second scenario is maxLevel < maxSdkVersion, which means an old API is removed at maxLevel. Although it looks like the app would crash when it runs on Android platforms between maxLevel and maxSdkVersion, it turns out that Google intentionally keeps forward compatibility (by keeping the removed APIs in the framework as hidden APIs) so that developers have no concern about over-setting maxSdkVersion. As a result, this scenario would not cause runtime method availability errors. Therefore, in this paper we measure only the first scenario of compatibility inconsistency, which can cause runtime crashes.

2.2.2 Side Effect II: Making Apps Vulnerable

The red part of Fig. 1 shows the scenario in which inappropriate DSDK versions prevent the app from being patched. Suppose that an app calls an API whose implementation is vulnerable at the declared targetSdkVersion. Even when the app runs on an updated Android system (with an API level higher than targetSdkVersion), Android still exhibits the backward-compatible behavior, i.e., the vulnerable implementation of the API at targetSdkVersion.

Vulnerable APIs/Components | Patched SDKs (Android) | Changed Behavior
file:// scheme in WebView | 4.1+ | Fix flawed same-origin policy Wu and Chang (2014)
Content Provider component | 4.2+ | Disable the default exposure 8
addJavascriptInterface() | 4.2+ | Stop Java reflection for RCE 15
PreferenceActivity class | 4.4+ | Add isValidFragment() for apps to prevent Fragment Hijacking 23
javascript: in WebView | 4.4+ | JavaScript URLs are executed in a separate WebView context Mutchler et al. (2016)
Context.bindService() | 5.0+ | Do not accept implicit Intents 42

Table 1: Vulnerable APIs or components on Android and their patched versions.

Table 1 summarizes previously reported vulnerable APIs or components on Android and their patched versions. In this paper, we choose to particularly measure the vulnerable addJavascriptInterface() API for two reasons. First, it has a clear API pattern for inconsistency measurement, while other cases in Table 1 involve multiple component-level factors that could cause a vulnerability. Second, the addJavascriptInterface() API gives rise to the most serious security issue Drake (2014). By exploiting this API, attackers are able to inject malicious code, which can cause remote code execution (e.g., stealing sensitive information from a victim app or SD card). Google later fixed this weakness on Android 4.2 and above. However, if an app sets the targetSdkVersion to be lower than 17 and calls this API, the system will still render the vulnerable API behavior even when running on Android 4.2+. Such vulnerable app examples are available at https://sites.google.com/site/androidrce/.
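For illustration, below is a minimal sketch (an assumed example, not code from any measured app) of the vulnerable pattern: an object is exposed to page JavaScript via addJavascriptInterface(). On Android 4.2+, the reflection-based attack is blocked only if the app's targetSdkVersion is at least 17, in which case just the methods annotated with @JavascriptInterface are exposed.

// A minimal sketch of the vulnerable pattern. With targetSdkVersion < 17, every
// public method of the injected object is reachable from page JavaScript via
// reflection, enabling remote code execution even on Android 4.2+.
import android.webkit.JavascriptInterface;
import android.webkit.WebView;

public class JsBridge {
    @JavascriptInterface          // honored only when targetSdkVersion >= 17
    public String version() { return "1.0"; }

    public static void attach(WebView webView) {
        webView.getSettings().setJavaScriptEnabled(true);
        // Exposes this object to the page under the global name "bridge".
        webView.addJavascriptInterface(new JsBridge(), "bridge");
    }
}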

3 Methodology

To understand how DSDK versions are used in the wild and the pervasiveness of the two side effects in real apps, we propose an automatic approach for a systematic measurement. In this section, we first present an overview of our methodology, and then its two main analysis phases.

3.1 Overview

The major design goal of our approach is to help app markets timely notify developers of the DSDK inconsistency in their apps. Fig. 2 illustrates its overall design, where the app analysis part is conducted in the online phase. Since our app analysis requires the API-SDK mapping as an input (for calculating the API levels of all valid API calls in an app), we further conduct Android API document analysis to build a mapping between each Android API and its corresponding SDK versions (or API levels). As this step is performed only once, we include it in the offline phase.

The major part of our approach is designed for the online vetting of apps. Specifically, whenever developers upload a new or updated app to app markets, we first unzip the app to obtain its bytecode DEX file(s). We then launch manifest analysis to robustly retrieve the app’s declared SDK versions. For bytecode analysis, our novelty is to propose a lightweight bytecode search, instead of heavyweight dataflow analysis, to extract valid API calls. Finally, we leverage the API-SDK mapping to calculate the range of the corresponding API levels from the API calls, and compare them with the declared SDK versions. The output is the (in)consistency results between declared SDK versions and API calls. It is worth noting that multiple-apk analysis Wu et al. (2017b) is no longer needed in our online analysis, because app markets control all versions of APKs and the multiple-apk mechanism is largely used for different hardware configurations 37.


Figure 2: The overview of our methodology.

3.2 Offline Phase: API Document Analysis

In this subsection, we present our offline phase in detail, including both methodology and results of API document analysis.

Figure 3: The distribution of added Android APIs across different SDK versions.
Figure 4: The distribution of removed Android APIs across different SDK versions.
Figure 5: The distribution of deprecated Android APIs across different SDK versions.

Building the API-SDK mapping. There are two potential approaches for building the API-SDK mapping. One is to analyze Android API documents by parsing an SDK document called api-versions.xml. A previous API study McDonnell et al. (2013) and our preliminary study Wu et al. (2017b) leveraged this approach to obtain initial and added APIs, but they did not cover removed and deprecated APIs because such information was absent from the api-versions.xml file at that time. They thus also needed to analyze the HTML files in the api_diff directory, which is unfortunately error-prone Wu et al. (2017b). The other approach is to directly retrieve the API-SDK mapping from each SDK jar file. However, different SDK releases under the same API level may have some API differences, and there are over 600 releases (see the tags in https://android.googlesource.com/platform/frameworks/base.git/+refs) for 28 API levels at the time of our writing. As a result, conflicting API mappings could be recorded, e.g., marking the Gravity.getAbsoluteGravity API as removed in SDK version 16 and then added back in version 17 Li et al. (2018).

Fortunately, we find that the first approach now covers all kinds of Android APIs. Specifically, the latest api-versions.xml file released in Android 9 SDK records all added, removed, and deprecated APIs. Therefore, we can simply parse this file to obtain a complete API-SDK mapping.
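A minimal parsing sketch is given below. We assume the lint-style schema of api-versions.xml, in which <class> and <method> elements carry since and removed attributes (deprecated can be handled in the same way); the exact attribute names should be checked against the file shipped with the SDK.

// A minimal sketch of building the API-SDK mapping from api-versions.xml,
// under the assumed schema described above.
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ApiSdkMapping {
    public static Map<String, int[]> build(String xmlPath) throws Exception {
        Map<String, int[]> mapping = new HashMap<>();  // "class.method" -> {since, removed}
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new File(xmlPath)).getDocumentElement();
        NodeList classes = root.getElementsByTagName("class");
        for (int i = 0; i < classes.getLength(); i++) {
            Element cls = (Element) classes.item(i);
            int clsSince = attr(cls, "since", 1);
            NodeList methods = cls.getElementsByTagName("method");
            for (int j = 0; j < methods.getLength(); j++) {
                Element m = (Element) methods.item(j);
                int since = attr(m, "since", clsSince);     // inherit the class-level value
                int removed = attr(m, "removed", 100000);   // large flag value if never removed
                mapping.put(cls.getAttribute("name") + "." + m.getAttribute("name"),
                        new int[]{since, removed});
            }
        }
        return mapping;
    }

    private static int attr(Element e, String name, int dflt) {
        String v = e.getAttribute(name);
        return v.isEmpty() ? dflt : Integer.parseInt(v);
    }
}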

Document analysis results. With the accurate API-SDK mapping, we are able to present a comprehensive evolution of Android APIs across different SDK versions. Fig. 3, 4, and 5 plot the distribution of added, removed, and deprecated Android APIs from API level 2 to the very recent API level 29, respectively. Overall, we find that 26,466 (67.8%) out of a total of 39,034 Android APIs have changed. This result indicates that Android APIs evolve dramatically during the whole evolution.

The biggest change in the Android API evolution is the addition of 23,542 APIs since level 2, as shown in Fig. 3. Specifically, Android 7.0 (API level 24) changed the most, with 3,627 new APIs introduced. Android 8.0 (API level 26) and Android 5.0 (API level 21) also introduce a significant number of new APIs, with 3,218 and 2,581 APIs added, respectively. Other platform versions with a large number of added APIs are Android 3.0 (API level 11), Android 6.0 (API level 23), and Android 9.0 (API level 28). These new Android APIs bring a huge risk of compatibility inconsistency, causing runtime crashes on lower versions of Android. In particular, we notice that over half (13,306, 56.5%) of all added APIs have been introduced since Android 5.0, giving them a higher chance of causing compatibility inconsistency than the remaining added APIs.

In contrast, only 4,830 (18.2%) APIs involve a removal change (i.e., removed or deprecated; some of them are also introduced after API level 2), with 3,671 APIs deprecated and 2,902 APIs eventually removed. According to Fig. 4 and 5, the biggest removal happens in Android 5.1 and 6.0 (API levels 22 and 23), with 1,359 APIs deprecated and 1,307 APIs removed afterwards. Moreover, Android 9.0 (API level 28) deprecates 507 APIs and its next version (API level 29) removes 504 of them, which suggests that Google plans to remove a large number of APIs in the release after Android 9.0. Additionally, although Android 4.1 (API level 16) deprecated 559 APIs, only 222 APIs were removed in the subsequent Android 4.2 and 4.3.

To sum up, 23,542 (60.3%) of all the 39,034 Android APIs are introduced at an SDK version other than the initial Android SDK version (i.e., API level 1), which brings a high risk for developers to under-set the minSdkVersion attribute. On the other hand, much fewer Android APIs, 7.4% of all APIs, are mapped to a range of SDK versions that has an upper limit (i.e., removed in recent SDK versions).

3.3 Online Phase: Android App Analysis

In this subsection, we present three major modules in the online analysis phase, namely manifest analysis, bytecode search, and consistency comparison in Fig. 2.

3.3.1 Retrieving DSDK Versions via Manifest Analysis

To robustly retrieve DSDK versions from all apps, we propose a new manifest analysis method that leverages aapt (Android Asset Packaging Tool) 1 to retrieve DSDK versions directly from each app without extracting and decoding the manifest file. This method is more robust than the traditional apktool-based manifest extraction 7, which requires extracting and decoding the manifest into a plaintext file. Indeed, our aapt-based approach can successfully analyze all 22,687 apps, whereas a previous work Wu et al. (2014) showed that apktool failed six times in the analysis of just 1K apps. Specifically, we utilize the dump badging command in aapt to extract the DSDK versions. In this way, we can directly retrieve the correct DSDK versions without analyzing raw manifest files. Therefore, even if an app contains old or unreferenced manifest files, our analysis is not affected.

In the course of implementation and evaluation, we observed and handled two kinds of special cases. First, some apps define minSdkVersion multiple times, for which we only extract the first value. Second, we apply the default rules (see Sec. 2.1) for apps without minSdkVersion and targetSdkVersion defined. More specifically, we set the value of minSdkVersion to 1 if it is not defined, and set the value of targetSdkVersion (if it is not defined) using the minSdkVersion value.

Besides retrieving DSDK, our manifest analysis also parses all components registered in the manifest to generate a list of valid components and their root (Java) class names. This information will be used in the app analysis module to exclude uninvoked third-party libraries. Specifically, we execute the dump xmltree command in aapt to output all component information. In the process of parsing these components, we also generate their root class names according to this rule: if the component class does not overlap with the app package or <application> name (i.e., this class could be from a third-party library), we record the entire class name as the root class; otherwise, only the leading two or three name portions are treated as the root class.
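The sketch below illustrates this manifest analysis under our assumptions: aapt dump badging typically prints lines such as sdkVersion:'16' and targetSdkVersion:'26', and the default rules of Sec. 2.1 are applied when an attribute is missing. This is an illustrative reconstruction, not the exact tool code.

// A minimal sketch of retrieving DSDK versions via "aapt dump badging".
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ManifestAnalysis {
    public static int[] dsdk(String apkPath) throws Exception {
        Process p = new ProcessBuilder("aapt", "dump", "badging", apkPath)
                .redirectErrorStream(true).start();
        Integer min = null, target = null;
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (min == null && line.startsWith("sdkVersion:'")) {            // keep the first value only
                    min = Integer.parseInt(line.substring(12, line.length() - 1));
                } else if (target == null && line.startsWith("targetSdkVersion:'")) {
                    target = Integer.parseInt(line.substring(18, line.length() - 1));
                }
            }
        }
        int minSdk = (min == null) ? 1 : min;                // default rule for minSdkVersion
        int targetSdk = (target == null) ? minSdk : target;  // default rule for targetSdkVersion
        return new int[]{minSdk, targetSdk};
    }
}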

3.3.2 Extracting Valid API Calls via Bytecode Search

The main module in our app analysis is to extract valid API calls. A valid API call should not be guarded by the VERSION.SDK_INT checking (a mechanism developers can use to invoke an API only in certain Android platforms). It also should not be in uninvoked third-party libraries that are essentially dead code. To guarantee the scalability for online vetting, we propose a lightweight bytecode search, instead of dataflow-based approaches, for app analysis, because existing Android dataflow analyses, notably FlowDroid Arzt et al. (2014) and Amandroid Wei et al. (2014), are expensive even when analyzing medium-sized apps, e.g., requiring 4 minutes for just an app of size 8MB He et al. (2018).

Moreover, we operate on the original Android bytecode level without decompiling app bytecodes for minimizing false negatives. This is because the process of transforming or decompiling Android app bytecode into an intermediate representation (usually Java bytecode) is not fully accurate Octeau et al. (2012). As a result, many previous studies Yang et al. (2015) Avdiienko et al. (2015) Mariconti et al. (2017) Pan et al. (2017) often failed to handle some apps, causing false negatives in their analysis. In contrast, by directly analyzing app bytecodes, we robustly process all 22,687 popular apps in our dataset. Specifically, we leverage the dexdump tool 16 to translate compressed bytecodes into plain bytecode texts (similar to using objdump to generate assembly code texts), upon which we can then launch bytecode search to extract valid API calls. Note that dexdump, as an official Android SDK tool, is very robust, and it does not generate intermediate representation. We also dump (multiple) app bytecodes into a (combined) plaintext Wu et al. (2019) to handle multidex 18, a special bytecode format often skipped by prior works but indeed common in modern apps — 5,008 apps in our dataset split their bytecodes into multiple files. Hence, we avoid another common source of false negatives.

In the rest of this subsection, we first introduce the basic bytecode search mechanism before describing our bytecode search of VERSION.SDK_INT checking and vulnerable API calls in details. We then explain how we exclude uninvoked third-party libraries during the search process.


Figure 6: A high-level overview of our bytecode search mechanism.

The basic bytecode search mechanism. Fig. 6 shows a high-level overview of our bytecode search mechanism. The bytecode text outputted by dexdump is a sequence of code statements, hierarchically organized into different class and method bodies. In Fig. 6, we show six method bodies (from method A to method F), where their corresponding class bodies are omitted for simplicity. As illustrated in the figure, our bytecode search scans these methods to locate inconsistent API calls (e.g., call sites i1 and i2 in methods A and C, respectively) and vulnerable API calls (e.g., call site v1 in method F). We can perform a further search to determine in which class a method of interest is invoked; e.g., Fig. 6 shows that method F (containing vulnerable API call v1) is called by another method D. Besides the method search, we can also launch an if statement search to locate conditional checking, e.g., statement c1 that surrounds call site i2 in method C.
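A minimal sketch of this search is shown below. It assumes the dexdump -d text layout in which each method's instruction block starts with a line containing a "|[" marker followed by the fully qualified method name, and each call site is an invoke-* line containing the API signature (e.g., Landroid/webkit/WebView;.addJavascriptInterface:); the exact layout may differ across SDK versions.

// A minimal sketch of scanning dexdump text for call sites of a given API.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class BytecodeSearch {
    // Returns "host method -> invoke line" pairs for every call site of apiSignature.
    public static List<String> findCallSites(String dexPath, String apiSignature) throws Exception {
        Process p = new ProcessBuilder("dexdump", "-d", dexPath)
                .redirectErrorStream(true).start();
        List<String> hits = new ArrayList<>();
        String currentMethod = "<unknown>";
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                int marker = line.indexOf("|[");                 // assumed start-of-method marker
                if (marker >= 0) {
                    currentMethod = line.substring(marker + 2);  // e.g., "0004b8] com.example.Foo.bar:()V"
                } else if (line.contains("invoke-") && line.contains(apiSignature)) {
                    hits.add(currentMethod + " -> " + line.trim());
                }
            }
        }
        return hits;
    }
}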

Searching VERSION.SDK_INT checking. As mentioned earlier in this subsection, developers can use if statements with VERSION.SDK_INT checking to invoke an API only in certain Android platforms, thus avoiding the inconsistency problem. Listing 2 shows an example of VERSION.SDK_INT checking, which invokes the addDisallowedApplication() API (introduced since API level 21) only on Android 5.0 and above. To avoid such false positives, our approach must handle the VERSION.SDK_INT checking.

VpnService.Builder builder = new VpnService.Builder();
if (VERSION.SDK_INT >= VERSION_CODES.LOLLIPOP) {
    builder.addDisallowedApplication(Constant.PkgName);
}
Listing 2: An example of VERSION.SDK_INT checking.

Our strategy is to search for both the API call and the VERSION.SDK_INT checking, and to see whether the two search results overlap in the same method. For example, in Fig. 6, our bytecode search locates both checking statement c1 and API call i2 in method C. Since these two search results overlap and API call i2 is invoked below checking statement c1, we are confident that this API call has been guarded by a corresponding VERSION.SDK_INT check. Moreover, according to a recent study He et al. (2018), 88.65% of the DSDK checking usages directly compare the VERSION.SDK_INT variable with a constant Android version number, which makes our bytecode search strategy appropriate.
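The overlap check can be sketched as follows (an illustrative simplification of our strategy, not precise control-flow analysis): within a single method body, an API call is treated as guarded if a read of Landroid/os/Build$VERSION;.SDK_INT:I appears before the call site.

// A minimal sketch of the SDK_INT guard check within one method body.
import java.util.List;

public class SdkIntGuardCheck {
    static final String SDK_INT_FIELD = "Landroid/os/Build$VERSION;.SDK_INT:I";

    // methodLines: the dexdump text lines of a single method body, in order.
    static boolean isGuarded(List<String> methodLines, String apiSignature) {
        boolean sdkIntSeen = false;
        for (String line : methodLines) {
            if (line.contains(SDK_INT_FIELD)) {
                sdkIntSeen = true;          // SDK_INT is read in this method
            } else if (line.contains(apiSignature)) {
                return sdkIntSeen;          // guarded only if the read came before the call
            }
        }
        return false;                       // API not found in this method: nothing to report
    }
}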

Searching vulnerable API calls. For a vulnerable API call, we further employ bytecode search to determine whether it is initialized by app’s own code or library code. This is particularly important for the vulnerable API studied in this paper, namely addJavascriptInterface(), because a previous study has shown that over 47% of top 40 ad libraries create their Javascript Interfaces 27. Specifically, after locating vulnerable API call v1 in method F, we further search the invocation(s) of method F to check its origin class.

Excluding uninvoked third-party libraries. An important issue during our bytecode search is to exclude uninvoked third-party libraries. To tackle this problem, we cannot simply employ library detection (e.g., LibScout Backes et al. (2016) and LibD Li et al. (2017)) to exclude all libraries, because this approach also ignores valid library code. On the other hand, expensive dataflow-based analysis does not satisfy the objective of online vetting in app markets. Therefore, we choose to first spot all potentially inconsistent API candidates via a lightweight approach and then rely on developers to manually reduce the false positives among them. Specifically, our lightweight approach combines both heuristics-based component analysis and API-based bytecode search. First, for the component analysis, we consider all components registered in the manifest, including those from third-party libraries. As mentioned in Sec. 3.3.1, we generate root classes for all registered components via manifest analysis. A class that does not fall under any root class is thus from an uninvoked third-party library or dead code. Even for a valid third-party library, only its registered components will be analyzed, because not all code in a library is invoked by the main app. Furthermore, when a candidate API call is reported, we launch bytecode search to double-check its invocations.
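A minimal sketch of the root-class filtering (names and structure are ours) is given below: a class is searched only if it falls under one of the root class prefixes derived from the app package and the registered components.

// A minimal sketch of the root-class heuristic described above.
import java.util.Set;

public class LibraryFilter {
    private final Set<String> rootClasses;  // e.g., "com.example.app", "com.adnet.sdk.AdActivity" (hypothetical)

    public LibraryFilter(Set<String> rootClasses) {
        this.rootClasses = rootClasses;
    }

    public boolean isValidCode(String className) {
        for (String root : rootClasses) {
            if (className.equals(root) || className.startsWith(root + ".")) {
                return true;   // belongs to the app package or a registered component's root class
            }
        }
        return false;          // likely an uninvoked library class or dead code; excluded from the search
    }
}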

3.3.3 Calculating API Levels and Comparing Their Consistency with DSDKs

With the extracted API calls, we use the API-SDK mapping to compute the range of their corresponding API levels (i.e., from minLevel to maxLevel, as explained in Sec. 2.2). The minLevel of an app is the maximum of all its valid API calls’ corresponding minLevel values (i.e., all correspondingly added SDK versions), while the maxLevel of an app is the minimum of all valid API calls’ corresponding maxLevel values (i.e., all correspondingly removed SDK versions). If an API is never removed, we set a large flag value (e.g., 100,000) to represent its maxLevel value.

We then compare the extracted DSDK values with the calculated API levels to obtain the following two kinds of inconsistency (as previously mentioned in Sec. 2.2):

  • minSdkVersion < minLevel: the minSdkVersion is set too low, and the app would crash when it runs on platform versions between minSdkVersion and minLevel.

  • targetSdkVersion < maxLevel: the targetSdkVersion is set too low and could be updated to maxLevel. If maxLevel is infinite (i.e., no invoked API has been removed), the targetSdkVersion could be adjusted to the latest Android version.
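To illustrate the consistency comparison described above, the following minimal sketch (names are ours) computes minLevel and maxLevel from the API-SDK mapping entries of an app's valid API calls and reports the two kinds of inconsistency.

// A minimal sketch of the API-level calculation and consistency comparison.
// Each API maps to {addedLevel, removedLevel}, where removedLevel is a large
// flag value (100,000) if the API was never removed.
import java.util.Collection;

public class ConsistencyCheck {
    static final int NEVER_REMOVED = 100000;

    public static String check(int minSdkVersion, int targetSdkVersion,
                               Collection<int[]> apiLevels /* {added, removed} per valid API call */) {
        int minLevel = 1, maxLevel = NEVER_REMOVED;
        for (int[] api : apiLevels) {
            minLevel = Math.max(minLevel, api[0]);   // maximum of all added levels
            maxLevel = Math.min(maxLevel, api[1]);   // minimum of all removed levels
        }
        StringBuilder report = new StringBuilder();
        if (minSdkVersion < minLevel) {
            report.append("Compatibility inconsistency: may crash on API levels ")
                  .append(minSdkVersion).append("-").append(minLevel - 1).append(". ");
        }
        if (targetSdkVersion < maxLevel) {
            report.append("targetSdkVersion could be raised toward ")
                  .append(maxLevel == NEVER_REMOVED ? "the latest level" : String.valueOf(maxLevel));
        }
        return report.toString();
    }
}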

4 Evaluation

Our evaluation aims to answer the following five research questions:

RQ1

What are the current DSDK characteristics in popular real-world apps?

RQ2

How pervasive is the compatibility-related inconsistency in real-world apps?

RQ3

How pervasive is the security-related inconsistency in real-world apps?

RQ4

How scalable is our inconsistency detection approach?

RQ5

What is the updatability of the buggy apps? Are they still being maintained?

We choose popular real-world apps, instead of randomly selected apps or open-source apps, for evaluation, because they are the apps most likely installed by regular users. Hence, the obtained measurement results can reflect the DSDK practice in the wild. In this section, we first describe how we collected such a large dataset in Sec. 4.1. Based on this dataset, we then answer the five research questions from Sec. 4.2 to Sec. 4.6.

4.1 Dataset

We collect popular apps on Google Play via the AndroZoo repository Allix et al. (2016), which contained a total of 3,699,731 unique Google Play apps (an app is unique if its package name, rather than its SHA1/256 hash, differs from those of other apps) at the time of our crawling on 11 November 2018. However, AndroZoo does not provide the app install counts, which are required to determine the popularity of each app. To quickly locate popular apps, we leverage the top app lists available at https://www.androidrank.org. Specifically, we crawled the top 1,000 apps in each Google Play category (49 categories in total, including 17 different game sub-categories), and recorded the package names of apps with over one million installs. This allowed us to obtain a list of 25,144 popular apps, 22,687 of which (the rest are either paid apps or not indexed by AndroZoo) are available on AndroZoo. We thus downloaded these 22,687 apps as our dataset.

(a) 32 non-game app categories.
(b) 17 game app categories.
Figure 7: Bar charts of the distribution of popular apps across different categories.

To understand these popular apps’ distribution across different app categories, we plot bar charts in Fig. 7 that cover both 32 non-game app categories and 17 game sub-categories. In particular, 17 game sub-categories contribute to a total of 10,695 popular apps, which indicates that game apps are commonly installed by real-world Android users. According to Fig. 7, app categories like “Personalization”, “Tools”, “Photography”, “Entertainment”, and “Music” also produce a large number of popular apps, almost 1K popular apps per category. We notice that daily-used categories, such as “Communication” and “Social”, however, do not generate an equivalent number of popular apps, with only 600 to 700 popular apps. This is because in these categories, several very popular apps, e.g., WeChat and Facebook, dominate a large portion of the market share. Lastly, it is also reasonable for some unpopular categories, such as “Medical” and “Libraries & Demo”, to have a limited number of popular apps.

It is also important to measure the distribution of app sizes in our dataset. Fig. 8 plots the CDF (cumulative distribution function) of the APK file size of each app in our dataset. We can see that over 40% apps have a size larger than 20MB, and over 20% apps are even larger than 40MB each. This indicates that a significant portion of modern apps are no longer small. Indeed, the average app size in our dataset is 25MB, much larger than the size of apps used in several prior dataflow analysis studies (e.g., apps below 5MB were evaluated in AppContext Yang et al. (2015), and the maximum app size in IctApiFinder He et al. (2018) is 8MB). Therefore, scalability is a key design objective for our approach, and we evaluate it extensively in Sec. 4.5.


Figure 8: CDF plot of the APK file size of each app in our dataset.

4.2 RQ1: Characteristics of Declared SDK Versions in the Wild

In this section, we report a total of four findings regarding RQ1. We also compare these new findings with our previous results in Wu et al. (2017b), which measured a dataset of 22.7K apps crawled in 2015.

Finding 1-1: Nearly all apps define the minSdkVersion attribute, but there are still 4.76% apps not claiming the targetSdkVersion attribute, although this percentage has significantly dropped compared to our prior analysis in 2015. Table 2 shows the number and percentage of non-defined DSDK attributes in our dataset. We can see that nearly all apps define the minSdkVersion attribute while nearly no apps define the maxSdkVersion attribute. This result is good because, as we described in Section 2.1, defining minSdkVersion is necessary while defining maxSdkVersion is not. However, we also notice that there are still 1,079 (4.76%) apps not claiming the targetSdkVersion attribute, which causes their targetSdkVersion values to be set to the corresponding minSdkVersion values by default.

Fortunately, the percentage of non-defined targetSdkVersion has significantly dropped compared to our prior analysis in 2015, from 16.54% to 4.76%. One important factor is the popularity of Android Studio in recent years, which has become the de-facto IDE (integrated development environment) for Android app development. Since Android Studio by default sets and enforces the minSdkVersion and targetSdkVersion attributes, the percentage of non-defined targetSdkVersion naturally drops and we expect that this percentage will further decrease with more apps getting updated.

Attribute | # Non-defined | % Non-defined
minSdkVersion | 5 | 0.02%
targetSdkVersion | 1,079 | 4.76%
maxSdkVersion | 22,623 | 99.72%

Table 2: The number and percentage of non-defined DSDK attributes in our dataset.

Finding 1-2: Some targetSdkVersion attributes are set to outlier values. We find that a total of 45 apps in our dataset declare their targetSdkVersion attributes as outlier values, a finding close to that of our prior analysis in 2015, when we encountered 55 such cases. There are two classes of outlier values. The first is that targetSdkVersion is set to an API level not in the range of released SDKs. At the time of our analysis, the valid API levels are from 1 to 28 (Android 9.0). However, 12 apps set their targetSdkVersion to larger than 28, namely 29, 30, and 31. In our prior analysis Wu et al. (2017b), there was even one app with its targetSdkVersion value set to 10000. This is probably because their developers want to always target the latest Android SDK.

The other class of outliers is that the targetSdkVersion value is set lower than the minSdkVersion value. Normally, targetSdkVersion should be greater than or equal to minSdkVersion, but 33 apps violate this (i.e., their version difference is negative). This number is almost the same as that in our prior analysis in 2015 (34 apps at that time). In particular, there was one app (com.leftover.CoinDozer) that defines its targetSdkVersion as 0, although its minSdkVersion value is 8. We believe that this class of outliers is due to developers’ mistakes in declaring the DSDK attributes.

Finding 1-3: The minimal platform version most apps support is Android 4.1, whereas the most targeted platform version is Android 8.0. This has dramatically evolved since our last analysis in 2015. We first study the distribution of minSdkVersion. According to Fig. 9, the majority (89%) of apps have a minSdkVersion lower than or equal to level 16 (Android 4.1), which means that they can run on nearly all (99.5%) Android devices in the market nowadays Android . Specifically, the minimal platform version most apps support is Android 4.1 (level 16), while that in our last analysis in 2015 was only Android 2.3 (level 9). However, Android 2.3 still ranks second, with 3,614 apps setting it as their minSdkVersion. Besides Android 4.1 and 2.3, the two Android 4.0.x platform versions (levels 14 and 15) are also commonly defined as apps’ minSdkVersion.


Figure 9: Distribution of minSdkVersion.

On the other hand, Fig. 10 plots the distribution of targetSdkVersion. We can see that 80% apps set their targetSdkVersion values to larger than or equal to level 19 (Android 4.4). In particular, the two most targeted platform versions are the recent Android 8.0 (level 26) and 8.1 (level 27), while those in our last analysis in 2015 were Android 4.4 and 5.0. This suggests that modern apps keep better pace with the evolution of the Android operating system. Besides Android 8, Android 6.0 (level 23) and 4.4 (level 19) still hold a significant portion of apps with the corresponding targetSdkVersion setting. Moreover, Android 7.x (levels 24 and 25) and Android 5.x (levels 21 and 22) also attract a considerable number of apps.


Figure 10: Distribution of targetSdkVersion.

Finding 1-4: The median version difference between targetSdkVersion and minSdkVersion is 9, while that of our last analysis was 8. This 11% increase indicates that Android apps nowadays need to support more Android platforms. We define a new metric called lagSdkVersion to measure the version difference between targetSdkVersion and minSdkVersion, as shown in Equation 1.

lagSdkVersion = targetSdkVersion - minSdkVersion    (1)

After removing the negative lagSdkVersion values (i.e., the outliers mentioned in Finding 1-2), we draw the CDF plot of lagSdkVersion in Fig. 11. We first find that the median value of lagSdkVersion in our dataset is 9, while that of our last analysis in 2015 was 8. This indicates that Android apps nowadays need to support more Android platforms. This conclusion can be further supported by measuring the percentage of apps that have a lagSdkVersion value greater than 12. Compared to our prior analysis, this percentage has increased from 5% to 20%, which clearly shows that more and more apps nowadays support a wide range of Android platforms. On the other hand, the percentage of apps that have the same value for targetSdkVersion and minSdkVersion has dropped from 20% in 2015 to 6.4% in 2018.


Figure 11: CDF plot of lagSdkVersion.

4.3 RQ2: Inconsistency Results with Compatibility Effect

In this section, we report three important findings regarding RQ2. Besides presenting the compatibility results as the major finding, we also summarize the false positives reduced by our bytecode search compared to the prior conference version, and show in detail which newly added API classes are common sources of compatibility inconsistency.

Finding 2-1: Around 50% apps under-set the minSdkVersion value, causing them to potentially crash when running on lower versions of Android platforms. Fortunately, only 11.3% apps could crash on Android 6.0 and above. As explained in Sec. 3.3.3, the compatibility inconsistency happens if minSdkVersion is less than minLevel. In our experiments, we therefore count the number of API calls that have a higher API level than minSdkVersion in each app, and denote it by minOverNum. The higher an app’s minOverNum is, the more likely this app has the compatibility inconsistency.

Fig. 12 shows the CDF plot of minOverNum in each app. We find that 14,363 (63.3%) apps have at least one API call with a higher API level than the corresponding minSdkVersion. To further increase the confidence of our analysis, we count that 8,019 (35.4%) apps invoke over five different API calls with higher API levels than the corresponding minSdkVersion. Therefore, we estimate that around 50% apps could crash when running on lower versions of Android platforms because they under-set the minSdkVersion value. Fortunately, we find that the number of inconsistency warnings per app reported by our bytecode search is well manageable for developers: 77.8% of the 14,363 apps have fewer than 10 different inconsistent API calls. It is thus not difficult for developers to perform a one-time manual check.


Figure 12: CDF plot of minOverNum in each app.


Figure 13: Bar chart of the number of apps in each minLevel.

Fortunately, apps with compatibility inconsistency issues could crash only on certain Android platforms. More specifically, they could crash only on versions of platforms between minSdkVersion and minLevel, as illustrated earlier in Sec. 2.2. Therefore, it is necessary to study on which Android platforms those buggy apps could crash, because nowadays some lower versions of Android hold a limited market share, e.g., only 11% for Android below 5.0 Android . As a result, even if some apps are buggy with compatibility inconsistency, they cannot trigger the crash on user phones equipped with recent versions of Android.

Since minLevel is the indicator of the maximum version of Android platforms a buggy app could crash on, we plot a bar chart of minLevel in Fig. 13 for the 14,363 apps that are detected with compatibility inconsistency. We can see that only 2,566 (11.3% of 22,687) apps could crash on Android 6.0 and above (via counting apps with minLevel larger than 23). In other words, the majority (11,797 out of 14,363) of buggy apps cannot exhibit their incompatibility bugs on Android 6.0 and above, which held over 70% of the market share in January 2019. Furthermore, 8,990 out of 14,363 apps could crash only on Android below 5.0, which significantly limits the consequences of their incompatibility issues.

Finding 2-2: We find that by employing bytecode search for SDK_INT checking, our approach can reduce 17.3% false positives in the compatibility inconsistency results. As mentioned in Sec. 3.3.2, a false positive of compatibility inconsistency could appear if an API call guarded by SDK_INT checking is not detected. Here we measure the number of such false positives that can be excluded by the bytecode search. We find that our search of SDK_INT checking prevents 3,003 apps from being mistakenly marked with compatibility inconsistency. Since there are 14,363 apps (i.e., true positives) that could crash when running on lower versions of Android platforms, the percentage of false positives reduced by bytecode search is 17.3%.


Figure 14: Bar chart of the top 20 Android API classes (with “android.” prefix omitted) that incur compatibility inconsistency in our dataset.

Finding 2-3: A detailed analysis of Android APIs that incur compatibility inconsistency reveals that some API classes, such as view, webkit, and system manager related classes, are commonly misused. We further try to understand the common sources of compatibility inconsistency by analyzing the newly added Android APIs that incur compatibility inconsistency in our dataset. We find that 6,454 (27.4% of all 23,542) newly added APIs from 1,138 unique classes cause compatibility inconsistency in at least one app in our dataset. In particular, 232 (20.4%) API classes affect more than 100 different apps each, making them the common sources of compatibility inconsistency. Fortunately, half of the API classes affect only fewer than 10 apps each, which suggests that only some portions of API classes are prone to misuse.

We thus take a closer look at the top 20 Android API classes that cause compatibility inconsistency. As shown in Fig. 14, all of these classes affect over 1K apps each. In particular, the JobService class, which was introduced in Android 5.0 (level 21), alone could cause compatibility inconsistency in around 5K apps. Other commonly misused API classes include those related to view (e.g., the View, Activity, Context, and Fragment classes), webkit (e.g., the WebSettings and WebView classes), and system managers (e.g., the AppOpsManager and UserManager classes). These classes account for nearly all of the top 20 misused ones.


Figure 15: A case study of the DSDK issue with incompatibility effect: Solo VPN.

Case study: Solo VPN. To demonstrate the impact of incompatible DSDK issues, we identify a problematic app in our dataset and try to make it crash at runtime. However, it is non-trivial to achieve this dynamically because a crash point may hide deep in certain paths or under certain conditions, which is why the previous work, CiD Li et al. (2018), requested developers themselves to help validate its detection results 6. To simplify our testing, we intentionally targeted the VPN apps, based on the observation that some VpnService APIs require Android 5.0 at API level 21. After testing a few VPN apps in our dataset, we quickly identified a buggy app, Solo VPN (co.solovpn, version 1.32), which crashed immediately after we clicked the “Connect” VPN button on an Android 4.1 device. Fig. 15 shows the alert dialog that popped up, stating that “Unfortunately, SoloVPN has stopped”.

4.4 RQ3: Inconsistency Results with Security Effect

In this subsection, we present a total of three findings regarding RQ3.

Finding 3-1: Around 2% apps still set an outdated targetSdkVersion attribute when a common WebView API is vulnerable, making them exploitable by remote code execution. As explained in Sec. 2.2.2, we measure inconsistency results with the security effect by analyzing each app’s addJavascriptInterface() API call and the declared targetSdkVersion attribute. In our dataset, we first find that 2,791 apps invoke the addJavascriptInterface() API, which suggests that calling this WebView API is necessary in many apps. However, 484 of them, i.e., 2.1% of the entire dataset of 22,687 apps, still set an outdated targetSdkVersion attribute below level 17, making them not only exploitable on Android below 4.2 but also vulnerable on higher versions of Android platforms. This could be avoided if their targetSdkVersion values are updated.

Finding 3-2: Our bytecode search of addJavascriptInterface() invocations helps reduce 12.2% false positives. Recall from Sec. 3.3.2 that we perform bytecode search to check whether an addJavascriptInterface() API call is invoked by a valid class. We find that without such checking, 551 apps would be detected with the vulnerable combination of addJavascriptInterface() and targetSdkVersion. In other words, our search of addJavascriptInterface() invocations prevents 67 (551 - 484) apps from being mistakenly marked with security inconsistency. Hence, we conclude that our bytecode search reduces 12.2% false positives in the context of addJavascriptInterface().

Library Class | # Vulnerable Apps
Lcom/flurry/android/CatalogActivity; | 41
Lcom/openfeint/internal/ui/NativeBrowser; | 30
Lcom/doodlemobile/gamecenter/moregames/MoreGamesActivity; | 19
Lcom/gau/go/launcherex/theme/classic/FullScreenAdWebPage; | 17
Lcom/amazon/ags/html5/overlay/GameCircleUserInterface; | 13

Table 3: The top five library classes that introduce addJavascriptInterface() API call in vulnerable apps and the number of apps affected.

Finding 3-3: Around a half of the vulnerable apps invoke addJavascriptInterface() because of their embedded third-party libraries. Our approach can also determine whether the addJavascriptInterface() API is invoked by app’s own code or embedded by a third-party library. It turns out that 214 (44.2%) of 484 vulnerable apps invoke addJavascriptInterface() because of their embedded third-party libraries. In particular, five libraries affect at least 10 vulnerable apps each. Table 3 lists their class names and the number of apps affected. We can see that the popular Yahoo Flurry SDK 25 and OpenFeint Game SDK 40 cause some apps with outdated targetSdkVersion to become vulnerable.

This finding has two implications. First, developers must check whether a third-party library invokes some vulnerable APIs before embedding it into their apps. Second, library producers also need to ensure that certain dangerous APIs are invoked only on safe versions of Android platforms, because a library can be used in apps with all kinds of targetSdkVersion values.

Case study: Exsoul Browser. To demonstrate the impact of insecure DSDK issues, we try to exploit a problematic app in our dataset. To exploit addJavascriptInterface vulnerabilities, an adversary needs to inject a piece of malicious Javascript code into a vulnerable WebView-based interface in the victim app. He or she could achieve this either by intercepting the HTTP traffic via a Man-In-The-Middle proxy or by tricking victim users to directly browse a malicious website. We chose the second more convenient way and directly targeted at the browser apps in our dataset for tests. There was only one browser app, Exsoul Browser (com.exsoul), reported with DSDK security problems. We used it to browse a demonstration exploit website that we prepared before, http://www4.comp.polyu.edu.hk/~appsec/about/rceNew.html, which would output “Has RCE Vulnerability” if the tested browsing interface is vulnerable. As shown in Fig. 16, we successfully validate the addJavascriptInterface vulnerability in Exsoul Browser on an Android 4.1 device. We also find that Exsoul Browser exposed a Javascript interface named “android”, which allows a malicious website to execute arbitrary commands by simply invoking this Javascript code: android.getClass().forName("java.lang.Runtime").getMethod("getRuntime",null).invoke(null,null).exec(cmdArgs).


Figure 16: A case study of the DSDK issue with security effect: Exsoul Browser.

4.5 RQ4: Performance Metrics of Our Approach

In this subsection, we evaluate the performance of our approach to answer RQ4.

Finding 4-1: Our approach achieves good scalability, with an average analysis time of 5.39s and 90% of apps analyzed in less than 10 seconds each. This makes our approach suitable for online vetting. In Fig. 17, we present a CDF plot of the amount of time required for our approach to analyze each app. We can see that more than 50% of apps can be analyzed in less than five seconds each, with a median time of 4.75s. The average analysis time over all 22,687 apps is only 5.39s. These results indicate that our approach scales well and is therefore suitable for online vetting; app markets can deploy it to promptly notify developers of the DSDK inconsistencies in their apps.


Figure 17: CDF plot of the amount of time required for our approach to analyze each app.

In contrast, dataflow-based approaches Li et al. (2018) He et al. (2018) suffer from scalability problems. Specifically, CiD Li et al. (2018) failed to analyze 387 apps (out of a dataset of 2,000 apps) due to timeouts and bugs. This 19.4% timeout or failure rate makes it infeasible for online vetting, and no clear performance statistics were reported even for the successfully analyzed apps. Similarly, IctApiFinder He et al. (2018) takes 3 minutes and 45 seconds to analyze an 8MB app (available via the historical versions at https://f-droid.org/en/packages/com.nextcloud.client/), a size much smaller than the average size (25MB) in our dataset. This suggests that IctApiFinder is impractical for online vetting of a modern app dataset from Google Play (all apps evaluated by IctApiFinder were open-source apps from the F-Droid website).

Finding 4-2: A further correlation analysis between analysis time and app size shows that the analysis time of our approach grows approximately linearly with the DEX file size of the app. We further demonstrate statistically that the performance of our approach remains under control regardless of app size. In Fig. 18, we draw a scatter plot of the relationship between the analysis time and the DEX file size of each app (an APK file contains both bytecode and resource files, whereas DEX files contain only bytecode). According to this figure, the analysis time and the DEX file size are approximately in a linear relationship, at a rate of around 30 seconds for a 40MB DEX file (note that we sum the sizes of multiple DEX files if any). There are some outliers of small apps with longer analysis times (e.g., five apps under 20MB exceeding 30s), largely because these apps involve many more vulnerable API calls to search. Conversely, the outliers of large apps with shorter analysis times are due to unused embedded third-party libraries. Overall, the linear relationship between analysis time and app size indicates that our approach achieves good performance even for large apps.


Figure 18: Scatter plot of the relationship between analysis time and DEX size.
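As a side note on methodology, the slope in Fig. 18 can be estimated with an ordinary least-squares fit of analysis time against DEX size. The snippet below is an illustrative sketch with made-up data points, not our measurement script; it merely shows the kind of fit behind the roughly 30-seconds-per-40MB rate quoted above.

    public class TimeSizeCorrelation {

        // Ordinary least-squares slope (seconds per MB of DEX).
        static double slope(double[] sizeMb, double[] timeSec) {
            double meanX = mean(sizeMb), meanY = mean(timeSec);
            double num = 0, den = 0;
            for (int i = 0; i < sizeMb.length; i++) {
                num += (sizeMb[i] - meanX) * (timeSec[i] - meanY);
                den += (sizeMb[i] - meanX) * (sizeMb[i] - meanX);
            }
            return num / den;
        }

        static double mean(double[] xs) {
            double sum = 0;
            for (double x : xs) sum += x;
            return sum / xs.length;
        }

        public static void main(String[] args) {
            // Hypothetical (DEX size in MB, analysis time in seconds) samples.
            double[] sizeMb  = {5, 10, 20, 30, 40};
            double[] timeSec = {4.2, 7.9, 15.1, 22.8, 30.3};
            // Prints roughly 0.75 s/MB, i.e., about 30 seconds for a 40MB DEX.
            System.out.printf("Estimated slope: %.2f s/MB%n", slope(sizeMb, timeSec));
        }
    }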

4.6 RQ5: The Updatability of The Buggy Apps

In this subsection, we examine the updatability of the apps that were measured with DSDK issues in our dataset, i.e., whether they are still maintained by their developers. This is important because, unlike updatable apps that could eventually address their DSDK issues through app updates, out-dated apps have no maintainers to periodically fix their DSDK problems. To study the extent of this problem, we use the 8,359 unique apps (8,019 incompatible apps and 484 vulnerable apps) that were reported with potential DSDK problems in Sec. 4.3 and 4.4 for this analysis. Since our dataset was crawled in November 2018, we collected the latest release dates of those buggy apps on Google Play in early December 2019. This one-year time frame allows us to test app updatability by checking whether each app was updated in 2019. We present our finding below.


Figure 19: Bar chart of the distribution of apps that were measured with DSDK issues in our dataset and their latest release years on Google Play.

Finding 5: Around 20% of the 8,359 buggy apps were never updated in 2019, and another 13.7% have been removed from Google Play, leaving a total of 33.7% of the apps out-dated. Fig. 19 shows a bar chart of the distribution of the apps that were measured with DSDK issues in our dataset by their latest release years on Google Play. According to this figure, 5,539 (66.3%) apps have been updated at least once in 2019, which allows their developers to upgrade DSDK versions and fix their DSDK problems. However, one third of the measured apps are still not updatable. Specifically, the latest release years of 1,674 (20%) apps are 2018, 2017, 2016, or even before 2015. Besides these “old” apps, we find that 1,146 (13.7%) apps have even been removed from Google Play for various reasons (e.g., being taken down by the developers themselves or violating Google Play's advertisement policy). Whatever the reason, these apps are no longer on Google Play and receive no further maintenance, whereas their previously downloaded versions may still reside on user phones. Old and removed apps together account for a large number of out-dated apps in the wild, 33.7% of our dataset in total. Therefore, it is worthwhile for researchers to develop techniques for automatically fixing DSDK issues in out-dated apps.

5 Implications

In this section, we further present two implications on the qualitative analysis of identified DSDK problems and actionable countermeasures for developers.

Implication 1: Android's original design of the DSDK mechanism, despite its good intention, does not match developers' real-world usage. One major problem is that it is difficult to evolve DSDK versions correctly when apps are updated with new or deprecated APIs. The original DSDK design is a static mechanism, and there is no automatic mechanism to update outdated DSDK versions. However, quite a number of apps are updated frequently, e.g., 1,448 of the top 10,713 apps studied in 2014 were updated on a bi-weekly basis or even more frequently McIlroy et al. (2016). As a result, it is challenging for developers to maintain DSDK versions while they are already busy with functionality updates. Moreover, the addJavascriptInterface() vulnerabilities reported in Sec. 4.4 indicate a semantic gap between the targetSdkVersion design and developers' understanding. Indeed, it is somewhat confusing that lower-version API behaviors are used even when an app runs on a higher version of the Android platform (see Sec. 2). To our knowledge, this is not the first case of a mismatch between Android's design and developers' expectations. Another notable example is that Android once exported, by default, all content provider components that had no android:exported attribute defined, which caused a large number of vulnerable apps Zhou and Jiang (2013) since developers did not expect their content provider components to be exported.

Implication 2: To mitigate the DSDK problems, the Android community could take countermeasures at different levels. We list three actionable countermeasures that can be adopted by different stakeholders:

  • Google could provide better IDE (integrated development environment) support to help developers check DSDK versions before uploading their apps to the markets. Such checking should ideally be automatic and launch whenever an app changes. We have seen a good trend in the recent Android Studio IDE, which performs more user-friendly DSDK checking than its predecessor, the Android Lint plugin in Eclipse.

  • App markets can deploy our approach to perform a quick and mandatory check of each uploaded app. Developers would then either approve or dismiss the reported DSDK conflicts and recommendations. In this way, developers are at least made aware of potential DSDK problems in their apps.

  • As the last line of defense, end-user Android devices can dynamically upgrade DSDK versions in victim apps or enforce mandatory access control Wu et al. (2018) so that the affected apps are no longer incompatible or vulnerable at the operating system level. This is especially important for apps that are no longer maintained (see Sec. 4.6).

6 Threats To Validity

In this section, we summarize some major threats to the validity of our study.

First, as with typical Android static analysis, our approach does not handle Java reflection, dynamic code loading, native code, or complicated code obfuscation. Some apps may employ these mechanisms to access certain Android APIs; if such an API call has inconsistency issues, a false negative will occur. Since these code protection mechanisms are mostly used by malware, our statistical results on popular apps are less affected, and we leave handling these mechanisms to future work.

Second, although our bytecode search in Sec. 3.3.2 has minimized the false positives caused by VERSION.SDK_INT checking and uninvoked third-party libraries, it is theoretically less accurate than dataflow-based approaches. Fortunately, in our deployment model, we can rely on developers to manually check and correct the inconsistency reported by our approach. Moreover, as evidenced in Sec. 4.3, the manual effort required for such checking is limited: around 80% of apps are reported with fewer than 10 inconsistent API calls each, which is manageable for a one-time manual check by developers. Due to this limitation, the measurement results reported in this paper represent an upper bound of all potential DSDK problems (under the condition that the common analysis difficulties above are not considered). This satisfies our objective of conducting a comprehensive DSDK study, although it is not suitable for bug detection.
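As a concrete example of the VERSION.SDK_INT guard pattern mentioned above, the following Java sketch, with a hypothetical helper class and method, protects an API introduced at level 26 so that it is never reached on older devices; calls wrapped this way need not be reported even if the app's minSdkVersion is lower than the API's introduction level.

    import android.os.Build;
    import android.webkit.WebView;

    public class WebViewCompat {

        // Hypothetical helper: enables Safe Browsing only on API 26+ devices,
        // where WebSettings#setSafeBrowsingEnabled() became available.
        public static void enableSafeBrowsingIfSupported(WebView webView) {
            if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
                // Guarded call: never executed on pre-API-26 devices, so a low
                // minSdkVersion does not lead to a runtime crash here.
                webView.getSettings().setSafeBrowsingEnabled(true);
            }
            // On older platforms, skip the call or fall back to a no-op.
        }
    }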

Third, the consistency detection in this paper focuses on changed APIs, but Java/Android fields are also added and removed during SDK evolution. To build the mapping between fields and SDK versions, we can leverage the same document analysis method as in Sec. 3.2, because the api-versions.xml file also records the added, removed, and deprecated fields of all Android classes. By feeding this mapping into our app analysis, we can extend our consistency detection to evolved Android fields in future work.
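To sketch this planned extension, the snippet below, an illustration rather than our implementation, parses api-versions.xml and records, for each class, its fields together with the API level at which they were introduced. The attribute handling (a field without its own "since" attribute inherits the class's level) reflects our reading of the file format and should be treated as an assumption.

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class FieldApiMapping {

        // Maps "<class>-><field>" (e.g., "android/app/Activity->RESULT_OK")
        // to the API level at which the field was added.
        public static Map<String, Integer> build(File apiVersionsXml) throws Exception {
            Map<String, Integer> fieldSince = new HashMap<>();
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(apiVersionsXml);
            NodeList classes = doc.getElementsByTagName("class");
            for (int i = 0; i < classes.getLength(); i++) {
                Element clazz = (Element) classes.item(i);
                String className = clazz.getAttribute("name");
                String classSince = clazz.getAttribute("since").isEmpty()
                        ? "1" : clazz.getAttribute("since");
                NodeList fields = clazz.getElementsByTagName("field");
                for (int j = 0; j < fields.getLength(); j++) {
                    Element field = (Element) fields.item(j);
                    // A field without its own "since" inherits the class's level.
                    String since = field.getAttribute("since").isEmpty()
                            ? classSince : field.getAttribute("since");
                    fieldSince.put(className + "->" + field.getAttribute("name"),
                            Integer.parseInt(since));
                }
            }
            return fieldSince;
        }
    }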

7 Related Work

In this section, we summarize some related research on declared SDK versions, Android APIs, and Android app static analysis.

7.1 Research on Declared SDK Versions

Previously, there were no systematic studies on declared SDK versions, only specific studies of targetSdkVersion or minSdkVersion in particular scenarios. Notably, Wu and Chang Wu and Chang (2014) showed that, due to outdated targetSdkVersion values, many Android browser apps were vulnerable to file:// vulnerabilities. They further demonstrated more security consequences caused by outdated targetSdkVersion values Wu and Chang (2015). Following this line of research, Mutchler et al. Mutchler et al. (2016) conducted a large-scale measurement of multiple vulnerabilities affected by fragmented targetSdkVersion versions. Wei et al. Wei et al. (2016) also studied Android fragmentation with a focus on compatibility issues. In particular, the preliminary conference version of this work Wu et al. (2017b) motivated two recent follow-up works Li et al. (2018) He et al. (2018) on detecting compatibility issues caused by inappropriate minSdkVersion versions. Compared to all these works, our study is the first to systematically measure all kinds of DSDK versions and their (in)consistency with API calls.

7.2 Android API Studies

Besides DSDK and fragmentation, our paper is also related to prior studies on Android APIs or SDKs. Among these studies, the work by McDonnell et al. McDonnell et al. (2013) is the closest to ours. They also studied Android API evolution, but their focus was on how client apps follow Android API changes, whereas we focus on the consistency between apps' DSDK versions and API calls. Other related works have studied the correlation between apps' API changes and their success Linares-Vásquez et al. (2013), the deprecated API usage in Java-based systems Brito et al. (2016), the inaccessible APIs in the Android framework and their usage in third-party apps Li et al. (2016), and the Android Alarm API usage and its impact on network latency Almeida et al. (2016). In particular, the work by Almeida et al. Almeida et al. (2016) analyzed the targetSdkVersion in apps that invoke Alarm APIs. Additionally, several security papers analyzed the mappings between Android APIs and their required permissions Felt et al. (2011) Au et al. (2012) Wei et al. (2012).

7.3 Android App Static Analysis

A large number of Android studies have leveraged static analysis in many applications over the past years. The major methodologies can be roughly classified into control-flow based reachability analysis and dataflow-based taint analysis. For reachability analysis, RiskRanker Grace et al. (2012b) and Woodpecker Grace et al. (2012a) are two pioneering works in the domains of malware detection and vulnerability discovery, respectively. They tested the reachability from entry points to sink APIs. In contrast, more prior works employed dataflow analysis to taint the propagation flows of a data variable of interest. CHEX Lu et al. (2012), FlowDroid Arzt et al. (2014), and Amandroid Wei et al. (2014) are three representative works in this direction. In particular, FlowDroid and Amandroid have been used or customized in many follow-up static analysis tools (e.g., Yang et al. (2015) Avdiienko et al. (2015) Li et al. (2015) He et al. (2018) Shao et al. (2016) Jia et al. (2017)). A common requirement of reachability analysis and dataflow analysis is generating an app call graph, whose precision affects the accuracy of the entire analysis. However, generating a high-precision call graph requires expensive pointer analysis Wei et al. (2014), and this scalability concern is why we proposed the lightweight bytecode search for our online vetting of API-SDK inconsistency in this paper.

8 Conclusion and Future Work

In this paper, we conducted a systematic study of declared SDK versions in Android apps, a modern software mechanism that has received little attention. We measured the current practice of declared SDK (DSDK) versions in a large set of 22,687 modern apps, and the inconsistency between DSDK versions and their host apps' API calls. To enable analysis that can be readily deployed by app markets for online vetting, we proposed a robust and scalable approach that operates on the Android bytecode level and employs a lightweight bytecode search for app analysis. We obtained some interesting new findings, including (i) 4.76% of apps do not claim the targeted DSDK versions, although this percentage has dropped significantly over the recent three years; (ii) around 50% of apps under-set the minimum DSDK versions and could incur runtime crashes, but fortunately, only 11.3% of apps could crash on Android 6.0 and above; and (iii) around 2% of apps, due to under-claiming the targeted DSDK versions, are potentially exploitable by remote code execution, and half of them invoke the vulnerable API via embedded third-party libraries. In the future, we plan to help app developers and app markets fix DSDK issues, and further improve our approach to mitigate some threats to validity.

References

  • [1] aapt: Android Asset Packaging Tool. Note: http://elinux.org/Android_aapt Cited by: §3.3.1.
  • K. Allix, T. F. Bissyandé, J. Klein, and Y. L. Traon (2016) AndroZoo: collecting millions of Android apps for the research community. In Proc. MSR, Cited by: §4.1.
  • M. Almeida, M. Bilal, J. Blackburn, and K. Papagiannaki (2016) An empirical study of Android alarm usage for application scheduling. In Proc. Springer PAM, Cited by: §7.2.
  • [4] Android Distribution dashboard. Note: https://developer.android.com/about/dashboards/ Cited by: §1, §4.2, §4.3.
  • [5] Android Platform codenames, versions, and API levels. Note: https://source.android.com/source/build-numbers.html Cited by: §2.1.
  • [6] API compatibility issues in the emdete/tabulae project. Note: https://github.com/emdete/tabulae/issues/12 Cited by: §4.3.
  • [7] apktool. Note: https://ibotpeaches.github.io/Apktool/ Cited by: §3.3.1.
  • [8] App security best practices - Android developers. Note: https://developer.android.com/topic/security/best-practices Cited by: Table 1.
  • S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Traon, D. Octeau, and P. McDaniel (2014) FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In ACM PLDI, Cited by: §1, §3.3.2, §7.3.
  • K. Au, Y. Zhou, Z. Huang, and D. Lie (2012) PScout: analyzing the Android permission specification. In Proc. ACM CCS, Cited by: §7.2.
  • V. Avdiienko, K. Kuznetsov, A. Gorla, A. Zeller, S. Arzt, S. Rasthofer, and E. Bodden (2015) Mining apps for abnormal usage of sensitive data. In Proc. ACM ICSE, Cited by: §1, §3.3.2, §7.3.
  • M. Backes, S. Bugiel, and E. Derr (2016) Reliable third-party library detection in Android and its security applications. In Proc. ACM CCS, Cited by: §3.3.2.
  • R. Bonett, K. Kafle, K. Moran, A. Nadkarni, and D. Poshyvanyk (2018) Discovering flaws in security-focused static analysis tools for Android using systematic mutation. In Proc. USENIX Security, Cited by: §1.
  • G. Brito, A. Hora, M. T. Valente, and R. Robbes (2016) Do developers deprecate APIs with replacement messages? a large-scale analysis on Java systems. In Proc. IEEE SANER, Cited by: §7.2.
  • [15] Detecting remote code execution vulnerabilities in Android apps. Note: https://sites.google.com/site/androidrce/ Cited by: Table 1.
  • [16] Disassemble Android dex files. Note: http://blog.vogella.com/2011/02/14/disassemble-android-dex/ Cited by: §3.3.2.
  • J. Drake (2014) On the WebView addJavascriptInterface saga. Note: http://www.droidsec.org/news/2014/02/26/on-the-webview-addjsif-saga.html Cited by: §2.2.2.
  • [18] Enable multidex for apps with over 64K methods. Note: https://developer.android.com/studio/build/multidex Cited by: §1, §3.3.2.
  • A. Felt, E. Chin, S. Hanna, D. Song, and D. Wagner (2011) Android permissions demystified. In Proc. ACM CCS, Cited by: §7.2.
  • M. Grace, Y. Zhou, Z. Wang, and X. Jiang (2012a) Systematic detection of capability leaks in stock Android smartphones. In Proc. ISOC NDSS, Cited by: §7.3.
  • M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang (2012b) RiskRanker: scalable and accurate zero-day Android malware detection. In Proc. ACM MobiSys, Cited by: §7.3.
  • D. He, L. Li, L. Wang, H. Zheng, G. Li, and J. Xue (2018) Understanding and detecting evolution-induced compatibility issues in Android apps. In Proc. ACM ASE, Cited by: 1st item, §1, §3.3.2, §3.3.2, §4.1, §4.5, §7.1, §7.3.
  • [23] How to fix Fragment Injection vulnerability. Note: https://support.google.com/faqs/answer/7188427 Cited by: Table 1.
  • [24] IDC: Smartphone Market Share. Note: https://www.idc.com/promo/smartphone-market-share/os Cited by: §1.
  • [25] Integrate Flurry SDK for Android. Note: https://developer.yahoo.com/flurry/docs/integrateflurry/android/ Cited by: §4.4.
  • Y. Jia, Q. Chen, Y. Lin, C. Kong, and Z. Mao (2017) Open doors for Bob and Mallory: open port usage in Android apps and security implications. In Proc. IEEE EuroS&P, Cited by: §7.3.
  • [27] JS-Binding-Over-HTTP Vulnerability and JavaScript Sidedoor: Security Risks Affecting Billions of Android App Downloads. Note: https://www.fireeye.com/blog/threat-research/2014/01/js-binding-over-http-vulnerability-and-javascript-sidedoor.html Cited by: §3.3.2.
  • L. Li, A. Bartel, T. F. Bissyandé, J. Klein, Y. L. Traon, S. Arzt, S. Rasthofer, E. Bodden, D. Octeau, and P. D. McDaniel (2015) IccTA: detecting inter-component privacy leaks in Android apps. In Proc. ACM ICSE, Cited by: §7.3.
  • L. Li, T. F. Bissyandé, Y. L. Traon, and J. Klein (2016) Accessing inaccessible Android APIs: an empirical study. In Proc. IEEE ICSME, Cited by: §7.2.
  • L. Li, T. F. Bissyandé, H. Wang, and J. Klein (2018) CiD: automating the detection of API-related compatibility issues in Android apps. In Proc. ACM ISSTA, Cited by: 1st item, §3.2, §4.3, §4.5, §7.1.
  • M. Li, W. Wang, P. Wang, S. Wang, D. Wu, J. Liu, R. Xue, and W. Huo (2017) LibD: scalable and precise third-party library detection in Android markets. In Proc. ACM ICSE, Cited by: §3.3.2.
  • M. Linares-Vásquez, G. Bavota, C. Bernal-Cárdenas, M. D. Penta, R. Oliveto, and D. Poshyvanyk (2013) API change and fault proneness: a threat to the success of Android apps. In Proc. ACM FSE, Cited by: §7.2.
  • L. Lu, Z. Li, Z. Wu, W. Lee, and G. Jiang (2012) CHEX: statically vetting Android apps for component hijacking vulnerabilities. In Proc. ACM CCS, Cited by: §7.3.
  • E. Mariconti, L. Onwuzurike, P. Andriotis, E. D. Cristofaro, G. Ross, and G. Stringhini (2017) MaMaDroid: detecting Android malware by building Markov chains of behavioral models. In Proc. ISOC NDSS, Cited by: §1, §3.3.2.
  • T. McDonnell, B. Ray, and M. Kim (2013) An empirical study of API stability and adoption in the Android ecosystem. In Proc. IEEE ICSM, Cited by: §3.2, §7.2.
  • S. McIlroy, N. Ali, and A. E. Hassan (2016) Fresh apps: an empirical study of frequently-updated mobile apps in the Google play store. Empirical Software Engineering Volume 21, Issue 3. Cited by: §5.
  • [37] Multiple APK support - Android Developers. Note: https://developer.android.com/google/play/publishing/multiple-apks Cited by: §3.1.
  • P. Mutchler, Y. Safaei, A. Doupe, and J. Mitchell (2016) Target fragmentation in Android apps. In Proc. IEEE Mobile Security Technologies (MoST), Cited by: Table 1, §7.1.
  • D. Octeau, S. Jha, and P. McDaniel (2012) Retargeting Android applications to Java bytecode. In Proc. ACM FSE, Cited by: §1, §3.3.2.
  • [40] OpenFeint is the largest mobile social gaming network in the world. Note: http://www.openfeint.com/ Cited by: §4.4.
  • X. Pan, X. Wang, Y. Duan, X. Wang, and H. Yin (2017) Dark hazard: learning-based, large-scale discovery of hidden sensitive operations in Android apps. In Proc. ISOC NDSS, Cited by: §1, §3.3.2.
  • [42] Security tips - Android developers. Note: https://developer.android.com/training/articles/security-tips Cited by: Table 1.
  • Y. Shao, J. Ott, Y. J. Jia, Z. Qian, and Z. M. Mao (2016) The misuse of Android Unix domain sockets and security implications. In Proc. ACM CCS, Cited by: §7.3.
  • [44] The AndroidManifest.xml file. Note: http://developer.android.com/guide/topics/manifest/manifest-intro.html Cited by: §2.1.
  • [45] The uses-sdk manifest element. Note: http://developer.android.com/guide/topics/manifest/uses-sdk-element.html Cited by: §2.1.
  • F. Wei, S. Roy, X. Ou, and Robby (2014) Amandroid: a precise and general inter-component data flow analysis framework for security vetting of Android apps. In Proc. ACM CCS, Cited by: §1, §3.3.2, §7.3.
  • L. Wei, Y. Liu, and S. Cheung (2016) Taming Android fragmentation: characterizing and detecting compatibility issues for Android apps. In Proc. ACM ASE, Cited by: §7.1.
  • X. Wei, L. Gomez, I. Neamtiu, and M. Faloutsos (2012) Permission evolution in the android ecosystem. In Proc. ACM ACSAC, Cited by: §7.2.
  • D. Wu, R. K. C. Chang, W. Li, E. K. T. Cheng, and D. Gao (2017a) MopEye: opportunistic monitoring of per-app mobile network performance. In Proc. USENIX Annual Technical Conference, Cited by: §2.2.1.
  • D. Wu and R. K. C. Chang (2014) Analyzing Android browser apps for file:// vulnerabilities. In Proc. Springer Information Security Conference (ISC), Cited by: Table 1, §7.1.
  • D. Wu and R. K. C. Chang (2015) Indirect file leaks in mobile applications. In Proc. IEEE Mobile Security Technologies (MoST), Cited by: §7.1.
  • D. Wu, Y. Cheng, D. Gao, Y. Li, and R. H. Deng (2018) SCLib: a practical and lightweight defense against component hijacking in Android applications. In Proc. ACM Conference on Data and Applications Security and Privacy (CODASPY), Cited by: 3rd item.
  • D. Wu, D. Gao, R. K. C. Chang, E. He, E. K. T. Cheng, and R. H. Deng (2019) Understanding open ports in Android applications: discovery, diagnosis, and security assessment. In Proc. ISOC NDSS, Cited by: §3.3.2.
  • D. Wu, X. Liu, J. Xu, D. Lo, and D. Gao (2017b) Measuring the declared SDK versions and their consistency with API calls in Android apps. In Proc. Springer International Conference on Wireless Algorithms, Systems, and Applications (WASA), Cited by: 1st item, §1, §3.1, §3.2, §4.2, §4.2, §7.1.
  • D. Wu, X. Luo, and R. K. C. Chang (2014) A Sink-driven Approach to Detecting Exposed Component Vulnerabilities in Android Apps. CoRR abs/1405.6282. External Links: Link Cited by: §3.3.1.
  • W. Yang, X. Xiao, B. Andow, S. Li, T. Xie, and W. Enck (2015) AppContext: differentiating malicious and benign mobile app behaviors using context. In Proc. ACM ICSE, Cited by: §1, §3.3.2, §4.1, §7.3.
  • Y. Zhou and X. Jiang (2013) Detecting passive content leaks and pollution in Android applications. In Proc. ISOC NDSS, Cited by: §5.