The smartphone is the most iconic and ubiquitous mobile device of the past decade. However, the smartphone is far from the only mobile device available to consumers, given the introduction of device types such as the tablet and smartwatch and the addition of mobile connectivity to the personal computer (PC), primarily the laptop. In fact, many device manufacturers now have products in all of these mobile device lines.
Given this diversity, understanding how certain device types may substitute for other device types (specifically a shift of usage to a new device) or even prompt additional (non-substituted) usage is important for, as an example, understanding the relative positioning of these device types in the mobile ecosystem. Yet despite this importance, few studies have explicitly analyzed device type substitution (Matthews et al., 2009; Shmorgun et al., 2013; Müller et al., 2015; Finley et al., 2016; Finley and Soikkeli, 2017). Furthermore, existing studies have either relied on survey or user-completed diary methods (Matthews et al., 2009; Shmorgun et al., 2013; Müller et al., 2015), which are susceptible to recall bias (de Reuver et al., 2012), or neglected matching methods to control for covariate imbalances between compared groups (Finley and Soikkeli, 2017).
In this work, we examine device type substitution through an analysis of multidevice (smartphone, tablet and PC) and multiplatform (Android, iOS, etc.) usage data from a large US based user panel. Importantly, the data collection method is device-based (users installed custom device-based monitoring applications), and we use a robust matching method. Specifically, we use a coarsened exact matching (CEM) strategy to create matched groups where the major difference between the groups (known as the treatment) is ownership of a particular device type. Then regression allows for the estimation of the effect of that device type ownership on usage of another device type.
Through our analysis, we test several different device type substitution hypotheses partially informed by prior research (Finley et al., 2016, Section 5.2) (Finley and Soikkeli, 2017, Section 5.4). The tested hypotheses are as follows:
H1: Tablet ownership decreases smartphone usage
H2: Tablet ownership decreases PC usage
H3: PC ownership has no effect on smartphone usage
H4: Tablet ownership increases total device usage (where total device usage is the sum of usage of all smartphone, tablet, and PC devices of the user)
H5: PC ownership increases total device usage
These hypotheses, if supported, can help provide support for the following more general statements:
S1: Tablets are a partial substitute for smartphones
S2: Tablets are a partial substitute for PCs
S3: PCs are not a partial substitute for smartphones
S4: Tablets prompt additional (non-substituted) usage
S5: PCs prompt additional (non-substituted) usage
The remainder of this work is organized as follows. Section 2 details background information including a summary of related work and an overview of the dataset. Section 3 describes the preprocessing steps including several data issues and the robust matching method. Section 4 reports the analysis results including hypothesis testing, Section 5 discusses theoretical reasoning, limitations, and examples of implications, and Section 6 concludes the work.
This section describes the background information including related work and a dataset overview.
2.1. Related Work
Related work in mobile device type substitution is somewhat sparse. Studies can generally be classified into two broad categories based on the data collection methodology.
Several studies have used interview, survey, or electronic diary methods that ask users to recall their usage and device substitution behavior.
Matthews et al. (2009) studied device type substitution through semi-structured interviews of users. The results detailed that users substitute smartphones for laptops and vice versa in some situations (mainly personal usage rather than work usage) but that a majority of usage is instead additional. The interviews elicited a few substitution examples such as users deferring reading of full news articles until they could access their larger display laptops. Shmorgun et al. (2013) analyzed device type substitution through a survey of 101 mobile device users dispersed through several different countries. The analysis detailed that users prefer specific device types for specific tasks even though the general usage volume of service types across device types is roughly equal. Finally, Müller et al. (2015) used both surveys and a self-reporting electronic diary method to compare smartphone and tablet usage of 176 US-based mobile users. The comparison detailed that total device usage was highest among users with both smartphones and tablets, thus suggesting these devices prompt at least some additional usage (rather than simply substituted usage).
Other studies have used data from device-based monitoring of mobile usage.
Finley and Soikkeli (2017) studied device type substitution by comparing usage between a group of users with only smartphones and a group with both smartphones and tablets. The results indicated that tablet usage is partially substituted usage from smartphones and partially additional usage. The fraction of each of these cases was about half. Also, Finley et al. (2016) explored device type substitution via correlation analysis of usage statistics of several different user groups (smartphone and tablet owners, smartphone and PC owners, etc.). In other words, instead of looking for the presence of substitution across groups, the work looked at the level of substitution within a group. The results suggested a significant substitution effect for users with both tablets and PCs; specifically the more applications and usage time on a tablet the less on a PC.
Finally, several recent mobile and internet related studies have similarly applied coarsened exact matching for reducing covariate imbalance before group comparison.
For example, Arora et al. (2017) used coarsened exact matching when analyzing the effect of developers providing free versions of paid mobile applications on the adoption rate of the paid applications. Similarly, Wen and Zhu (2017) applied coarsened exact matching when analyzing the effect of the platform owner (in this case Google) entering a specific area of the mobile application market (with native features or their own app). Finally, Cotten et al. (2014) applied coarsened exact matching in analyzing the effect of internet usage on depression in retired adults in the USA.
The primary dataset consists of one month of device usage data from a subset of active users of a large ongoing United States based user panel organized by Verto Analytics (http://vertoanalytics.com/). Hereafter, we refer to the large ongoing Verto panel as the general panel and the one-month subset of active users as the active dataset.
Regarding the general panel, users are recruited online and given an initial recruitment survey that asks about the devices they own. Users are instructed to install custom monitoring applications to all of their applicable devices (in other words, the devices they report owning including smartphone, tablet, and PC). Monitoring applications are available for Google Android, Apple iOS, and Microsoft Windows. We discuss Apple macOS later. The monitoring applications log events such as, for Android, an application moving to the foreground of the device display. Only users that install monitoring applications to all their applicable devices are considered for the panel. All users are paid for participation. Also, users rarely install the monitoring applications to their work devices, since work devices generally do not permit users to grant the broad permissions the monitoring application requires.
For all devices, device usage time is calculated as the sum of device application sessions. For Android devices, application sessions are based on an application entering and then leaving the foreground of the display. Similarly, for iOS devices, application sessions are based on a combination of display state and system calls that indicate an application entering the foreground of the display and then being replaced by a new application entering the foreground. (The identification of the specific application on the foreground of the iOS device occurs through network traffic analysis, though identifying the specific application is not relevant for our analysis.) Finally, for Windows devices, application sessions are based on an application window gaining and then losing focus.
For some applications the display is typically off during usage (for example, calling and music applications). For technical reasons we do not include this display-off usage. We discuss the implications of this further in Section 5.1. Concerning overlapping usage (between devices), we include such usage because our hypotheses are formulated in terms of total usage and overlapping usage is both possible and relatively common (Finley and Soikkeli, 2017). We apply maximum session length timeouts of 3 hours for iOS/Android sessions and 8 hours for Windows sessions to limit sessions where the display is forced on continuously but the user may not be using the device. Finally, we exclude users that report owning an Apple PC since a monitoring application for Apple macOS was unavailable at the time of the data collection (for completeness, we note that Verto Analytics does currently have an Apple macOS monitor) and we therefore do not have PC usage information for these users.
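The usage-time calculation described above can be sketched as follows. This is a minimal illustration rather than the panel's actual pipeline: the `Session` record and its fields are hypothetical, and we assume that sessions longer than the timeout are truncated to the timeout (the text does not say whether such sessions are truncated or discarded).

```python
from dataclasses import dataclass

# Maximum session lengths in seconds, as stated in the text:
# 3 hours for iOS/Android sessions and 8 hours for Windows sessions.
MAX_SESSION_SECONDS = {
    "android": 3 * 3600,
    "ios": 3 * 3600,
    "windows": 8 * 3600,
}


@dataclass(frozen=True)
class Session:
    """Hypothetical application session record (not the panel's real schema)."""
    platform: str  # "android", "ios", or "windows"
    start: float   # session start, in seconds
    end: float     # session end, in seconds


def capped_duration(session: Session) -> float:
    """Session length in seconds, truncated at the platform timeout (assumption)."""
    raw = max(0.0, session.end - session.start)
    return min(raw, MAX_SESSION_SECONDS[session.platform])


def device_usage_seconds(sessions) -> float:
    """Device usage time as the sum of (capped) application sessions."""
    return sum(capped_duration(s) for s in sessions)
```

Summing the result over a user's smartphone, tablet, and PC sessions then gives the total device usage used in H4 and H5; overlapping sessions on different devices are simply added, consistent with the choice to include overlapping usage.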
The active dataset consists of one month (November 2016) of device usage data from 5158 active users. The dataset additionally includes demographic and smartphone technographic data for these users. The dataset does not include personally identifying information.
We define a user as active if they used their smartphone on at least 23 days of the month. This day threshold represents a trade-off between the sample size (number of analyzed users) and sample definition (ensuring that analyzed users are truly active panel members throughout the month, in other words, that users who essentially drop out of the panel are not included). Figure 1 illustrates the normalized number of users considered active with different day thresholds including the selected threshold of 23 days. We do not define active thresholds for tablet and PC usage because the notion of activeness is less clear for these device types that are not necessarily everyday drivers. We also note that even our superset of both active and non-active users (about 8000 users total) is a subset of the full general panel.
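As a small sketch of this definition, the functions below flag active users and compute the share of users considered active for a range of candidate thresholds (the quantity the threshold figure plots). The input layout, a mapping from user id to the dates on which a smartphone session was observed, is an assumption made for illustration.

```python
from datetime import date, timedelta

ACTIVE_DAY_THRESHOLD = 23  # threshold stated in the text


def is_active(usage_dates, threshold=ACTIVE_DAY_THRESHOLD):
    """True if smartphone usage was observed on at least `threshold` distinct days."""
    return len(set(usage_dates)) >= threshold


def active_fraction_by_threshold(users_to_dates, thresholds):
    """Fraction of users considered active under each candidate day threshold."""
    n_users = len(users_to_dates)
    return {
        t: sum(is_active(dates, t) for dates in users_to_dates.values()) / n_users
        for t in thresholds
    }


# Usage example with two made-up users in November 2016.
first_day = date(2016, 11, 1)
panel = {
    "user_a": [first_day + timedelta(days=i) for i in range(25)],
    "user_b": [first_day + timedelta(days=i) for i in range(10)],
}
fractions = active_fraction_by_threshold(panel, [5, 23, 30])
```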
In terms of representativeness, general panel recruitment is performed with the purpose of obtaining a nationally representative panel and thus the general panel is relatively diverse. However, all opt-in panels inherently use non-probability sampling and thus representativeness is a concern. We refer to Hays et al. (2015) for a thorough discussion of non-probability internet sampling. Furthermore, as mentioned, the recruitment process uses a recruitment survey that screens potential users to improve the demographic and technographic match between the accepted users and the population (known as a quota-sampling approach).
For reference, we provide a summary of demographic data for active dataset users along with demographic data for US smartphone users in general in Table 1. There are several demographic discrepancies. We do not attempt to calculate and use, for example, raking weights to remove these discrepancies partly because our matching package does not support such weights. In any case, we agree with Church et al. (2015) that these types of studies are mainly about the panel populations themselves and the value is primarily in allowing researchers and the community to compare and contrast different experiences with different user populations to build a comprehensive understanding of the variety of mobile users and their unique behaviors.
|Demographic||Active Users||US Smartphone Users (a)|
|Mean Age (Years) (b)||35.24 (12.29)||42.19 (15.72)|
|Gender (% Male)||30.57||50.15|
|Marital Status (% Married) (c)||42.94||56.98|
|Employment Status (% Employed) (d)||59.09||72.15|
|HH (e) Income (% <$50K)||68.15||44.58|
|Mean HH (e) Size||3.00 (1.58)||3.12 (1.58)|
|Mean Children in HH (e)||0.88 (1.19)||0.75 (1.10)|
|Race (% White)||74.16||74.00|
|Ethnicity (% Hispanic)||10.19||15.24|
(a) US smartphone user demographic data is from a Pew Research survey (n=3015, subpopulation with smartphone n=2310) (Pew Internet and American Life Project, 2016) conducted September-November 2016. The survey utilizes weighting to population parameters of census data to create nationally representative results (refer to Pew Research Center (2017)). Verto Analytics also performs its own national surveys; we utilize the Pew Research survey only for brevity.
(b) All mean values also include standard deviations.
(c) Married includes responses for both married and domestic partnership so that the definition matches the coarsening used in CEM (see Table 3).
(d) Employed includes responses for full-time, part-time, and self employment so that the definition matches the coarsening used in CEM (see Table 3).
(e) HH: household.
For completeness, we note that the demographic data includes age, gender, marital status, employment status, household income, household size, children in household, race, and ethnicity, while the smartphone technographic data includes smartphone model, platform, display size, display pixel density, display-to-body ratio, and RAM. The technographic data for each smartphone model was collected from public sources such as gsmarena.com. The technographic data is included (in addition to the demographic data) in the matching method and analysis under the assumption that smartphone technographics are a rough proxy for the technical sophistication of the user. Therefore, the technographic data complements the demographic data since such technical sophistication is unlikely to be predictable by demographics alone (see the correlations between demographics and technographics in Figure 3). We detail a summary of technographic data and device type ownership for active dataset users in Table 2.
|Smartphone Display Size (Inches) (a)||4.98 (0.51)|
|Smartphone Display Density (PPI) (b)||380.74 (126.22)|
|Smartphone Display-to-Body Ratio (%)||68.76 (5.08)|
|Smartphone RAM (GB)||1.98 (0.95)|
|Tablet Ownership (% w/ Tablet)||16.38|
|PC Ownership (% w/ PC)||50.72|
|Tablet and PC Ownership (% w/ Tablet & PC)||7.33|
(a) All mean values also include standard deviations.
(b) Pixels per inch.
2.3. Hypotheses Formulation
We briefly discuss the formulation of the hypotheses. All our hypotheses are formulated given the assumption that users have and use a smartphone (which is valid for our active dataset given that the definition of active is based on smartphone ownership and usage). Alternatively, we could have formulated our hypotheses and/or selected our active dataset in several other ways. For example, we could have selected a different active dataset for each hypothesis. In the case of H2, for instance, the active dataset might consist of all users active with a PC or all users active with both a PC and tablet. However, as previously discussed, given that tablets and PCs are not necessarily everyday devices, the notion of an active user is less straightforward. We believe that our smartphone-centric focus is acceptable given the importance and ubiquity of the smartphone in the mobile ecosystem. Additionally, we do not believe that alternative formulations would significantly change our results.
Additionally, our hypotheses are formulated to analyze the effect of ownership (of one device type) on the usage (of another device type). These analyses are helpful in certain situations, for example, when the effect of acquiring a certain device type should be estimated. Alternative analyses, such as the effect of usage (of one device type) on usage (of another device type), as in Section 4.1.4, are helpful in other situations, for example, when the effect of promoting usage of a certain device type should be estimated. In general we believe these analyses are complementary. We look to perform more robust usage-on-usage analyses, beyond Section 4.1.4, in future work.
This section describes the preprocessing steps including several data issues and the robust matching method.
3.1. Missing Data
In terms of missing data, 12 users of the active dataset (0.19%) are missing smartphone technographic data. These missing values are related to users with very uncommon smartphones that have ambiguous model names. Given the very small amount of missing data, for simplicity we illustrate all analysis on complete case data. In other words, we exclude these users.
3.2. Users with Multiple Devices of the Same Type
A small fraction of users have multiple devices of the same device type, for example both a laptop and desktop computer. Specifically, for the active dataset, 6.3% of users have multiple smartphones, 4.7% of tablet users have multiple tablets, and 7.6% of PC users have multiple PCs. In calculating device type usage, we use a simple approach of summing all of a user’s usage for devices of the same type. We also test an alternative approach of selecting one of the multiple devices at random; however, this approach does not significantly change the analysis results. Therefore we only include the results of the summation approach.
3.3. PC Subtypes (Desktops and Laptops)
In the analysis we include both desktops and laptops in the PC device type under the assumption that the desktop and laptop sub-types are similar in terms of device type substitution. However, for robustness, we also test these subtypes separately in Section 4.1.3.
3.4. Devices Shared within a Household
Devices, especially tablets and PCs, may be shared among several members of a single household (Müller et al., 2012). For example, GlobalWebIndex (2017) reports that about 60% of users share their tablet with at least one other user, though the extent of such sharing is not directly quantified. In terms of implications for our analysis, shared usage of tablets and PCs does not directly affect testing of H1 and H3 since these hypotheses use binary ownership/non-ownership variables for tablet and PC that would not be affected by shared usage. However, shared usage does affect the testing of H2, H4, and H5. Therefore, for robustness, we also test these hypotheses with only one person household users (thus lessening the possibility of device sharing) in Section 4.1.2.
3.5. Matching Method of Coarsened Exact Matching
In order to determine the effect of a treatment such as owning a tablet on the usage of another device type such as smartphone, the treatment group and control group should be as similar as possible in all other regards (i.e., concerning covariates, which in our case are the demographic, technographic, and independent device usage variables). In this way, the treatment effect is isolated. There are several ex-post methods for creating these groups based on analysis of covariates. The most prominent of these methods is probably propensity score matching (PSM). However, PSM has several undesirable properties and alternative methods are often preferable (King and Nielsen, 2015).
In this work, we use coarsened exact matching (CEM) via the cem package (1.1.14) (Iacus et al., 2009) in R. A few of the advantages of CEM include giving the ability to control the amount of imbalance in the final matching through ex-post decisions (specifically the selected coarsenings), meeting the congruence principle (which specifies congruence between the data space and the analysis space; refer to Iacus et al. (2012, Section 4.2) for further discussion), and being efficient in computational terms (Iacus et al., 2012).
Table 3 details the coarsening of categorical covariates (aggregating of categorical levels), Table 4 details the coarsening of continuous covariates (discretizing of continuous covariates), and Table 5 details the results of the matching including the size of the resulting groups. The specific demographic coarsenings were primarily selected based on common coarsenings used in sociological work, whereas the specific technographic coarsenings were based on examination of the technographic covariate histograms and prior knowledge of common smartphone characteristics.
For some covariates, such as race and ethnicity, an intuitive sociological coarsening is not apparent. In these cases, we take the approach of grouping together the covariate’s smaller categories under the assumption that those users are somewhat similar. Alternatively we could use the covariate as is (without coarsening), which would discard many of those users from the matching, or we could remove the covariate from the matching method altogether. All these approaches make implicit assumptions, but in any case, given the small size of the categories the effect on our results should be small.
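To make the coarsening step concrete, the sketch below maps a user to a stratum key using a few of the coarsenings from Tables 3 and 4 (age, display size, and marital status). It is only an illustration in Python of what coarsening does; the actual analysis uses the cem package in R, and the function names and the user record layout here are hypothetical.

```python
from bisect import bisect_left

# Upper edges of right-closed intervals, taken from the coarsening tables.
AGE_UPPER_EDGES = [25, 45, 82]                    # [18,25], [26,45], [46,82]
DISPLAY_SIZE_UPPER_EDGES = [3.5, 4.5, 5.5, 6.5]   # [2.8,3.5], (3.5,4.5], ...

# Aggregation of categorical levels for marital status.
MARITAL_STATUS = {
    "Married": "Married",
    "Domestic Partnership": "Married",
    "Single": "Not Married",
    "Divorced": "Not Married",
    "Separated": "Not Married",
    "Widowed": "Not Married",
}


def coarsen_numeric(value, upper_edges):
    """Index of the right-closed interval that contains `value`."""
    idx = bisect_left(upper_edges, value)
    if idx == len(upper_edges):
        raise ValueError(f"{value} is above the last interval")
    return idx


def stratum_key(user):
    """Tuple of coarsened covariates; users sharing a key fall in the same stratum."""
    return (
        coarsen_numeric(user["age"], AGE_UPPER_EDGES),
        coarsen_numeric(user["display_size"], DISPLAY_SIZE_UPPER_EDGES),
        MARITAL_STATUS[user["marital_status"]],
    )
```

Exact matching is then performed on these keys rather than on the raw covariate values, which is why coarser intervals retain more users at the cost of more residual imbalance within each stratum.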
In terms of the actual matching procedure we use k2k matching with random selection. In k2k matching, within each stratum (combination of covariates) we randomly match a treatment and control sample (without replacement) until we exhaust either all treatment or all control samples. In other words, the maximum number of matchings in a stratum is the minimum of the treatment and control sample sizes. The remaining non-matched treatment or control samples within the stratum are discarded.
An alternative approach is to randomly match treatment and control samples but to allow repeated matching of the same sample if the treatment and control sample sizes are different. In other words, the maximum number of matchings in a stratum is the maximum of the treatment and control sample sizes. In this case, the subsequent matching will have weights for use in analysis. We also test with this alternative approach; however, the approach does not significantly change any of the analysis results. Therefore we only include the analysis results of the more straightforward k2k approach.
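The k2k procedure with random selection can be sketched as below. This is a simplified Python illustration of the idea, not the cem package's implementation; the input format (a list of `(unit_id, stratum_key, is_treated)` tuples) and the `seed` parameter are assumptions made for the example.

```python
import random
from collections import defaultdict


def k2k_match(units, seed=0):
    """Randomly pair treated and control units one-to-one within each stratum.

    `units` is an iterable of (unit_id, stratum_key, is_treated) tuples.
    Returns a list of (treated_id, control_id) pairs; unmatched units are dropped.
    """
    rng = random.Random(seed)
    strata = defaultdict(lambda: ([], []))  # stratum -> (treated ids, control ids)
    for unit_id, stratum, is_treated in units:
        strata[stratum][0 if is_treated else 1].append(unit_id)

    pairs = []
    for treated, control in strata.values():
        rng.shuffle(treated)
        rng.shuffle(control)
        # zip stops at the shorter list, so each stratum contributes
        # min(len(treated), len(control)) pairs, without replacement.
        pairs.extend(zip(treated, control))
    return pairs
```

Because every retained treated unit has exactly one control partner, the matched groups are of equal size and no matching weights are needed in the subsequent regression, which is what makes k2k the more straightforward of the two approaches.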
Finally, for illustration, Tables 8 and 9 detail difference measures for covariates between treatment and control groups for M1 before and after matching, respectively. Notice that the differences between the treatment and control groups are diminished, though not completely removed. Therefore, the covariates are included in the final regression models to help control for the remaining differences.
|Covariate||Coarsened Categories||Final Category|
|Marital Status||Married, Domestic Partnership||Married|
|Marital Status||Single, Divorced, Separated, Widowed||Not Married|
|Employment Status||Employed - full-time, Employed - part-time, Self-Employed||Employed|
|Employment Status||Homemaker, Unemployed and not looking for a job/Permanently Disabled/Housewife, Student, Currently Unemployed, Unemployed but looking for a job, Retired, Military (a), Don’t Know / Not Sure||Not Employed|
|HH Income (USD)||<$15K, [$15K,$20K), [$20K,$25K)||Low-Income|
|HH Income (USD)||[$25K,$30K), [$30K,$40K), [$40K,$50K), [$50K,$75K)||Middle-Income|
|HH Income (USD)||[$75K,$100K), [$100K,$150K), >$150K||High-Income|
|Race||Other, Asian, Black||Non-White|
|Ethnicity||Not Hispanic, Don’t Know / Not Sure||Non-Hispanic|
(a) Military is counted as not employed simply because military members are typically excluded from the definition of civilian employment. In any case, the number of military users is very small.
|Covariate||Intervals for Discretization|
|Age (Years)||[18,25], [26,45], [46,82]|
|HH Size||, , [3,13]|
|Children in HH||, [1,9]|
|Display Size (Inches)||[2.8,3.5], (3.5,4.5], (4.5,5.5], (5.5,6.5]|
|Display Density (PPI) (b)||[132,200], (200,300], (300,400], (400,500], (500,800]|
|Display-to-Body Ratio (Fraction)||[0.34,0.5], (0.5,0.6], (0.6,0.7], (0.7,0.82]|
|RAM (GB)||[0.15,0.6], (0.6,1], (1,2], (2,3], (3,4], (4,6]|
|Usage Times (a)||6 equal width bins over range|
(a) Specifically smartphone, PC, and tablet usage times.
(b) Pixels per inch.
|M1||All Users||Has Tablet||514|
|M2||Has PC||Has Tablet||178|
|M3||All Users||Has PC||1218|
|M4 (a)||All Users||Has Tablet||548|
|M5 (b)||All Users||Has PC||1234|
(a) The difference between M1 and M4 is that M1 includes the PC usage time covariate in the matching while M4 does not include PC usage time because the dependent variable in H4 (total device usage) by definition includes PC usage time.
(b) The difference between M3 and M5 is that M3 includes the tablet usage time covariate in the matching while M5 does not include tablet usage time because the dependent variable in H5 (total device usage) by definition includes tablet usage time.
In this section, we present and discuss the main results of our analysis.
4.1. Basic Descriptive Statistics
We first illustrate several basic statistics of the active dataset to help provide further context since many of the variables are non-normal. Figure 2 illustrates usage time distributions by device type as a violin plot. Interestingly, all three device types have quite different distribution shapes. Smartphone has the highest median usage time. However, PC has greater variation with a thicker and longer tail with significant outliers.
Additionally, Figure 3 details the Kendall correlations between all numerical and ordinal covariates. Interestingly, there are relatively few strong correlations between demographic and technographic covariates, though the moderately sized correlations that exist appear reasonable. For example, the positive correlation between HH income and display density suggests that high-income users generally have higher quality and more expensive devices (assuming that display density is a rough proxy for device quality and price). For completeness, we calculate the correlation between display density and price (as reported by gsmarena.com) for the 144 unique device models of users from Finley et al. (2017) and find a strong positive Pearson correlation of 0.65.
4.1.1. Hypotheses Testing
To test the hypotheses, for each hypothesis we perform linear regression analysis over a matching and against a dependent variable as specified in Table 6. However, before regression analysis, we perform several diagnostics to ensure validity.
Regarding multicollinearity, highly correlated covariates in a regression model can cause problems including inflated standard errors. Thus we perform multicollinearity diagnostics on the covariates using the vif command of the car package (2.1-4) (Fox et al., 2016) in R, which reports generalized variance inflation factors (GVIF) for each covariate. First, we transform each GVIF to a factor comparable to non-generalized VIFs (denoted as a cVIF). The transformation is a function of the GVIF and the degrees of freedom of the covariate (Fox, 2002); the degrees of freedom for a numeric covariate is one and for a categorical covariate is the number of categories minus one. We then use a step-wise elimination technique in which we repeatedly remove the covariate with the highest cVIF over ten (a VIF cutoff of ten is a widely used guideline originally proposed in Marquardt (1970)) until no remaining covariates have cVIFs over ten. Through this technique, we remove the covariate household size in H2 and no covariates in H1, H3, H4 and H5.
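The step-wise elimination can be sketched as follows. Two points are assumptions rather than statements of the authors' exact procedure: first, we take the cVIF to be GVIF^(1/df), that is, the square of the GVIF^(1/(2·df)) quantity that Fox and Monette propose as comparable to the square root of an ordinary VIF; second, `gvif_fn` is a hypothetical callback standing in for refitting the model and calling a GVIF routine such as car's vif.

```python
def cvif(gvif, df):
    """Assumed transformation: GVIF**(1/df), comparable to an ordinary VIF.

    For a one-degree-of-freedom covariate this reduces to the GVIF itself.
    """
    return gvif ** (1.0 / df)


def stepwise_eliminate(covariates, gvif_fn, cutoff=10.0):
    """Repeatedly drop the covariate with the highest cVIF above `cutoff`.

    `gvif_fn(kept)` is a hypothetical callback returning {name: (gvif, df)}
    for the model refitted on the covariates in `kept`.
    Returns (kept, removed).
    """
    kept = list(covariates)
    removed = []
    while len(kept) > 1:
        scores = {name: cvif(g, df) for name, (g, df) in gvif_fn(kept).items()}
        worst = max(scores, key=scores.get)
        if scores[worst] <= cutoff:
            break  # no remaining covariate exceeds the cutoff
        kept.remove(worst)
        removed.append(worst)
    return kept, removed
```

Recomputing the factors after every removal matters because dropping one covariate (for example household size) typically lowers the inflation factors of the covariates it was correlated with (for example children in household).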
In terms of linear regression assumptions, the data in each matching generally does not meet the assumptions of normality of errors or homoscedasticity. These assumption violations are due primarily to the presence of significant outliers. Therefore, we use robust linear regression that is resistant to outliers and the mentioned assumption violations in general. Specifically, we utilize the robust linear regression command lmrob of the robustbase package (0.92-7) (Rousseeuw et al., 2009) in R. The command builds a linear regression model through an MM-type regression estimator (Yohai, 1987; Koller and Stahel, 2011). This type of estimator has a high breakdown point of 50% and high asymptotic efficiency. Specifically, in our case, the full estimator chain is an S-estimate, M-estimate, D-estimate (Design Adaptive Scale (Koller and Stahel, 2011)), and another M-estimate as recommended in Koller and Stahel (2017).
The estimator achieves robustness partly by down-weighting severe outliers (the resulting weights are known as robustness weights). In our case, this down-weighting of severe outliers can be theoretically justified since several of our outliers are in any case potentially suspect. For example, for smartphones, the maximum observed usage time is implausibly high at about 30 days, indicating continuous usage (with the display on) for the entire month (comparatively, the 95th percentile of smartphone usage time is only about 9.5 days). This measurement is unlikely to represent the actual usage time of a single user and might instead be a measurement error.
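To illustrate what a robustness weight is, the sketch below computes weights with the Tukey bisquare function, a common choice in M-type estimation. This is an illustrative stand-in: the psi function and tuning constants actually used by lmrob in the authors' configuration may differ, and the constant 4.685 is simply the conventional bisquare tuning value for high efficiency under normal errors.

```python
def bisquare_weight(residual, scale, c=4.685):
    """Tukey bisquare robustness weight for one residual.

    Residuals near zero get a weight near 1; residuals beyond c * scale
    get a weight of exactly 0 and so do not influence the fit.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    u = residual / (c * scale)
    if abs(u) >= 1.0:
        return 0.0
    return (1.0 - u * u) ** 2


# Usage example: a typical residual, a moderate one, and a severe outlier,
# all expressed in units of the robust residual scale.
weights = [bisquare_weight(r, scale=1.0) for r in (0.0, 2.0, 10.0)]
```

Under such a scheme, an observation like the 30-day smartphone usage time would receive a weight at or near zero if its residual is far from the bulk of the data, while typical users keep weights close to one.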
Finally, we perform the robust linear regression analyses. Table 6 details the results of each hypothesis test including the estimated regression coefficient, statistical significance, and whether these results support the hypothesis. Additionally, for illustration, Figures 4 and 5 detail the differences in distributions (kernel density estimates) that characterize each hypothesis test result.
For H1, we find that tablet ownership decreases smartphone usage by about 12.50 hours a month thus supporting H1 and the statement that tablets are a partial substitute for smartphones.
For H2, we find that tablet ownership does not significantly decrease PC usage thus refuting H2 and the statement that tablets are a partial substitute for PCs. Interestingly, this refutes the results in Finley et al. (2016, Section 5.2) which did not use a matching method. Furthermore, as we will show in the PC subtype testing in Section 4.1.3, this finding holds even when considering only laptop users (rather than all PC users). Overall, though, the sample size of the M2 matching (n=148) is still relatively small and, as we will discuss in Section 4.1.2, we are unable to perform device sharing testing for H2. Therefore, interpretations based on H2 results should still be made with caution.
In terms of H3, we find surprisingly that PC ownership decreases smartphone usage by about 13 hours a month thus refuting H3 and the statement that PCs are not a partial substitute for smartphones. In other words, we find support for the statement that PCs are also a partial substitute for smartphones.
For H4 and H5, we find that tablet and PC ownership significantly increase total device usage by about 20 and 57 hours a month, respectively, therefore supporting H4 and H5 and the statements that tablet and PC ownership prompt additional (non-substituted) usage. The support for H4 backs up the findings of Müller et al. (2015) based on self-reported electronic diaries. However, as we will discuss in Section 4.1.2, device sharing testing for H4 suggests that the additional (non-substituted) tablet usage could be the result of device sharing within households. Finally, the support for H5 reinforces the findings of Matthews et al. (2009) based on semi-structured interviews. The device sharing testing for H5 suggests that the additional (non-substituted) PC usage is not likely the result of device sharing within households. Additionally, the support for H5 also holds when considering laptop users and desktop users separately, though with desktop ownership increasing total device usage by 96 hours/month compared to 54 hours/month for laptop ownership.
Also in terms of diversity, the coefficient standard errors for H1, H3, and H4 are relatively large indicating substantial diversity in substituted and additional (non-substituted) usage between users. In other words, even though the coefficients are reasonable and descriptive point estimates, user diversity is also a part of the story. Similarly, user usage diversity has been observed extensively in previous mobile usage studies (Falaki et al., 2010; Böhmer et al., 2011; Soikkeli et al., 2013; Finley and Soikkeli, 2017; Hintze et al., 2017).
|Hypothesis||Matching||Depen. Variable||Coefficients (hours/month)a||Supported?|
|H1||M1||Smartphone Usage||-12.51 (4.99)||Yes|
|H2||M2||PC Usage||-0.92 (11.94)||No|
|H3||M3||Smartphone Usage||-13.02 (3.26)||No|
|H4||M4||Total Device Usage||19.58 (6.62)||Yes|
|H5||M5||Total Device Usage||57.43 (4.62)||Yes|
Regression coefficients include standard errors and significance levels ( : 5%, : 1%, : 0.1%).
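As a rough reading aid for the table above, the ratio of each coefficient to its standard error gives an approximate z-statistic. The sketch below assumes a normal approximation, which is a simplification introduced here and not necessarily the inference procedure used with the regression in this study; the names `results` and `z_and_p` are illustrative.

```python
import math

# Coefficients and standard errors (hours/month) copied from the table above.
results = {
    "H1": (-12.51, 4.99),
    "H2": (-0.92, 11.94),
    "H3": (-13.02, 3.26),
    "H4": (19.58, 6.62),
    "H5": (57.43, 4.62),
}


def z_and_p(coef, se):
    """Return the z-statistic and a two-sided p-value (normal approximation)."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| > |z|) for a standard normal Z
    return z, p


summary = {h: z_and_p(coef, se) for h, (coef, se) in results.items()}
for h, (z, p) in summary.items():
    print(f"{h}: z = {z:6.2f}, approximate p = {p:.4f}")
```

Note that significance alone does not decide the "Supported?" column: the H3 coefficient is clearly different from zero, yet H3 is refuted because the hypothesis stated that PCs are not a substitute for smartphones.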
4.1.2. Device Sharing Testing
As mentioned, device sharing between members of a household is a potential concern in the testing of H2, H4, and H5. Therefore we also test these hypotheses with only one-person household users (thus lessening the possibility of shared devices). Tables 10 and 11 in the appendix detail the matching and results of this testing respectively.
We find no significant difference for H5. However, H4 does show a significant difference (specifically, the tablet ownership variable is no longer significant). Therefore for H4 we cannot exclude the possibility that the observed additional (non-substituted) tablet usage is the result of device sharing. Though, we note that the matching sizes for these one-person household tests are (naturally) quite a bit smaller. Finally, we are unable to test H2 with one-person household users because the sample size is too small (24 users). We note, however, that we can apply the correlation approach (from Finley et al. (2016) and Section 4.1.4) to this subset; in this case the testing does not find results statistically different from those of the correlation approach with all households.
4.1.3. PC Subtype (Desktop and Laptop) Testing
Additionally, we test the PC related hypotheses with the PC subtypes of desktop and laptop separately. Specifically, we test hypotheses H2, H3, and H5 with laptop-only users (69.15% of PC users) and H3 and H5 with desktop-only users (20.29% of PC users). Unfortunately, we are unable to test H2 with desktop-only users because the sample size is too small (38 users). We exclude users with multiple PCs from the tests (7.6% of PC users). Tables 12 and 13 in the appendix detail the matching and results of this testing respectively. We find no significant difference with the exception that, as previously discussed, desktop ownership appears to prompt more additional (non-substituted) usage than laptop ownership.
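The desktop versus laptop gap for H5 can be gauged with a back-of-the-envelope comparison of the two coefficients reported in Table 13. The sketch below assumes the two estimates are independent and approximately normal; this is an assumption made here for illustration (both matchings draw controls from the same user pool), not a test performed in the study.

```python
import math

# H5 coefficients and standard errors (hours/month) as reported in Table 13.
laptop = (54.00, 5.03)    # matching M11
desktop = (95.89, 9.58)   # matching M14


def diff_z(a, b):
    """Difference b - a, its standard error, and z, assuming independent estimates."""
    diff = b[0] - a[0]
    se = math.sqrt(a[1] ** 2 + b[1] ** 2)
    return diff, se, diff / se


diff, se, z = diff_z(laptop, desktop)
print(f"difference = {diff:.2f} hours/month, SE = {se:.2f}, z = {z:.2f}")
```

Under these assumptions the gap of roughly 42 hours/month is several standard errors away from zero, in line with the observation that desktop ownership appears to prompt more additional usage.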
4.1.4. Correlation Approach for H1-H3
For comparison purposes, we also calculate the correlation coefficients between device type usage times for different device type ownership groups (for example, the correlation between tablet and PC usage times for the group of users with both a tablet and PC). This approach is the same as used in Finley et al. (2016) (though Finley et al. (2016) used a smaller dataset). While the regression approach answers the question "how does device type ownership affect the usage of another device type?", the correlation approach answers the related question "how does device type usage affect the usage of another device type?".
The correlation coefficients are calculated based on a robust correlation coefficient known as the OP correlation (Wilcox, 2017). The OP correlation skips extreme outliers in the correlation calculation. The estimate of the OP correlation coefficient and significance level is performed through a percentile bootstrap method (with 1000 bootstraps) that is robust to heteroscedasticity (Wilcox, 2017). Table 7 details these correlations and their significance levels.
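The percentile bootstrap step can be sketched as follows. This is a minimal illustration only: it substitutes the plain Pearson correlation for the OP correlation, so the outlier-skipping step described in Wilcox (2017) is not implemented, and the usage times are synthetic values generated for the example rather than panel data. All function and variable names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation (stand-in for the OP correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample with no variation
    return sxy / math.sqrt(sxx * syy)


def percentile_bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation of paired data."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Resample user pairs with replacement, keeping each (x, y) pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    k = round(alpha / 2 * n_boot)
    return stats[k], stats[n_boot - k - 1]


# Synthetic monthly usage hours for 200 hypothetical users (not study data).
gen = random.Random(1)
usage_a = [gen.uniform(0, 100) for _ in range(200)]
usage_b = [50 - 0.3 * a + gen.gauss(0, 5) for a in usage_a]

lo, hi = percentile_bootstrap_ci(usage_a, usage_b)
print(f"95% percentile bootstrap CI: [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes zero, as for the H1 and H3 rows of Table 7, indicates a correlation that is significant at the chosen level.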
As expected, the correlations support the conclusions (H1-H3) of the regression analysis. In comparison to Finley et al. (2016), we find a correlation between tablet and PC usage of 0.01 compared to -0.24 in Finley et al. (2016). In other words, we still find a significant difference between Finley et al. (2016) and the current study even when using the same approach (without matching). The exact reason for this difference is difficult to pinpoint, though the sample size in Finley et al. (2016) was only 52 users compared to 377 in this analysis.
|Hypothesis||User Subseta||Users||Corr Variablesa||Corrb||Supported?|
|H1||Has S and T||843||S and T Usage Times||-0.12 [-0.20, -0.04]||Yes|
|H2||Has T and PC||377||T and PC Usage Times||0.01 [-0.11, 0.15]||No|
|H3||Has S and PC||2610||S and PC Usage Times||-0.12 [-0.16, -0.08]||No|
Overall, we find support for device type substitution between smartphones and both tablets and PCs.
Regarding smartphones and tablets, several theoretical factors support such substitution. For example, tablets generally run the same or only slightly modified smartphone applications, thus making substitution easier as the cost of learning a substitute program or alternative interface is small. Additionally, both smartphones and tablets are used extensively during the evening hours typically spent at home, thus providing the usage overlap needed for such substitution. However, other factors inhibit substitution of certain usage. For example, the smartphone's smaller and more portable form factor compared to tablets naturally matches well to quick (often informational) glances known as micro-usage (Ferreira et al., 2014). Therefore this type of usage is unlikely to be substituted to a tablet. Additionally, such micro-usage is more concentrated during the mid-day hours when tablets are not as available (Ferreira et al., 2014).
In terms of smartphones and PCs, several of the factors supporting and inhibiting substitution between smartphones and tablets, such as usage overlap and differences in form factor, are similarly applicable.
Finally, the lack of substitution between tablets and PCs is harder to explain theoretically. As we note in Section 3.4, tablets and PCs may be shared among several members of a household, therefore substitution effects might be more difficult to identify because usage is entangled between more users. Furthermore, we were unable to test H2 with only one person household users due to too small a sample size. Therefore, further research including qualitative research should be performed to elucidate the relationship between tablet and PC usage and provide more context. Additionally, the initial panel recruitment survey could be adjusted to include questions about the extent of device sharing within a household.
With regard to additional (non-substituted) usage, we find support for such usage with both tablets and PCs. Though, we cannot rule out that the additional tablet usage is the result of device sharing within households. In both the tablet and PC cases, theory suggests that such additional activities might be prompted by the ability to use the unique advantages of each device type. For example, the advantage of a large tablet display over a small smartphone display might prompt additional video sessions beyond the normal sessions that would occur with a smartphone. Towards this end, larger displays have been shown to increase user immersion in videos and games (Rigby et al., 2016; Thompson et al., 2012) and user enjoyment in other tasks (Kim et al., 2011).
There are several limitations of the study that should be noted.
As previously discussed, generalizability is a potential concern and limitation of our study. Though we study an overall diverse user group (both demographically and technographically), we still acknowledge differences between our active dataset and the US smartphone population in general (as detailed in Table 1). Additionally, as mentioned, the lack of support for monitoring Apple macOS means that Apple PC users are not included in the active dataset. Therefore generalizations to this specific user group are limited. These limitations should be considered when drawing general conclusions.
Relatedly, our analysis is limited to only a single month. Therefore inter-month variability in user usage may affect certain quantitative results. Though assuming such variability is asynchronized between users, such effects should remain small (users are geographically dispersed across the US, thus negating local geographic synchronization). Furthermore, given a moderate panel churn rate, longitudinal user analyses face a trade-off between sample size and analysis period length (though we note that Verto Analytics does use retention bonuses to try to help reduce churn). In this study, we use a relatively large sample size and short analysis period, whereas future studies could analyze different combinations (e.g., a smaller sample size but a longer analysis period).
Finally, as mentioned, display-off usage (such as listening to music) is not included in the analysis. Therefore the results are limited to display-on usage substitution. Additionally, device type substitution where display-off usage (of one device) is substituted for display-on usage (of another device) or vice versa could affect device substitution estimates. Specifically, additional (non-substituted) usage would be overestimated and the substituted usage underestimated or vice versa. Though we estimate this effect would be small due to the dominance of display-on usage, further research is needed to clarify fully.
Additional limitations inherent to device-based monitoring studies, in general, are also applicable. For example, the analysis does not compensate for the time between when the user stops using the device and when the display turns off due to inactivity (assuming the user does not explicitly turn the display off). As discussed in Hintze et al. (2017), this is a limitation of all similar studies as typical devices cannot yet track user attention.
5.2. Implications and Future Work
Regarding result implications, we briefly discuss the implications for mobile advertisers and ad-driven application providers.
As background, for many mobile advertisers and application providers the atomic unit of mobile ad inventory is the impression (impression inventory is often sold in units of 1000; an alternative inventory unit is the click, where advertisers pay per click rather than per impression). An impression is the displaying of an ad (often a banner at the top or bottom of the app) within an application for a typically short length of time (the display time depends on the application's ad refresh rate). Therefore a primary determinant of an application provider's impression inventory is the total usage time for the application.
Thus, given the relationship between usage time and inventory, device type substitution patterns help in understanding inventory changes for different device types. For example, a further adoption of tablet devices (by smartphone users) might both increase tablet impression inventory at the expense of smartphone impression inventory and generate additional tablet impression inventory. In other words, further adoption can both shift inventory and change the type of ad inventory (as smartphone and tablet ads are considered separate).
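To make the link between usage time and inventory concrete, the following sketch converts a monthly usage figure into an impression count. The 30-second refresh interval, the single continuously displayed ad slot, and the premise that all of the shifted usage occurs in ad-supported applications are hypothetical simplifications chosen for illustration, not values from this study.

```python
def impressions_from_usage(hours_per_month, refresh_seconds=30.0):
    """Impressions served by one ad slot that refreshes every `refresh_seconds`.

    Both the default refresh interval and the single-slot premise are
    hypothetical; real applications vary widely.
    """
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    return hours_per_month * 3600.0 / refresh_seconds


# Roughly 12.5 hours/month of smartphone usage substituted per tablet owner (H1).
shifted = impressions_from_usage(12.5)
print(f"{shifted:.0f} impressions per user per month move off the smartphone")
```

Under these assumptions, the substituted usage corresponds to about 1.5 units of inventory per tablet-owning user per month when inventory is sold per 1000 impressions.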
Beyond even these general understandings, the explicit quantification of such substitution can help in parameterizing future higher level research models such as a system dynamics model of the mobile content ecosystem. Such system dynamics models have been helpful in characterizing other parts of the mobile ecosystem such as digital service platform competition (Ruutu et al., 2017), mobile voice diffusion and competition (Casey and Töyli, 2012), and mobile application services (stores) (Wang et al., 2016). As Casey and Töyli (2012) even admit, a common reason for not performing quantitative system dynamic modeling is a lack of data (used for initial parameterization and relationship specification). We look to explore such modeling in future work.
Regarding additional future work, an application level analysis of substitution and additional usage similar to the application level analysis in Finley and Soikkeli (2017) would be a natural next step. The primary challenge in such an analysis is reconciling the device type and platform differences in application names, categories, etc. especially in the case of smartphone and tablet versus PC.
In this work, we provide estimates of device type substitution using device-based monitoring data and a robust paired matching method (to isolate the substitution effect). More specifically, the estimates allow the testing of five device type substitution hypotheses that span three different device types (smartphone, tablet, and PC). The results suggest that tablets and PCs are both partial substitutes for smartphones and yet also prompt significant additional (non-substituted) usage. Quantitatively, the estimates indicate that tablet ownership and PC ownership decrease smartphone usage by 12.5 and 13 hours/month, while prompting 20 and 57 hours/month of additional (non-substituted) usage respectively. Though we also find significant inter-user diversity in terms of these estimates. Finally, the results suggest that tablets are not a substitute for PCs despite the similarities between PCs (primarily laptops) and tablets in terms of portability and display size. Though this result is less robust and requires further study. Overall the results have implications, for example, for current mobile ecosystem players such as mobile advertisers and content providers and for parameterizing future research models.
Acknowledgements. The authors would like to thank Timo Smura, Heikki Hämmäinen, and Kalevi Kilkki for providing feedback on the manuscript.
- Arora et al. (2017) Sandeep Arora, Frenkel ter Hofstede, and Vijay Mahajan. 2017. The Implications of Offering Free Versions for the Performance of Paid Mobile Apps. Journal of Marketing 81, 6 (2017), 62–78. https://doi.org/10.1509/jm.15.0205
- Böhmer et al. (2011) Matthias Böhmer, Brent Hecht, Johannes Schöning, Antonio Krüger, and Gernot Bauer. 2011. Falling Asleep with Angry Birds, Facebook and Kindle: A Large Scale Study on Mobile Application Usage. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI ’11). ACM, New York, NY, USA, 47–56.
- Casey and Töyli (2012) Thomas R. Casey and Juuso Töyli. 2012. Mobile voice diffusion and service competition: A system dynamic analysis of regulatory policy. Telecommunications Policy 36, 3 (2012), 162–174.
- Church et al. (2015) Karen Church, Denzil Ferreira, Nikola Banovic, and Kent Lyons. 2015. Understanding the Challenges of Mobile Phone Usage Data. In Proceedings of the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI ’15). ACM, New York, NY, USA, 504–514.
- Cotten et al. (2014) Shelia R. Cotten, George Ford, Sherry Ford, and Timothy M. Hale. 2014. Internet Use and Depression Among Retired Older Adults in the United States: A Longitudinal Analysis. The Journals of Gerontology: Series B 69, 5 (2014), 763–771. https://doi.org/10.1093/geronb/gbu018
- de Reuver et al. (2012) Mark de Reuver, Harry Bowman, Nico Heerschap, and Hannu Verkasalo. 2012. Smartphone Measurement: Do People Use Mobile Applications as They Say They Do?. In International Conference on Mobile Business. Association for Information Systems, Atlanta, GA, USA, 1–10.
- Falaki et al. (2010) Hossein Falaki, Ratul Mahajan, Srikanth Kandula, Dimitrios Lymberopoulos, Ramesh Govindan, and Deborah Estrin. 2010. Diversity in Smartphone Usage. In Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services (MobiSys ’10). ACM, New York, NY, USA, 179–194.
- Ferreira et al. (2014) Denzil Ferreira, Jorge Goncalves, Vassilis Kostakos, Louise Barkhuus, and Anind K. Dey. 2014. Contextual Experience Sampling of Mobile Application Micro-usage. In Proceedings of the 16th International Conference on Human-computer Interaction with Mobile Devices & Services (MobileHCI ’14). ACM, New York, NY, USA, 91–100. https://doi.org/10.1145/2628363.2628367
- Finley et al. (2017) Benjamin Finley, Eren Boz, Kalevi Kilkki, Jukka Manner, Antti Oulasvirta, and Heikki Hämmäinen. 2017. Does network quality matter? A field study of mobile user satisfaction. Pervasive and Mobile Computing 39 (2017), 80–99.
- Finley and Soikkeli (2017) Benjamin Finley and Tapio Soikkeli. 2017. Multidevice mobile sessions: A first look. Pervasive and Mobile Computing 39 (2017), 267–283. https://doi.org/10.1016/j.pmcj.2016.11.001
- Finley et al. (2016) Benjamin Finley, Tapio Soikkeli, and Kalevi Kilkki. 2016. Mobile Application Usage Concentration in a Multidevice World. In Proceedings of the 13th International Joint Conference on e-Business and Telecommunications. Science and Technology Publications, Setubal, Portugal, 40–51. https://doi.org/10.5220/0005964000400051
- Fox (2002) John Fox. 2002. An R and S-Plus companion to applied regression. Sage, Thousand Oaks, CA, US.
- Fox et al. (2016) John Fox, Sanford Weisberg, Daniel Adler, Douglas Bates, Gabriel Baud-Bovy, Steve Ellison, David Firth, Michael Friendly, Gregor Gorjanc, Spencer Graves, et al. 2016. Package ‘car’. (2016). https://cran.r-project.org/web/packages/car/index.html
- GlobalWebIndex (2017) GlobalWebIndex. 2017. 60% of Tablet Users Sharing their Device. (2017). http://blog.globalwebindex.net/chart-of-the-day/60-tablet-users-sharing-device/
- Hays et al. (2015) Ron D. Hays, Honghu Liu, and Arie Kapteyn. 2015. Use of Internet panels to conduct surveys. Behavior Research Methods 47, 3 (2015), 685–690.
- Hintze et al. (2017) Daniel Hintze, Philipp Hintze, Rainhard D. Findling, and René Mayrhofer. 2017. A Large-Scale, Long-Term Analysis of Mobile Device Usage Characteristics. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 2, Article 13 (jun 2017), 21 pages.
- Iacus et al. (2009) Stefano Iacus, Gary King, and Giuseppe Porro. 2009. cem: Software for Coarsened Exact Matching. Journal of Statistical Software, Articles 30, 9 (2009), 1–27. https://doi.org/10.18637/jss.v030.i09
- Iacus et al. (2012) Stefano M. Iacus, Gary King, and Giuseppe Porro. 2012. Causal Inference without Balance Checking: Coarsened Exact Matching. Political Analysis 20, 1 (2012), 1–24.
- Kim et al. (2011) Ki Joon Kim, S. Shyam Sundar, and Eunil Park. 2011. The Effects of Screen-size and Communication Modality on Psychology of Mobile Device Users. In CHI ’11 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’11). ACM, New York, NY, USA, 1207–1212. https://doi.org/10.1145/1979742.1979749
- King and Nielsen (2015) Gary King and Richard Nielsen. 2015. Why Propensity Scores Should Not Be Used for Matching. Technical Report. Cambridge, MA, USA.
- Koller and Stahel (2011) Manuel Koller and Werner A. Stahel. 2011. Sharpening wald-type inference in robust regression for small samples. Computational Statistics & Data Analysis 55, 8 (2011), 2504–2515.
- Koller and Stahel (2017) Manuel Koller and Werner A. Stahel. 2017. Nonsingular subsampling for regression S estimators with categorical predictors. Computational Statistics 32, 2 (2017), 631–646.
- Marquardt (1970) Donald W. Marquardt. 1970. Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation. Technometrics 12, 3 (1970), 591–612.
- Matthews et al. (2009) Tara Matthews, Jeffrey Pierce, and John Tang. 2009. No smartphone is an island: The impact of places, situations, and other devices on smartphone use. Technical Report. San Jose, CA, USA.
- Müller et al. (2012) Hendrik Müller, Jennifer Gove, and John Webb. 2012. Understanding Tablet Use: A Multi-method Exploration. In Proceedings of the 14th International Conference on Human-computer Interaction with Mobile Devices and Services (MobileHCI ’12). ACM, New York, NY, USA, 1–10.
- Müller et al. (2015) Hendrik Müller, Jennifer L. Gove, John S. Webb, and Aaron Cheang. 2015. Understanding and Comparing Smartphone and Tablet Use: Insights from a Large-Scale Diary Study. In Proceedings of the Annual Meeting of the Australian Special Interest Group for Computer Human Interaction (OzCHI ’15). ACM, New York, NY, USA, 427–436.
- Pew Internet and American Life Project (2016) Pew Internet and American Life Project. 2016. Sept. 29-Nov. 6, 2016 – Information Engaged and Information Wary. (2016). http://www.pewinternet.org/dataset/sept-29-nov-6-2016-information-engaged-and-information-wary/
- Pew Research Center (2017) Pew Research Center. 2017. Our survey methodology in detail. (2017). http://www.pewresearch.org/methodology/u-s-survey-research/our-survey-methodology-in-detail/#data-weighting
- Rigby et al. (2016) Jacob M. Rigby, Duncan P. Brumby, Anna L. Cox, and Sandy J.J. Gould. 2016. Watching Movies on Netflix: Investigating the Effect of Screen Size on Viewer Immersion. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services Adjunct (MobileHCI ’16). ACM, New York, NY, USA, 714–721. http://doi.acm.org/10.1145/2957265.2961843
- Rousseeuw et al. (2009) PJ Rousseeuw, Christophe Croux, Valentin Todorov, Andreas Ruckstuhl, Matias Salibian-Barrera, Tobias Verbeke, and M. Maechler. 2009. Robustbase: basic robust statistics. (2009). http://CRAN.R-project.org/package=robustbase
- Ruutu et al. (2017) Sampsa Ruutu, Thomas Casey, and Ville Kotovirta. 2017. Development and competition of digital service platforms: A system dynamics approach. Technological Forecasting and Social Change 117 (2017), 119–130.
- Shmorgun et al. (2013) Ilya Shmorgun, David Lamas, and Mattias Saks. 2013. A Sample of Technology Substitution. In Proceedings of the International Conference on Multimedia, Interaction, Design and Innovation (MIDI ’13). ACM, New York, NY, USA, Article 15, 7 pages.
- Soikkeli et al. (2013) Tapio Soikkeli, Juuso Karikoski, and Heikki Hammainen. 2013. Characterizing Smartphone Usage: Diversity and End User Context. International Journal of Handheld Computing Research 4, 1 (2013), 15–36. https://doi.org/10.4018/jhcr.2013010102
- Thompson et al. (2012) Matt Thompson, A. Imran Nordin, and Paul Cairns. 2012. Effect of Touch-screen Size on Game Immersion. In Proceedings of the 26th Annual BCS Interaction Specialist Group Conference on People and Computers (BCS-HCI ’12). British Computer Society, Swinton, UK, 280–285. http://dl.acm.org/citation.cfm?id=2377916.2377952
- Wang et al. (2016) Juite Wang, Jung-Yu Lai, and Chih-Hsin Chang. 2016. Modeling and analysis for mobile application services: The perspective of mobile network operators. Technological Forecasting and Social Change 111 (2016), 146–163.
- Wen and Zhu (2017) Wen Wen and Feng Zhu. 2017. Threat of Platform-Owner Entry and Complementor Responses: Evidence from the Mobile App Market. Technical Report.
- Wilcox (2017) Rand Wilcox. 2017. Introduction to Robust Estimation and Hypothesis Testing (4 ed.). Elsevier, Amsterdam, Netherlands.
- Yohai (1987) Victor J Yohai. 1987. High breakdown-point and high efficiency robust estimates for regression. The Annals of Statistics 15, 2 (1987), 642–656.
Appendix A Appendix
|PC Usage Time||10.56||(Con)||0.00||0.00||0.00||1.77||28.95||341.72|
|Children in HH||0.05||(Con)||0.01||0.00||0.00||0.00||1.00||3.00|
For continuous (Con) covariates the statistic is the difference in means, whereas for categorical (Cat) type covariates the statistic is the Chi2 test value.
|PC Usage Time||0.88||(Con)||0.00||0.00||0.00||0.00||6.96||-36.00|
|Children in HH||0.08||(Con)||0.04||0.00||0.00||0.00||0.00||0.00|
For continuous (Con) covariates the statistic is the difference in means, whereas for categorical (Cat) type covariates the statistic is the Chi2 test value.
|M6||One Person HH and Has PC||Has Tablet||24|
|M7||One Person HH||Has Tablet||84|
|M8||One Person HH||Has PC||214|
|Hypothesis||Matching||Depen. Variable||Coef. (hours/month)a||Supported?|
|H4||M7||Total Device Usage||18.75 (19.96)||No|
|H5||M8||Total Device Usage||52.21 (10.39)||Yes|
Regression coefficients include standard errors and significance levels ( : 5%, : 1%, : 0.1%).
The matching size is too small for analysis.
|M9||Has PC (Laptop)||Has Tablet||104|
|M10||All Users||Has PC (Laptop)||996|
|M11||All Users||Has PC (Laptop)||1006|
|M12||Has PC (Desktop)||Has Tablet||38|
|M13||All Users||Has PC (Desktop)||390|
|M14||All Users||Has PC (Desktop)||396|
|Hypothesis||Matching||Depen. Variable||Coef. (hours/month)a||Supported?|
|H2||M9||PC Usage (Laptop)||3.95 (18.18)||No|
|H3||M10||Smartphone Usage||-16.30 (3.71)||No|
|H5||M11||Total Device Usage||54.00 (5.03)||Yes|
|H2||M12||PC Usage (Desktop)||-b||-|
|H3||M13||Smartphone Usage||-15.94 (5.43)||No|
|H5||M14||Total Device Usage||95.89 (9.58)||Yes|
Regression coefficients include standard errors and significance levels ( : 5%, : 1%, : 0.1%).
The matching size is too small for analysis.