A Closer Look into Mobile Network Speed Measurements

10/21/2017 · by Cise Midoglu, et al. · Simula Research Lab

As the demand for mobile connectivity continues to grow, there is a strong need to quantify the performance of Mobile Broadband (MBB) networks. In recent years, mobile speed gained popularity as the most commonly accepted metric to describe performance. However, there is no consensus on how exactly to measure mobile speed. In this paper, we explore the concept of crowdsourced performance measurements in MBB networks. First, we analyse the methodology of a number of representative tools, such as Ookla Speedtest, OpenSignal, RTR-Nettest, and MobiPerf, and investigate whether these tools essentially measure the same metrics. Second, we introduce MONROE-Nettest, a configurable tool for mobile speed measurements which can be adapted to run different measurement methodologies. Through the analysis of MONROE-Nettest measurements over commercial mobile networks, which we provide as an open dataset, we identify and quantify the key factors affecting the results of mobile speed measurements.


1 Introduction

The use of mobile networks has exploded over the last few years due to the immense popularity of powerful mobile devices combined with the availability of high-capacity 3G/4G mobile networks. Forecasts indicate that global mobile data traffic will increase sevenfold between 2016 and 2021, reaching tens of Exabytes (EB) per month by 2021 and accounting for a substantial share of all IP traffic [1]. This reflects the insatiability of mobile data consumers and the rapid growth in demand for faster mobile connectivity. Given the increasing importance of Mobile Broadband (MBB) networks and the expected growth in mobile traffic, there is a strong need for better understanding the fundamental characteristics of MBB networks and the services that they can provide.

Ensuring a smooth mobile experience for customers is key to business success for Mobile Network Operators (MNOs), and mobile “speed” is a foolproof marketing resource (i.e., it is straightforward for any customer to understand that, when it comes to mobile connectivity, higher speeds are better). Consequently, in recent years, the term mobile speed has become the major indicator of an MNO’s performance. For instance, the broadband testing company Ookla gives out Fastest Broadband and Mobile Network Awards every year, where they use a “Speed Score” calculated from the Downlink (DL) data rate and Uplink (UL) data rate [2]. Similarly, OpenSignal compiles yearly reports on the state of mobile networks around the world with award tables, where the average DL data rate is used as the “speed” indicator [3]. The rankings and reports from such performance monitoring entities steer public opinion in the MBB market and impact customer behavior, since the simple concept of speed resonates well with end-users.

Despite the apparent simplicity of the term, there is a lack of consensus about how to measure mobile speed (e.g., DL and UL data rates) accurately and consistently. For example, all of the commercial mobile speed measurement tools (e.g., Speedtest and OpenSignal) are proprietary and their measurement methodologies are not open (i.e., only high-level explanations in the tool descriptions and FAQ pages are provided). This lack of transparency can result in controversy, as in the case of Ookla designating Airtel as India’s fastest 4G network in 2016, a move questioned by another operator, Reliance Jio, which subsequently led to a complaint to the Advertising Standards Council of India (ASCI) [4]. To eliminate such controversy, as well as to verify that the services paid for by end-users are actually provided by their MNOs, National Regulatory Authorities (NRAs) have come up with their own crowdsourced solutions. Notable examples of regulatory tools include the FCC Speed Test [5] in the US, RTR-Nettest in Austria [6, 7], and similar tools based on the RTR Multithreaded Broadband Test (RMBT) in other countries such as Croatia, the Czech Republic, Slovakia, Slovenia, and Norway. Despite their fundamental similarities, different tools use different parameter sets during measurement (e.g., measurement duration, number of Transmission Control Protocol (TCP) flows, and server location) [8], leading to significant differences in their reported results [9]. Therefore, it is important to understand how different parameters influence mobile speed measurements.

In this paper, we design and implement an open-source, configurable measurement tool, MONROE-Nettest, that integrates the measurement functionality of existing tools to understand how different parameters influence mobile speed measurements. MONROE-Nettest is built as an Experiment as a Service (EaaS) on top of the Measuring Mobile Broadband Networks in Europe (MONROE) platform (https://www.monroe-project.eu/), Europe’s first and only open dedicated platform for experimentation in operational MBB networks. MONROE makes it possible to conduct a wide range of repeatable measurements in the same location, using the same devices/modems, for multiple operators at the same time, and from an end-user perspective. Therefore, MONROE-Nettest enables the empirical analysis of mobile speed measurements through large-scale experimentation in operational MBB networks. We demonstrate a use case of MONROE-Nettest by quantifying the effects of several parameters on DL data rate through a systematic large-scale campaign over 6 operational mobile networks in Europe. Our results show that differences in configuration can significantly affect measurement results, and that datasets from different tools need to be baselined in a controlled framework for comparability.

2 Background and Related Work

User Datagram Protocol (UDP)-based measurements are widely used to explore the available capacity of a given network [10]. In our previous work, we presented such an end-to-end active measurement method [11]. However, TCP is more representative of the share of the bottleneck capacity that end users experience, especially with short-lived application traffic, which is most common in today’s Internet. Therefore, speed measurement tools adopt a user-oriented approach for measuring data rate and use TCP-based testing with multiple parallel flows.

Speed measurements have been well studied in fixed broadband networks [12, 13], where a comparison of different methodologies for browser-based testing is provided. However, many concepts that are already well understood in fixed broadband networks are harder to track, evaluate, and, more importantly, quantify in MBB networks [9]. Therefore, it is important to revisit speed measurements considering the complexity of MBB networks. When it comes to mobile speed measurements, crowdsourced tools such as the Network Diagnostic Tool (NDT) [14], MobiPerf [15], and Netalyzr [16, 17] are commonly used to collect measurements from a large number of MBB users. A recent survey provides a comprehensive summary of crowdsourced tools intended for end-user measurements of mobile networks through smartphone applications [8]. However, despite its breadth and relative depth, it does not provide an empirical comparative analysis of the different measurement methodologies.

While crowdsourced approaches provide a large number of vantage points spread throughout different regions and access to numerous networks and user equipment, repeatability is challenging to achieve and one can only collect measurement data at users’ own will. Dedicated infrastructures, on the other hand, can provide active, systematic measurements that can be run at regular intervals over long time periods. Moreover, they allow fair assessment of each network by following the same measurement methodology and running the measurements under similar conditions (e.g., same device, same operating system, same environment). There are many dedicated testbeds for measuring broadband networks, such as PlanetLab [18], GENI [19], and Fed4FIRE [20]. However, support for MBB is limited to Fed4FIRE, and none of its associated testbeds, such as NITOS [21] and WiLAB [22], supports experimentation in operational MBB networks. To this end, MONROE [23, 24, 25] is the only dedicated testbed that enables large-scale end-to-end measurements in operational MBB networks. In this paper, building on the MONROE platform, we design and implement a configurable tool that integrates the measurement functionalities of existing crowdsourced applications. MONROE-Nettest enables the empirical analysis of mobile speed measurements through large-scale experimentation in operational MBB networks. The tool is provided as EaaS and allows any interested party to investigate different aspects of speed measurements using MONROE.

3 Tool Design and Implementation

In this section, we first provide an overview of the MONROE platform, and then detail the design and implementation of MONROE-Nettest.

3.1 MONROE Platform

MONROE [24] is a European transnational open platform, and the first open-access hardware-based platform for independent, multi-homed, and large-scale MBB measurements on commercial networks. The platform comprises a set of 150 nodes, both mobile (e.g., operating in delivery trucks and on board public transport vehicles such as trains or buses) and stationary (e.g., volunteers hosting nodes in their homes). Nodes are multi-homed to 3 different MBB operators using commercial-grade subscriptions in several European countries (Italy, Norway, Spain, Sweden, Portugal, Greece, and the UK).

Each MONROE node integrates two small programmable computers (PC Engines APU2 boards interfacing with three 3G/4G MC7455 miniPCI-express modems using LTE CAT6 and one WiFi modem). The software on the node is based on the Debian GNU/Linux “stretch” distribution, and each node collects metadata from the modems, such as carrier, technology, signal strength, GPS location, and sensor data. This information is made available to experimenters during execution. Experiments running on the platform use Docker containers (lightweight virtualized environments) to provide agile reconfiguration.

User access to the platform resources is through a web portal, which allows an authenticated user to use the MONROE scheduler to deploy experiments. This enables exclusive access to nodes (i.e., no two experiments run on the same node at the same time). The results from each experiment are periodically transferred from the nodes to a repository at a back-end server, while the MONROE scheduler also enforces data quotas to ensure fairness among users.

MONROE allows us to eliminate noise from measurements by providing a controlled environment in which to conduct repeatable and reproducible experiments. In contrast to crowdsourced tools, which cannot produce datasets suitable for performance characterization of MNOs due to the amount of noise in their results or because of app permission requirements [16], MONROE provides a clean dataset collected from identical devices that require no maintenance on the part of the end user.

All software components used in the platform are open source and available online at https://github.com/MONROE-PROJECT/.

3.2 MONROE-Nettest Client

Client Core

We choose RMBT [26] as the codebase for our implementation since it is used by a number of NRAs in Europe for their crowdsourced measurement applications (see Section 1). However, the current RMBT core is in Java, as this is the primary programming language for Android smartphones. Our goal is to provide a lightweight implementation which can also run on devices with relatively low resources. Using a lower-level language such as C enables this, while at the same time allowing direct access to the socket functions provided by the standard library. C also enables cross-platform compatibility, which allows our tool to be adapted to other testbeds with minimal effort. Our client core is open source, and can also be used as a native library in different tools or mobile applications (https://github.com/lwimmer/rmbt-client).

Fig. 1: Message Sequence Charts (MSCs) for the different Nettest phases: (a) pre-test DL, (b) ping, (c) DL, (d) pre-test UL, (e) UL.

We follow the RMBT measurement flow. Figure 1 illustrates the MSCs for the different measurement phases. In the pre-test DL phase, for each flow, the client requests data in the form of chunks that double in size with each iteration. In the ping phase, the client sends small TCP packets (ASCII PING), to which the server replies with an ASCII PONG. The number of pings is configurable. In the DL phase, for each flow, the client continuously requests and the server continuously sends data streams consisting of fixed-size packets (chunks). The pre-test UL phase is similar to the pre-test DL phase, except that the client sends chunks that double in size with each iteration. In the UL phase, for each flow, the client continuously sends data and the server continuously receives. For each chunk received, the server sends a timestamp indicating when it received the chunk.
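To make the ping phase concrete, the following minimal C sketch shows how a client could measure the payload round-trip time of one ASCII PING/PONG exchange over an already connected TCP socket. The message strings, error handling, and function names are illustrative assumptions rather than the exact RMBT protocol implementation or the MONROE-Nettest code.

```c
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Illustrative sketch: measure the round-trip time of one ASCII
 * PING/PONG exchange on an already connected TCP socket `fd`.
 * Returns the RTT in microseconds, or -1.0 on error. */
static double ping_once(int fd)
{
    const char ping[] = "PING\n";
    char buf[16];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);

    if (send(fd, ping, sizeof(ping) - 1, 0) != (ssize_t)(sizeof(ping) - 1))
        return -1.0;

    ssize_t n = recv(fd, buf, sizeof(buf) - 1, 0);   /* expect "PONG\n" */
    if (n <= 0)
        return -1.0;
    buf[n] = '\0';

    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (strncmp(buf, "PONG", 4) != 0)
        return -1.0;

    return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
}

/* The configurable number of pings would then simply drive a loop; a real
 * client would collect all samples and aggregate them (e.g. take the median). */
static double run_ping_phase(int fd, int num_pings)
{
    double last = -1.0;
    for (int i = 0; i < num_pings; i++)
        last = ping_once(fd);
    return last;
}
```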

In addition to the original RMBT implementation, MONROE-Nettest samples a high-resolution time series of data rate (per chunk). It also collects TCP-related information for each socket, such as retransmissions, Round-Trip Time (RTT), slow-start threshold, and window sizes, using the TCP_INFO socket option of the Linux kernel (the sampling granularity can be specified as input).
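As an illustration of how such TCP state can be sampled on Linux, the sketch below reads the TCP_INFO socket option for one connected socket and prints a few of the fields mentioned above (RTT, congestion window, slow-start threshold, retransmissions). The field selection and output format are assumptions for this sketch and not the exact MONROE-Nettest code; the periodic sampling and logging around it are left out, and struct tcp_info field availability can vary slightly across kernel versions.

```c
#include <stdio.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Illustrative sketch: sample kernel TCP state for one connected socket.
 * A measurement tool would call this periodically (at a configurable
 * granularity) for every flow and log the values as a time series. */
static int sample_tcp_info(int fd)
{
    struct tcp_info ti;
    socklen_t len = sizeof(ti);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) != 0)
        return -1;

    printf("rtt=%u us rttvar=%u us cwnd=%u ssthresh=%u retrans=%u total_retrans=%u\n",
           ti.tcpi_rtt, ti.tcpi_rttvar,
           ti.tcpi_snd_cwnd, ti.tcpi_snd_ssthresh,
           ti.tcpi_retrans, ti.tcpi_total_retrans);
    return 0;
}
```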

Configurable parameters of the client include the number of flows for DL and UL, the measurement durations for DL, UL, and the pre-tests, and the measurement server. The full list of configuration parameters and default values can be found in the experiment repository (https://github.com/MONROE-PROJECT/Experiments/tree/master/experiments/nettest).

For calculating the data rate, the client uses an aggregation of all flows, with a granularity of one data chunk (the chunk size is also a configurable parameter). Let $n$ be the number of TCP flows used for the measurement and $F$ be the set of these flows. All transmissions start at the same time, which is denoted as relative time $t = 0$. For each TCP flow $k \in F$, the client records the relative time $t^{(k)}_j$ and the total amount of data $b^{(k)}_j$ received in Bytes on this flow (per chunk), from time 0 to $t^{(k)}_j$, for successive values of $j$, starting with $j = 1$ for the first chunk received. For each TCP flow, the time series begins with $t^{(k)}_0 = 0$ and $b^{(k)}_0 = 0$, where $m_k$ is the number of pairs which have been recorded for flow $k$.

$t_{\mathrm{end}} = \min_{k \in F} t^{(k)}_{m_k}$   (1)

$j_k(t) = \min\{\, j \in \{1, \dots, m_k\} : t^{(k)}_j \geq t \,\}$   (2)

with $j_k(t)$ being the index of the chunk received on flow $k$ at or right after $t$. Then the amount of data received over TCP flow $k$ from time 0 to time $t$ is approximately

$b^{(k)}(t) \approx b^{(k)}_{j_k(t)-1} + \left( b^{(k)}_{j_k(t)} - b^{(k)}_{j_k(t)-1} \right) \dfrac{t - t^{(k)}_{j_k(t)-1}}{t^{(k)}_{j_k(t)} - t^{(k)}_{j_k(t)-1}}$   (3)

The data rate for all TCP flows combined is given by Eq. 4, where $R(t_{\mathrm{end}})$ is used as the final reported data rate.

$R(t) = \dfrac{1}{t} \sum_{k \in F} b^{(k)}(t)$   (4)

One remark here is that MONROE-Nettest can provide both the application-layer data rate (e.g., including SSL overhead if enabled) and the transport capacity. The former is provided by the Nettest core using Equation 4 for each measurement, while the latter can be calculated from the detailed TCP_INFO output. Another important remark, in connection with the slow-start phase of TCP, is that our current implementation includes this phase in the calculation. However, some of the existing tools are known to cut out TCP slow start, which yields a more optimistic data rate estimate.
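As a concrete illustration of Equations 1-4, the following C sketch computes the combined data rate from per-flow time series of (time, cumulative bytes) samples. The data layout and function names are assumptions made for this sketch, not the actual MONROE-Nettest client core, which performs the same aggregation per chunk.

```c
#include <stddef.h>

/* One recorded sample: cumulative bytes received on a flow up to time t. */
struct sample {
    double t;        /* relative time in seconds, sample 0 is (0, 0) */
    double bytes;    /* total bytes received on this flow from time 0 to t */
};

struct flow {
    const struct sample *s;  /* samples s[0..m], with s[0] = {0, 0} */
    size_t m;                /* index of the last sample (m_k), assumed >= 1 */
};

/* Eq. (3): bytes received on one flow from time 0 to time t,
 * linearly interpolated between the surrounding samples. */
static double bytes_until(const struct flow *f, double t)
{
    size_t j = 1;
    while (j < f->m && f->s[j].t < t)   /* Eq. (2): first sample at or after t */
        j++;
    const struct sample *a = &f->s[j - 1], *b = &f->s[j];
    if (b->t == a->t)
        return b->bytes;
    return a->bytes + (b->bytes - a->bytes) * (t - a->t) / (b->t - a->t);
}

/* Eq. (4) evaluated at t_end from Eq. (1): aggregate data rate in bytes/s. */
static double combined_rate(const struct flow *flows, size_t n_flows)
{
    double t_end = flows[0].s[flows[0].m].t;
    for (size_t k = 1; k < n_flows; k++)          /* Eq. (1): earliest last sample */
        if (flows[k].s[flows[k].m].t < t_end)
            t_end = flows[k].s[flows[k].m].t;

    double total = 0.0;
    for (size_t k = 0; k < n_flows; k++)
        total += bytes_until(&flows[k], t_end);   /* Eq. (3) per flow */

    return t_end > 0.0 ? total / t_end : 0.0;     /* Eq. (4) */
}
```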

Experiment Container

MONROE-Nettest is a wrapper around the client core described above, making it available for testbed experimentation. It runs as a Docker container and combines a number of functionalities. First, it verifies that metadata information is available (if the experiment is run on MONROE nodes), then it runs a traceroute against the selected measurement server, after which it runs the client core.

The MONROE-Nettest container produces four output files. summary.json presents the result fields, such as the calculated DL and UL data rates and the median TCP payload RTT, as well as configuration-related fields such as the test identifier, basic input parameters, main timestamps, status, server and connection information, and metadata. flows.json includes the raw time series for each TCP flow; the flows are sustained through all stages (initialization, pre-test DL, ping, DL, pre-test UL, UL). stats.json includes samples of the TCP_INFO socket option for each TCP flow, at the given granularity. traceroute.json includes the results of the traceroute towards the selected measurement server. If the per-hop Autonomous System Number (ASN) is not available from the traceroute, this information is added via a lookup.

The advantage of running MONROE-Nettest over running the client core directly is the multi-config option, which enables sets of experiments with different configuration parameters to be executed at once.

The MONROE-Nettest container can be run within or outside the scope of MONROE.

Scheduler Template

As mentioned in Section 3.1, MONROE provides a scheduler with a web interface for external experimenters (https://www.monroe-system.eu/); instructions on how to use the scheduler are available to certified users in the user manual. A template is provided to run MONROE-Nettest from this web interface with a single click, using the default parameters. The configuration parameters can be modified at will.

3.3 MONROE-Nettest Server

In order to keep compatibility with RMBT, we use the server code from the open-source Open-RMBT project [27]. Our only change is to disable the token check, so that the measurement server can be used without a prior token/secret exchange. Therefore, the MONROE-Nettest container can be run against any existing RMBT server whose token check is disabled.

The MONROE testbed readily provides servers in Germany, Norway, Spain, and Sweden. Furthermore, our source code is open, so that any institution can host a MONROE-Nettest server at its own premises; instructions on how to set up a MONROE-Nettest server can be found in the experiment repository (see Section 3.2).

4 Measurement Results and Evaluation

In this section, we focus on the DL data rate feature of MONROE-Nettest and demonstrate the capabilities of our tool by quantifying the impact of different parameters on DL data rate for operational MBB networks.

ID | Batches | Operator   | Clients      | Servers    | Flows            | Duration
1  |    –    | 3 NO, 3 SE | 10 NO, 10 SE | 1 NO, 1 SE | 1, 3, 5, 7       |    –
2  |    –    | 3 NO, 3 SE | 2 NO, 2 SE   | 1 NO, 1 SE | 1, 3, 4, 5, 7, 9 |    –
3  |    –    | 2 NO       | 1 NO         | 1 SE, 1 DE | 3, 5, 7, 9       |    –
TABLE I: MONROE-Nettest measurement campaigns.

4.1 Measurement Setup

For our measurement campaign, we leverage stationary MONROE nodes and opt out of using mobile MONROE nodes, in order to eliminate any mobility-related impact on our results. Considering that mobile subscriptions have limited data quotas, and that running active measurements under multiple configurations for a relatively long period of time consumes large data volumes, we focus on Scandinavian countries, where SIM quotas are relatively high. More specifically, we use stationary nodes distributed in Oslo, Norway, and in Karlstad, Sweden. We measure a total of 6 different commercial networks in 4G. We run measurements against servers in Norway (Oslo), Sweden (Karlstad), and Germany (Falkenstein). Table I lists our measurement campaigns. Campaign 1 focuses on a smaller number of flows and servers, with extended vantage points to eliminate any location-specific artifacts in the results, whereas Campaigns 2 and 3 jointly provide a deeper analysis of the number of flows and server location, with fewer vantage points, all using a fixed measurement duration. Overall, our campaigns comprise a large number of measurement batches, corresponding to an even larger number of individual experiment runs.

4.2 Number of Flows and Measurement Duration

First, we investigate the effect of the number of flows and the measurement duration on the reported DL data rate. Figure 2 shows the reported median DL data rate vs. measurement duration for different numbers of flows, from Campaign 1. For each run, we first compute the reported data rate $R(t)$, using the data rate calculation described in Section 3.2, for every duration $t$ within the measurement window. We then use the median of these values, per $t$, to plot the curves in the figure. We observe a general trend of increasing reported data rate with an increased number of flows (Figure 2(a)). The trend also persists on a smaller scale, when we only consider nodes from a single operator measuring against a server in the same country (Figures 2(b) and 2(c)). We further observe that the maximum supported DL data rate of a network (as an artifact of operator coverage and provisioning) impacts the spread between different numbers of flows.

Fig. 2: Effect of number of flows and measurement duration: (a) Campaign 1, overall; (b) Campaign 1, op1-SE; (c) Campaign 1, op2-SE.

Next, we try to quantify these effects using Campaign 2, which includes six different numbers of flows: 1, 3, 4, 5, 7, and 9. We evaluate the similarity of time series using the Euclidean distance, which has been shown to be applicable in this context [28]. Table II presents the percentage of the median data rate at the end of the measurement (assumed to be the saturation data rate) that the curves capture at 2, 4, 6, 8, and 10 seconds. We observe that with a higher number of flows, the measurement reaches more than 92% of the saturation data rate within a few seconds, whereas with fewer flows it takes considerably longer to reach the same percentage. Note that there is a trade-off between the accuracy of the estimate, which increases with the number of flows, and data consumption, which increases with the time spent by each flow in a “saturated” state. Therefore, to avoid consuming unnecessary data quota while accurately capturing the data rate, the number of flows and the test duration need to be optimized. The optimal values of these parameters, however, may vary according to operator, technology, and coverage.

% of Saturation Data Rate
Flows | 2 s | 4 s | 6 s | 8 s | 10 s
1     |  –  |  –  |  –  |  –  |  –
3     |  –  |  –  |  –  |  –  |  –
4     |  –  |  –  |  –  |  –  |  –
5     |  –  |  –  |  –  |  –  |  –
7     |  –  |  –  |  –  |  –  |  –
9     |  –  |  –  |  –  |  –  |  –
TABLE II: Quantifying the effect of number of flows and measurement duration.

We further observe daily data rate patterns for some operators, indicating that this spread might not be constant throughout the day. Figure 3 shows the scatter plot of data rate vs. relative time for a Norwegian operator (op4), for the initial hours of the campaign. There is a clear trend for all numbers of flows above one. This confirms that the optimal values for the number of flows and the test duration also need to take daily patterns into account.

Fig. 3: Daily patterns in reported data rate.

4.3 Server Location

Next, we investigate the effect of server location by comparing measurements against servers in different countries, using Campaigns 2 and 3. Table III presents the Euclidean distance between the time series of reported median DL data rate measured by clients in Norway using a Norwegian SIM (op4) against servers in different countries (Germany vs. Sweden, and Norway vs. Sweden), for different numbers of flows (3, 5, 7, and 9). The values are normalized by the median value of the Sweden curve (results against the server in Sweden).

Flows | Germany – Sweden | Norway – Sweden
3     |        –         |        –
5     |        –         |        –
7     |        –         |        –
9     |        –         |        –
TABLE III: Euclidean distance between time series.
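To illustrate the time-series comparison used in this section, the following C sketch computes the Euclidean distance between two equally sampled data-rate series and normalizes it by the median of a reference series (here, the Sweden curve). The exact sampling, alignment, and normalization details of our analysis are not fully specified above, so this is a sketch under those assumptions.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Euclidean distance between two equally long, equally sampled series. */
static double euclidean_distance(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sqrt(sum);
}

static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x, b = *(const double *)y;
    return (a > b) - (a < b);
}

/* Median of a series, used to normalize by the reference curve. */
static double median(const double *v, size_t n)
{
    double *tmp = malloc(n * sizeof(*tmp));
    memcpy(tmp, v, n * sizeof(*tmp));
    qsort(tmp, n, sizeof(*tmp), cmp_double);
    double m = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return m;
}

/* Distance between a test series and a reference series, normalized by
 * the median of the reference series (e.g. the Sweden curve). */
static double normalized_distance(const double *test, const double *ref, size_t n)
{
    double m = median(ref, n);
    return m > 0.0 ? euclidean_distance(test, ref, n) / m : 0.0;
}
```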

We see a general trend of decreasing data rate with increasing (geographical) distance from the server. Clients in Norway record the highest data rates against the measurement server in Norway, followed by Sweden and then Germany (preserving the ordering by number of flows established before). This confirms that server location has a significant effect on measurement results, and that having a measurement server as close to the vantage point as possible is of great importance in order to capture the client-experienced achievable data rate.

5 Conclusions and Future Work

As data rate becomes a critical selling point in the MBB market, the controversy and interest surrounding it also grow. A plethora of entities implement measurement tools targeted at end-users; these tools are essentially alternatives to one another, but cannot be reliably and repeatably compared.

In this study, we focus on TCP-based active measurements and provide a configurable tool, MONROE-Nettest, within an EaaS framework on the MONROE platform. MONROE-Nettest is designed for compatibility with the largest portion of the available applications and is able to mimic different measurement methodologies. We provide the first testbed integration of a completely open-source and configurable speed measurement tool, with rich metadata and context information, for repeatable large-scale measurements in operational MBB networks. Such a tool, we believe, is instrumental for the standardization of broadband measurement methods.

Our service can be used to baseline measurements which have been conducted with different sets of parameters. This combines the benefits of crowdsourcing with controlled experimentation, by allowing existing datasets to be jointly used. To the best of our knowledge, our work is also the first attempt to quantify the effects of different parameters such as number of TCP flows, measurement duration, and server location on DL data rate measurements in operational MBB networks, where detailed results are also provided as open data.

As future work, we plan to develop algorithms that can reduce the data volume consumption while still providing an accurate speed estimate, by adaptively selecting the measurement duration according to network properties. We also plan to study the time series of data rate jointly with TCP metrics and traceroute results, in order to localize bottlenecks in the end-to-end path, and to differentiate whether the observed data rate is limited by the operator’s network or within the Internet.

6 Live Demo

A live demonstration of the MONROE-Nettest will be provided during the workshop, where we will run our tool with different configuration parameters in operational MBB networks and illustrate the results in near real-time.

References

  • [1] “Cisco visual networking index: Global mobile data traffic forecast update, 2016 - 2021.” Cisco Systems Inc., Tech. Rep., 2017.
  • [2] http://www.speedtest.net/awards.
  • [3] https://opensignal.com/reports/.
  • [4] https://in.news.yahoo.com/airtel-vs-reliance-jio-vs-105333550.html.
  • [5] https://www.fcc.gov/general/measuring-broadband-america.
  • [6] https://www.netztest.at/.
  • [7] C. Midoglu et al., “Opportunities and challenges of using crowdsourced measurements for mobile network benchmarking - a case study on RTR Open Data,” SAI Computing, 2016.
  • [8] U. Goel et al., “Survey of end-to-end mobile network measurement testbeds, tools, and services,” IEEE Communication Surveys & Tutorials, vol. 18, no. 1, pp. 105–123, 2016.
  • [9] A. S. Khatouni et al., “Speedtest-like measurements in 3G/4G networks: the MONROE experience,” ITC, 2017.
  • [10] S. Ferlin et al., “Measuring the QoS characteristics of operational 3G mobile broadband networks,” 28th International Conference on Advanced Information Networking and Applications Workshops, 2014.
  • [11] M. K. Bideh et al., “Tada: An active measurement tool for automatic detection of AQM,” VALUETOOLS, 2015.
  • [12] S. Bauer et al., “Understanding broadband speed measurements,” TPRC, 2010.
  • [13] I. Canadi et al., “Revisiting broadband performance,” IMC, 2012.
  • [14] https://www.measurementlab.net/tests/ndt/.
  • [15] A. Nikravesh et al., “Mobile network performance from user devices: A longitudinal, multidimensional analysis,” PAM, 2014.
  • [16] C. Kreibich et al., “Netalyzr: Illuminating the edge network,” IMC, 2010.
  • [17] ——, “Experiences from netalyzr with engaging users in end-system measurement,” SIGCOMM W-MUST, 2011.
  • [18] B. Chun et al., “PlanetLab: an overlay testbed for broad-coverage services,” Computer Communication Review, 2003.
  • [19] M. Berman et al., “GENI: A federated testbed for innovative network experiments,” Computer Networks, vol. 61, pp. 5–23, March 2014.
  • [20] https://www.fed4fire.eu/.
  • [21] http://nitlab.inf.uth.gr/nitlab/nitos.
  • [22] http://wilab2.ilabt.iminds.be/.
  • [23] Ö. Alay et al., “Measuring and assessing mobile broadband networks with MONROE,” IEEE WoWMoM, pp. 1–3, 2016.
  • [24] ——, “Experience: An open platform for experimentation with commercial mobile broadband networks,” MobiCom ’17, 2017.
  • [25] M. Peon-Quiros et al., “Results from running an experiment as a service platform for mobile networks,” MobiCom WiNTECH, 2017.
  • [26] https://www.netztest.at/doc/.
  • [27] https://github.com/alladin-it/open-rmbt/tree/master/rmbtserver.
  • [28] J. Serra et al., “An empirical evaluation of similarity measures for time series classification,” Knowledge-Based Systems, vol. 67, pp. 305–314, September 2014.