Challenges in Net Neutrality Violation Detection: A Case Study of Wehe Tool

by Vinod S. Khandkar, et al.
IIT Bombay

The debate on net neutrality and events pointing towards its possible violations have led to the development of tools to detect deliberate traffic discrimination on the Internet. Given the complex nature of the Internet, neutrality violations are not easy to detect, and the tools developed so far suffer from various limitations. In this paper, we study the challenges in detecting such violations and discuss possible approaches to mitigate them. As a case study, we focus on the Wehe tool, discuss its limitations, and propose the aspects that need to be strengthened. Wehe is the most recent tool to detect neutrality violations. Despite Wehe's vast utility and possible influence over policy decisions, its mechanisms have not yet been fully validated by researchers other than the original tool developers. We seek to fill this gap by conducting a thorough, in-depth validation of Wehe. Our validation uses the Wehe App, a client-server setup mimicking Wehe's behavior, and theoretical arguments. We validated the Wehe App for its methodology, traffic discrimination detection, and operational environments. We found that the critical weaknesses of the Wehe App stem from its design choices: using port number 80, overlooking the effect of background traffic, and directly comparing performance.




I Introduction

Net neutrality is a guiding principle promoting the “equal” treatment of all packets over the Internet. In practice, implementing this principle requires relaxations, such as “reasonable traffic management”. Traffic management can benefit all services by allowing ISPs to operate the network efficiently as a whole. Preferential treatment or throttling (traffic differentiation), in contrast, does not necessarily improve the network’s overall efficiency. ISPs may apply such traffic differentiation (TD) to a specific service, user, ISP, or any other traffic group on the Internet without making any public declaration. This gives rise to a need for tools that can detect such malicious activities over the Internet.

Traffic differentiation detection involves the interplay of many elements. With active probing, the tool needs to generate probing traffic that elicits the expected network responses. The network responses are a crucial input, as they govern the tool's TD detection capability. The TD detection algorithm also needs to account for specific real-world scenarios, such as the time-varying effect of background traffic on the probing traffic's performance. Finally, the operational environment plays a role in the successful deployment of any tool; the network configuration, e.g., a NAT-enabled network, is one such important aspect. Moreover, measurement setups based on passive monitoring need to normalize the effect of the factors mentioned above, as they do not have direct control over them.

These components and operations are interdependent. Their design choices affect the user-client or server (if applicable), alter the expected network response, and consequently affect the TD detection algorithm. Hence, researchers face challenges ranging from crafting Internet traffic to conditioning the measured network response to suit their detection algorithm, whether they are developing a new TD detection tool or validating or incorporating an existing one. We seek to study the various challenges associated with designing these interdependent components or operations for reliable TD detection.

Developers of traffic differentiation detection tools always validate their own tools. Moreover, the proposal for a new tool sometimes includes the validation of existing tools. For example, [16] validates the detection threshold of the Glasnost tool’s [5] traffic differentiation detection algorithm. Such verification is partial because the emphasis lies on describing the proposed tool rather than validating other tools. Moreover, developers’ validation becomes obsolete in many cases due to advances in underlying technologies such as networking. We seek to demonstrate how our study can be applied to conduct such validations of TD detection tools.

We take the “Wehe” tool as a case study. The user database of the Wehe tool consists of 126,249 users across 2,735 ISPs in 183 countries/regions, generating 1,045,413 crowdsourced measurements. A European national telecom regulator, the US FTC and FCC, US Senators, and numerous US state legislators have used the Wehe tool’s findings. Despite the Wehe tool’s vast utility and possible influence over policy decisions, its mechanisms have not yet been fully validated by anyone other than the original tool developers. This paper investigates the validity of the Wehe tool’s traffic differentiation detection, focusing on its methodology, end-to-end setup, and TD detection mechanism.

The primary contributions of this paper are:

  1. We study the various challenges associated with traffic differentiation detection. We categorize these challenges based on their source, e.g., protocol and operational environment.

  2. We take the “Wehe” tool as a case study and demonstrate the categorized analysis or validation of such tools. The previously identified challenges serve as a lens that gives more insight into the operations of these tools.

  3. We present the validation results generated over a validation setup that uses a customized client-server and the publicly available user-client of the Wehe tool. These results surface various issues with the tool.

  4. We also provide solutions to these issues wherever possible.

I-A Related work

Measurement setups often target a specific aspect of the underlying system, and each of these aspects poses challenges to the measurement setup. [18] divides the targeted system into different traffic scenarios and then discusses the challenges in measuring the parameters associated with those use cases. [3] targets the whole Internet of Things (IoT) system for measurement. It divides the system into smaller subsystems, operations, and associated protocols. It then identifies the challenges associated with designing an individual subsystem or operation, measuring different parameters linked to traffic scenarios/use cases, and choosing protocols.

The literature contains the validation of many tools and systems in the network measurement field. The validation process described in [11] divides the entire process into system verification and network performance verification. It validates QoS through end-node traffic analysis of different traffic streams while varying stream parameters such as “ToS” or varying the network load, and captures the results in performance metrics such as latency and jitter. The other validation method, described in [17], divides the tools/systems into categories based on their intended use, e.g., replay generators and specific-scenario generators. It defines a separate verification procedure for each type of generator and captures the validation results in metrics that are also categorized by generated traffic characteristics, e.g., packet, flow, and QoS.

I-B Background

This section describes mechanisms used by TD detection measurement setups for various operations and their importance in TD detection. It also covers a brief description of various existing tools.

I-B1 Existing TD detection tools

Many tools have been developed so far for traffic differentiation detection. While some tools focus on detecting anomalies in users’ Internet traffic, others target traffic in backbone ISPs. Two techniques are commonly used for detecting TD in users’ Internet traffic. The first passively monitors traffic [25]. In this case, the TD result is not immediately available to the user; instead, the tool provides an aggregated result of traffic differentiation over the given ISP. The second uses specially crafted probing traffic, called active probes, and analyses the network response to the probing traffic to detect any anomaly. [6, 10, 22, 28, 14, 15, 19, 12] describe measurement setups based on such active probing. They use traffic parameters such as packet loss, latency, and packet sequence or pattern to identify network operation characteristics or detect anomalies. Some tools use multiple types of probing traffic, called active differential probes. While one traffic type undergoes standard network middle-box processing, the other is supposed to evade any traffic differentiation. Typically, one type resembles the original application traffic and the other serves as reference or control traffic. The tool compares the network response to the original application traffic with that to the reference or control traffic. [5] and [16] are examples of such probing techniques.

I-B2 HTTP based client-server communication

A client-server system consists of two devices that communicate using a standard protocol over a dedicated logical link. The client establishes a connection using a socket that is uniquely identified by an IP address and a port number. Many parameters are associated with the socket. The keep-alive parameter is one such parameter; it defines the time duration for which the socket can be idle. The socket provides APIs to read and write data.
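As a rough illustration of the socket parameters mentioned above, the following Python sketch enables keep-alive on a TCP socket without connecting it anywhere. The helper name `make_keepalive_socket` and the 30-second idle value are assumptions made for this example; the idle-time option is platform specific, so it is applied only where the constant is available.

```python
import socket

def make_keepalive_socket(idle_seconds=60):
    """Create a TCP socket with keep-alive probing enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the OS to probe the peer when the connection stays idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE (idle time before the first probe) exists only on some
    # platforms, e.g., Linux, so it is set only when the constant is defined.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock

sock = make_keepalive_socket(idle_seconds=30)
# getsockopt returns a non-zero value when the option is enabled.
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

A proxy or middle-box that terminates the connection may apply its own idle timeout, so the value configured here is not guaranteed to hold end to end.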

An HTTP-based client-server system uses HTTP [20] or HTTPS [23] as its communication protocol. HTTPS is a secured version of the HTTP application layer protocol that uses TLS (Transport Layer Security) to provide channel security. The HTTP protocol abstracts the underlying networking mechanism. Hence, the end-to-end connection appears to run over a single dedicated communication channel even though the actual communication uses multiple dynamically allocated intermediate network nodes. The HTTP protocol provides commands such as “GET” and “POST” for the client and server to communicate. Fig. 1 shows the typical HTTP command-response sequence.

Fig. 1: HTTP protocol message sequence

The HTTP request has the syntax “GET resource HTTP/1.1”. The resource field contains the resource’s public address, or the file name and its path on the requested server. The user-client accessing the specific Internet resource supplies this information. The HTTP request message is usually accompanied by a header that contains the “hostname”, i.e., the server name, and the “User-agent” that identifies the entity requesting the resource. It also includes other request-specific information such as “Language” and “Coding.” “HTTP/1.1 200 OK” is an example of a successful HTTP response. It is also accompanied by header information that usually contains the server identification and information about the requested resource, such as “Content-Length”.
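The request and response structure described above can be sketched in a few lines of Python. This is only an illustration: the resource path, host name, and user-agent string are made-up placeholders, and the parser handles just the status line and simple headers.

```python
def build_get_request(resource, host, user_agent="example-client/0.1"):
    """Return the text of an HTTP/1.1 GET request with a minimal header."""
    lines = [
        f"GET {resource} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Accept-Language: en",
        "",  # the empty line terminates the header block
        "",
    ]
    return "\r\n".join(lines)

def parse_response_head(raw):
    """Split a response head into (status_code, reason, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers

# Placeholder resource and host, used only for illustration.
request = build_get_request("/videos/sample.mp4", "media.example.org")
response = "HTTP/1.1 200 OK\r\nServer: example\r\nContent-Length: 1024\r\n\r\n"
status, reason, headers = parse_response_head(response)
```

Because these fields travel as plaintext in HTTP, a middle-box can read the host name and resource path directly, which is what makes content-based classification of such traffic possible.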

I-B3 Transport layer security

Transport layer security (TLS) [7] is an Internet protocol that provides channel security for transport layer communication. It establishes a secure tunnel between two machines as soon as they create the transport layer logical channel between them. This procedure is called a ‘TLS handshake.’ Even though there are advanced variations of the TLS handshake sequence, the typical handshake is as shown in Fig. 2. The TLS versions supported on both sides and the server’s security certificate are crucial for TLS. Once established, the secure channel exchanges data in an encrypted format that network middle-boxes cannot easily decrypt.

Fig. 2: TLS handshake sequence

I-B4 NATs and Proxies

NAT, or Network Address Translator [24], is a method of mapping IP addresses defined in an unregistered private domain to a public domain with globally unique registered addresses. Such translation is required either because of an unwillingness to expose internal IP addresses for privacy reasons or to extend the reach of a limited set of public IP addresses. In NAT-enabled systems (as shown in Fig. 3), dynamic address mapping means that any public IP address from the pool may represent a device within the private network. NAT devices can be unidirectional or bidirectional. Unidirectional NATs permit session establishment in one direction only, i.e., outbound from the private network.

Fig. 3: NAT enabled network
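A toy model can make the address mapping concrete. The sketch below is a simplification that assumes a single public address with dynamic port mapping rather than an address pool, and the class name `SimpleNat` and all addresses are illustrative. It shows two private hosts appearing behind the same public IP address, and unsolicited inbound traffic being rejected as in a unidirectional NAT.

```python
class SimpleNat:
    """Toy outbound-only NAT with dynamic port mapping (illustrative only)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Map a private endpoint to the public endpoint seen by the server."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Return the private endpoint, or None if no outbound session exists."""
        return self.inbound.get(public_port)

nat = SimpleNat("203.0.113.5")
seen_a = nat.translate_outbound("10.0.0.2", 5000)
seen_b = nat.translate_outbound("10.0.0.3", 5000)
```

Here `seen_a` and `seen_b` share the public IP address and differ only in port, so a server that identifies clients by IP address alone cannot tell the two devices apart.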

A proxy is a device that connects either multiple users (forward proxy) or multiple servers (reverse proxy) to the Internet using its single public IP address. “Transparent” proxies exchange data between client and server transparently, i.e., without affecting the end-to-end communication. Other proxies exchange data using two distinct connections, one towards the client and another towards the server. This case requires special attention to transport layer security (TLS) operations, as the proxy negotiates the TLS channel setup on behalf of the user-client.

I-B5 Traffic replay mechanisms

A traffic replay mechanism mimics the client-side and server-side behavior for a given application data exchange and the underlying protocol. Many traffic replay tools are available. Tcpreplay [26] is one such tool; it mimics the transport layer behavior for a given stream of transport layer packets. Another example of a layer-specific replay tool is FlowReplay, which runs at the application layer. Layer-specific replay tools are often protocol dependent. The RolePlayer technique proposed in [4] can replay application layer data in a protocol-independent manner. The choice of replay layer (refer to Fig. 4) is crucial, as it affects data collection on the receiver side as well as the expected network response. TCP layer replay complicates traffic analysis, as it requires special permission to collect traffic data.

Fig. 4: Data replay techniques

I-B6 Internet services’ performance comparison

The end-to-end connection between client and server for Internet services is not dedicated. The best-effort nature of IP layer packet forwarding can cause packets from the same traffic stream to take different paths. The performance fluctuations due to such routing may average out over a large data transfer. For services whose servers are located in different geographic regions, the difference in physical paths induces varying congestion levels, and comparing the performance of streams that experience different congestion is not reliable. Another factor that affects direct performance comparison is the traffic management policy applied by the network, which depends directly on the network device’s traffic stream classification mechanism. Often, servers limit or vary the transmission speed to utilize their network resources better while matching the service’s underlying speed requirement. Dynamic Adaptive Streaming over HTTP (DASH) is one such technique. Because this behavior varies from service to service, a direct comparison of services with different server transmission speeds does not support reliable conclusions. Fig. 5 shows the effect of the variations mentioned above on the performance of Internet services.

Fig. 5: Variation in performance of services while traversing the Internet

I-B7 What is Validation?

The validation of software tools is common. Its need is recognized, and standardization bodies such as ISO and IEEE have formalized its process.

  • ISO 17025E [1]: Validation is the provision of objective evidence that a given item fulfills specified requirements, where the specified requirements are adequate for the intended use.

  • IEEE 1012-1998 [2]: The purpose of the Software Validation process is to provide objective evidence for whether the outcomes satisfy the specified requirements, solve the right problem, and satisfy the intended use and user needs in an operational environment.

The remainder of the paper is organized as follows. Sec. II describes the identified challenges in measurement setups for TD detection. Sec. III describes the Wehe tool and its mechanisms in the context of the identified challenges, and Sec. IV provides the validation results. Sec. V maps the validation results to the corresponding design choices and identified challenges. Sec. VI concludes the paper and discusses future work.

II Challenges in TD detection measurement setup development

In this paper, we target measurement setups for traffic differentiation detection. These measurement setups primarily consist of a probing traffic generator, a traffic data capturing system, and a TD detection engine. The remainder of this section describes the challenges in engineering each of these components.

II-A General system design

A TD detection system is either an end-to-end client-server system or a user-client-only system. A user-client-only system either treats intermediate network nodes as remote terminating nodes for its measurements or performs local measurements. Such systems target intermediate network nodes with precise probing data, such as the Time-to-live (TTL) value in the Internet Protocol (IP) header or other network management parameters. The chosen parameter allows the user-client to terminate the probing traffic flow at a specific remote network node. Even though the user-client probing data can achieve such precision in theory, network configurations often disrupt the intended behavior of intermediate nodes.

End-to-end client-server systems have more control over the communication between end-nodes. However, the degree of control depends on the communication protocol or data exchange layer. Systems exchanging data at the application layer using HTTP-like protocols have more control over data capture and content than systems operating at a lower layer with protocols such as the Transmission Control Protocol (TCP). Injecting data directly at a lower layer provides more control over the data rate, but it complicates the system design and the data capture for analysis. This is primarily due to the bookkeeping required for session-to-packet mapping and the operating system permissions required to perform such tasks; a user-client intended for general public use finds it challenging to acquire such permissions. Another challenge is incorporating third-party supporting software, as users are often unwilling to install such software on their systems.

II-B Probing traffic generation

Probing traffic is a traffic stream specially crafted for the intended tool. It can be a train of IP or TCP layer packets with customized headers, or legitimate application-layer traffic with customized data rates and associated mechanisms. In either case, defining a precise hypothesis based on the tool’s desired operation is crucial for traffic generation. A tool whose methodology relies on network management responses from intermediate nodes is poorly served by application-level traffic generators: such a generator may not have proper control over the required lower layer header information, or the application layer’s data rate may not be respected due to additional processing at the lower layer. Another pitfall is using data content or rates that are not aligned with the underlying methodology, such as the wrong traffic stream identifiers in the data.

II-C Network responses

The network response to the probing traffic is a fundamental input to the TD detection mechanism. The type of network response depends on the tool's underlying methodology. Even once the methodology is fixed, the expected response changes with the network configuration. Often, network nodes do not respond as expected to network management messages or do not recognize the probing traffic in the intended manner. This happens mainly because the associated Internet standards allow deviations from the typical response. It is also a result of proprietary network policies, over which Internet standards have no control. It is therefore challenging to define an expected network response or to design a system that always elicits it.

II-D Operational challenges

A tool that is sufficiently well tested in a lab environment can still face many issues in real-world scenarios. This is due to over-provisioning in the tool’s lab environment or a simplified view of real-world networks. A specific network configuration or an unreachable remote node can cause the tool’s implementation to fail. The advancement of networking technologies modifies inter-node connection mechanisms and the associated devices. The existence of some network devices is often ignored altogether. [27] covers the variation in middle-boxes. Note that tool implementations often overlook this aspect, as it is not part of their core methodology.

II-E TD Detection

The TD detection algorithm is the core engine of the measurement setup. Most of the time, it needs a specific type of input, derived from the observed performance, to operate properly. The average throughput curves of the probing traffic or the sequence of network management response packets are examples of such input. The network responses can produce glitches in the probing traffic performance. Input conditioning mechanisms, such as throughput bounds, are often used to filter out such glitches or irregularities. Another challenge arises when the traffic generation or data capturing mechanism fails to provide appropriate input to the detection algorithm, e.g., when data capturing does not complete.
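A minimal sketch of such input conditioning, assuming the captured log is a time-ordered list of (timestamp, size) pairs: it derives a per-window throughput curve and then discards samples outside fixed bounds. The function names, the one-second window, and the bound values are assumptions made for illustration, not taken from any specific tool.

```python
def windowed_throughput(packets, window=1.0):
    """Convert (timestamp_seconds, size_bytes) records into Mbit/s per window.

    Assumes the records are sorted by timestamp.
    """
    if not packets:
        return []
    start = packets[0][0]
    n_windows = int((packets[-1][0] - start) // window) + 1
    totals = [0] * n_windows
    for ts, size in packets:
        totals[int((ts - start) // window)] += size
    return [total * 8 / window / 1e6 for total in totals]

def apply_bounds(samples, lower, upper):
    """Keep only the throughput samples that lie within [lower, upper]."""
    return [s for s in samples if lower <= s <= upper]

# Hypothetical capture: a steady 2 Mbit/s stream with one 20 Mbit/s glitch.
packets = [(0.0, 125000), (0.5, 125000), (1.2, 250000),
           (2.1, 2500000), (3.0, 250000)]
samples = windowed_throughput(packets)
conditioned = apply_bounds(samples, lower=0.5, upper=10.0)
```

In this example the third window is removed as a glitch. Choosing the bounds is itself a design decision: bounds that are too tight can hide genuine differentiation, while loose bounds let glitches through.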

II-F Protocol specific challenges

Internet services follow a layered architecture, with specific protocols governing the behavior of each layer. While the “Internet Protocol” (IP) is the de facto standard for the network layer, many alternatives are available for the transport and application layers. These alternatives include the widely used application layer protocols “HTTP” and “HTTPS.” The application layer protocol changes how application data is represented on the Internet, e.g., “HTTP” traffic is plaintext, whereas “HTTPS” traffic is encrypted. The advent of the “Quick UDP Internet Connection” (QUIC) [8] protocol provides an alternative to the widely deployed TCP as a transport layer protocol. QUIC offers TCP-like properties over the User Datagram Protocol (UDP). While TLS provides data encryption services on top of TCP, QUIC has a built-in data encryption mechanism for HTTPS communication. Thus, the combination of application and transport layer protocols changes data generation and representation over the Internet. Internet services differ in selecting this combination, e.g., YouTube uses QUIC while Netflix uses the combination of TCP and TLS. The probing traffic generation and the resulting TD detection mechanism need to handle this service-dependent variation in the combination of protocols.

II-G Other challenges

Internet services employ various mechanisms to cope with fluctuations in available bandwidth and provide a seamless end-user experience. Dynamic adaptive streaming over HTTP (DASH) is one such technique; it modifies traffic characteristics such as speed or content characteristics such as coding rate. Each streaming service uses proprietary techniques tailored to its requirements. Measurement setups such as passive monitoring systems therefore face the challenge of normalizing the performance of various streaming services for the differences in their bandwidth adaptation techniques. Measurement setups employing active probing that mimics original service traffic tend to transmit a probing traffic stream that saturates the available bandwidth, similar to point-to-point (p2p) traffic. Such traffic streams may no longer be representative of the original service traffic.

Internet services use specific port numbers for communication, as per the port reservations defined in Internet standards [9], e.g., the ports reserved for HTTP traffic and for SSH (Secure Shell) traffic. Thus, the port number used to transmit probing data plays a vital role in traffic classification by network middle-boxes. Sending the correct data on the pre-assigned port number of a given service is a challenging task, as it requires a thorough understanding of network traffic classification on that port.

III Case Study: Wehe - TD detection tool for mobile environment

Wehe [16] is the first tool for detecting traffic differentiation over mobile networks (e.g., cellular and WiFi). It is available as an App on the Android and iOS platforms. The tool supports TD detection for many popular services such as Netflix and YouTube. The tool runs TD detection tests by coordinating with its server, called the “replay server”. The replay server keeps track of active user-clients and maps replay runs to the correct user’s service.

III-A Traffic generation

Wehe uses the “record-and-replay” method for generating probing traffic. During the replay phase, the user-client exchanges the probing traffic with the replay server as per the replay script. The replay script is derived from application-level network logs of the original service. It captures the application’s traffic behavior, including the port number, data sequence, and timing dependencies. Preserving timing is a crucial feature of Wehe’s approach: network devices are expected to use this information when no other means of classifying applications is available, e.g., for HTTPS encrypted data transfer with encrypted SNI. The Wehe tool uses two types of probing traffic streams. One stream is the same as the original application-level network trace, while the other differs substantially from it. In one approach, Wehe sends the second probing traffic stream over a VPN channel, using the meddle VPN [21] framework for data transfer and server-side packet capture. Another approach sends a bit-reversed version of the first traffic stream on the same channel. Currently, Wehe uses the latter approach due to its superior results.
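The idea of a control stream derived from the recorded trace can be sketched as follows. The sketch interprets “bit-reversed” as inverting every bit of the payload, which is an assumption on our part; the replay script format, a list of (delay, payload) pairs, and all names are likewise hypothetical and not Wehe's actual data structures.

```python
def invert_bits(payload: bytes) -> bytes:
    """Flip every bit so that no recognizable application keyword survives."""
    return bytes(b ^ 0xFF for b in payload)

def make_control_script(script):
    """Build the control replay: same timing and sizes, inverted payloads."""
    return [(delay, invert_bits(payload)) for delay, payload in script]

# Hypothetical replay script: (seconds to wait before sending, payload).
original_script = [
    (0.00, b"GET /watch HTTP/1.1\r\nHost: video.example.org\r\n\r\n"),
    (0.25, b"GET /segment1 HTTP/1.1\r\nHost: video.example.org\r\n\r\n"),
]
control_script = make_control_script(original_script)
```

The control stream keeps the packet sizes and inter-packet delays of the original, so any classifier that relies on payload content sees only unrecognizable bytes while the traffic shape stays the same.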

III-B Network response expectations

Wehe is a differential detector that compares the network responses to two types of traffic streams generated by the tool: the original replay and the control replay. The original replay uses the network traffic generated by the original application. The service-specific information present in the original replay allows network devices with DPI capability to identify and classify the service correctly, so the original replay’s performance over the Internet closely resembles that of the original application traffic on the same network. While the original replay is exposed to detection by network devices, the control replay, i.e., the traffic stream with bit-reversed data, is expected to be “not detectable” for classification. The control replay is thus expected to evade content-based, application-specific traffic differentiation. The performances of the two traffic streams (detectable and non-detectable) differ if network devices apply different traffic management or traffic differentiation to each stream based on content-based classification.

III-C TD detection scenario expectations

Wehe uses the throughput of the original replay and the control replay to detect TD; its detection algorithm compares the throughput performances of the two traffic streams. The methodology uses throughput as the comparison metric due to its sensitivity to bandwidth-limiting traffic shaping. However, the tool expects that the TD detection algorithm does not detect TD based on throughput for traffic streams with traffic rates below the shaping rate, the rationale being that the shaper cannot affect the performance of such an application stream. Often, both traffic streams are affected by other factors such as signal strength and congestion. The resulting bandwidth volatility creates irregularities in the received performance, which are reported to lead to incorrect differentiation detection. The tool performs multiple test replays to overcome the effect of bandwidth volatility.
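This section does not specify the statistical test used for the comparison. As one plausible way to compare two sets of throughput samples, the sketch below computes a two-sample Kolmogorov-Smirnov style statistic, i.e., the largest gap between the two empirical distributions. The decision threshold of 0.5 and the sample values are arbitrary assumptions for illustration only.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def looks_differentiated(original, control, threshold=0.5):
    """Flag the pair when the throughput distributions are far apart.

    The threshold is a hypothetical value, not a calibrated one.
    """
    return ks_statistic(original, control) > threshold

# Hypothetical throughput samples in Mbit/s.
original = [1.9, 2.0, 2.1, 2.0, 1.8]   # e.g., a stream capped near 2 Mbit/s
control = [4.8, 5.1, 5.0, 4.9, 5.2]    # the same replay without the cap
flagged = looks_differentiated(original, control)
```

A distribution-level comparison like this is sensitive to any factor that shifts one stream's throughput, including background load, so a large statistic alone does not prove deliberate differentiation.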

III-D Operational requirements

The Wehe server needs a side-channel for each client to associate it with precisely one app replay. This side-channel supplies information about replay runs to the server. Each user directly connected to the Wehe replay server is uniquely identifiable on the server side by its IP address, with side-channels mapping each replay to exactly one App.

The other operational requirement is that Wehe's client-server communication uses customized socket connections with specific keep-alive behavior. Sometimes, the use of translucent proxies by the user-client modifies this behavior, and the replay server handles such unexpected connections. Protocol-specific proxies, e.g., an HTTP proxy, route the user-client's connections to the server through themselves for specific port numbers, e.g., 80/443 for HTTP/HTTPS. Connections using other protocols, however, reach the server directly. Wehe's side-channels do not use HTTP/HTTPS connections, so the IP address seen for the same user differs between the side-channel and the replay runs. The Wehe server detects such connections and indicates them to the user-client with a special message, which triggers further communication using a customized header.

III-E Challenges of validating Wehe

The Wehe tool is straightforward to use for TD detection, but the requirements change when validating it. The validation process may need to launch only one type of replay for different services during one test, or may need to launch all replays in parallel. These requirements are unrelated to TD detection, Wehe’s primary goal, and are understandably not supported. Hence, validating Wehe’s operation in such scenarios needs a specific client-server setup. The challenge is to isolate the Wehe mechanism relevant to each scenario so that the resulting system still mimics Wehe’s actual behavior.

Wehe does not provide error/failure notifications in all scenarios. Instead, it prompts the user to reopen the App. As a result, the validation setup loses the vital feedback information regarding the error/failure induced by its validation scenario.

IV Validating Wehe

Our study focuses on validating the network responses to the replayed traffic streams, the TD detection scenarios, and operational feasibility in various network configurations. Operational feasibility is validated using the publicly available “Wehe” Android app from the Google Play Store, while the TD detection scenarios are validated using theoretical arguments. Validating the network responses requires bandwidth analysis of the received traffic stream, which in turn requires the network logs for the specific replay performed in the validation scenario. A replay performed on a device while multiple other streaming services run in parallel is one such scenario. The Wehe app does not immediately provide such network logs after the tests complete. So, we implemented a user-client and server that mimic the behavior of the Wehe tool.

Fig. 6: Wehe app validation setup

Fig. 6 shows our client-server setup for validating the Wehe tool. Our user-client uses the same HTTP GET commands as the Wehe tool, and our server mimics the replay server’s behavior when responding to user-client requests. Moreover, our setup can perform multiple replays in parallel, which the validation of specific scenarios requires. Our validation setup does not need administrative channels and their overheads, e.g., side-channels, and our server only ever needs to support a single user-client. Scenarios with multiple clients are validated using the Wehe App directly, as they do not require the associated traffic analysis.

(a) Only Wehe
(b) Wehe plus one service
(c) Wehe plus two services
Fig. 7: Effect of network load on Wehe’s traffic stream performances

IV-A Validation results

We validated the Wehe tool using our validation setup, Wehe App tests, and theoretical analysis. This section covers the results of the validation.

IV-A1 Notion of TD for services not exhausting available bandwidth

Wehe’s replay server uses the same timings between application data transfers as the original application traffic. Such a transmission strategy is expected not to exhaust the available bandwidth. Hence, the source rate modulation that occurs when the traffic rate overshoots the available bandwidth is expected to be avoided. As a result, the original and control replays should show similar traffic performance unless deliberately modified by network policies.

Nevertheless, this expectation is not always satisfied, as it depends on the network load at the user device while the Wehe tests run. Instead of the source rate, the application layer’s data reception rate is modulated by the device’s current network load. Such perturbations create discrepancies because the time-varying network load affects the probing traffic in a time-varying way that may differ between replays. Because Wehe replays its streams back-to-back, each probing stream experiences a different network load. Under such load on the device side, the notion of services not exhausting the available bandwidth ceases to hold, along with its benefits.
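The effect can be illustrated with a small numerical sketch. It assumes a deliberately simplified model in which a replay receives, in each second, the smaller of its offered rate and whatever link capacity the background load leaves free. The function `received_rates`, the 10 Mbit/s capacity, and all load values are hypothetical and chosen only for illustration.

```python
def received_rates(offered_mbps, capacity_mbps, background_mbps):
    """Per-second received rate of a replay that shares a link with background load."""
    rates = []
    for offered, background in zip(offered_mbps, background_mbps):
        # Bandwidth left for the replay after background traffic is served.
        available = max(capacity_mbps - background, 0.0)
        rates.append(min(offered, available))
    return rates


def mean(values):
    return sum(values) / len(values)


# Both replays offer the same 3 Mbit/s stream, well below the 10 Mbit/s capacity.
offered = [3.0, 3.0, 3.0, 3.0]
# The replays run back-to-back, so they see different background load.
original = received_rates(offered, 10.0, [2.0, 2.0, 2.0, 2.0])
control = received_rates(offered, 10.0, [8.0, 9.0, 8.0, 9.0])
```

In this sketch no network policy treats the two streams differently, yet the second replay receives half the average rate of the first, purely because the background load changed between the two runs.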

IV-A2 Traffic differentiation of original replay

Wehe uses the traffic trace from the original service to generate replay scripts. The replay scripts preserve the application data and its timing relationship. A replay script is used on the original network and also on networks in different geographic locations. As the traffic shaping rate varies across networks for the same service (as mentioned in [13]), the traffic rate preserved in the replay script can differ from the traffic shaping rate of the network under test. The replay traffic rate can even be lower than the traffic shaping rate.

The Wehe methodology does not detect traffic differentiation if the replay script’s traffic rate is lower than the shaping rate, as shaping then does not affect the traffic stream. Such replay scripts can never detect traffic shaping on such networks, as the shaping rate is above the probing traffic rate. Thus, the Wehe App’s TD detection capability is limited by the replay script’s ability to generate a traffic rate above the network’s shaping rate.
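This argument can be made concrete with a minimal token-bucket sketch. The shaper model, the function name `shape`, and the rates used are assumptions for illustration; real ISP shapers may behave differently.

```python
def shape(offered_mbit, rate_mbps, burst_mbit=0.0):
    """Token-bucket shaper in 1 s steps; returns the Mbit delivered in each step."""
    tokens = burst_mbit
    backlog = 0.0
    delivered = []
    for arrivals in offered_mbit:
        # Refill tokens, capped at the burst allowance plus one step of refill.
        tokens = min(tokens + rate_mbps, burst_mbit + rate_mbps)
        backlog += arrivals
        sent = min(backlog, tokens)
        tokens -= sent
        backlog -= sent
        delivered.append(sent)
    return delivered


shaping_rate = 4.0  # hypothetical shaping rate of the network under test, in Mbit/s

# A replay recorded at 1.5 Mbit/s stays below the shaping rate and passes unchanged.
slow_replay = shape([1.5, 1.5, 1.5, 1.5], shaping_rate)
# A replay recorded at 6 Mbit/s is clipped to the shaping rate.
fast_replay = shape([6.0, 6.0, 6.0, 6.0], shaping_rate)
```

Only the second replay is altered by the shaper, so only it could reveal the shaping through a throughput comparison.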

IV-A3 Usage of port number 80

The replay script preserves the data in the application’s original network trace. The original application uses plain-text data on port number 80 but transmits encrypted application data on port number 443. The Wehe replay script directly uses the encrypted data from the application’s network trace and transmits it on port number 80. In such cases, the Wehe tool expects network devices to classify its original replay traffic stream correctly using the encrypted application data. This is impossible for such data on port number 80, as encrypted traffic data cannot expose its identity to the network device. Thus, the Wehe tool cannot generate the required traffic streams for services running on port number 443, due to its default use of port number 80 for replay runs.
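The following sketch illustrates the classification problem under a strong simplifying assumption: the network device selects its parser by port number and, on port 80, identifies a service only from a plain-text HTTP Host header. The function `classify_port80` and both payloads are hypothetical; actual middlebox classifiers are not public and may use other features.

```python
def classify_port80(payload: bytes):
    """Return the Host header value of a plain-text HTTP request, or None."""
    lines = payload.split(b"\r\n")
    for line in lines[1:]:  # skip the request line
        if line.lower().startswith(b"host:"):
            return line.split(b":", 1)[1].strip().decode("ascii", "replace")
    return None


# Plain-text request as seen in a trace of a service that uses port 80.
plain = b"GET /watch HTTP/1.1\r\nHost: www.youtube.com\r\n\r\n"

# Stand-in for an encrypted record captured on port 443: a TLS application-data
# record header followed by 32 opaque bytes (not real ciphertext).
encrypted = b"\x17\x03\x03\x00\x20" + bytes(range(0x80, 0xA0))

plain_label = classify_port80(plain)
encrypted_label = classify_port80(encrypted)
```

Under this assumption, the encrypted bytes replayed on port 80 carry nothing the classifier can match, so the original replay would not be attributed to the service.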

IV-A4 Traffic load governed network behavior

Note that scarcity of resources prompts networks to apply traffic management, especially under heavy network load, that benefits all active services throughout the network, e.g., QoS-based traffic management. We validated the effect of such traffic management on the performance of both control and original replays using the following three scenarios:

  • Replaying only Wehe’s two traffic streams without any additional load on the network (Fig. 7(a))

  • Replaying Wehe’s three traffic streams with one additional streaming service running in parallel (Fig. 7(b))

  • Replaying Wehe’s three traffic streams with two additional streaming services running in parallel (Fig. 7(c))

Fig. 7(a) shows that the traffic streams generated by the Wehe tool perform the same when there is no additional network load. As the network load increases, the performance of the control replay deviates from that of the original replay, on the higher side (Fig. 7(b)). With further load, the control replay deviates from the original replays on the lower side, while the two original replays still show similar performance (Fig. 7(c)). This invalidates the Wehe tool’s expectation that the control replay is not differentiated. It also invalidates the tool’s claim of detecting TD due to total bandwidth.

IV-A5 Ensuring no TD detection for traffic streams with rates below shaping rate

Even though the Wehe tool does not intend to detect any TD below the considered network’s actual shaping rate, the time-varying effect of background network load at the user device side can make the Wehe tool detect TD. Network devices do not induce this TD. The detection of TD under such scenarios makes the Wehe tool unreliable.
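To illustrate how a direct comparison can flag load-induced differences, the sketch below assumes the comparison is a two-sample Kolmogorov-Smirnov-style statistic over throughput samples, i.e., the largest gap between the two empirical distribution functions. The sample values are hypothetical, and the sketch is not a reproduction of Wehe’s detection code.

```python
def ks_statistic(sample_a, sample_b):
    """Largest distance between the empirical CDFs of two samples."""
    points = sorted(set(sample_a) | set(sample_b))
    largest_gap = 0.0
    for x in points:
        cdf_a = sum(1 for v in sample_a if v <= x) / len(sample_a)
        cdf_b = sum(1 for v in sample_b if v <= x) / len(sample_b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical throughput samples (Mbit/s) on a network without any shaping:
# one replay ran while the device was idle, the other under heavy background load.
original_samples = [3.0, 3.0, 3.0, 3.0]
control_samples = [2.0, 1.0, 2.0, 1.0]

gap = ks_statistic(original_samples, control_samples)
same = ks_statistic(original_samples, original_samples)
```

The two distributions do not overlap at all, so any threshold on this statistic would report differentiation even though no network device treated the streams differently.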

IV-A6 Issues related to working with HTTP Proxies

As per the Wehe tool documentation, it supports user-clients behind HTTP proxies by means of a special message and a provision to accept HTTP requests over a socket with unexpected keep-alive behavior. We attempted a Wehe test through an HTTP proxy and found that it does not work.

IV-A7 Conducting Wehe tests from multiple devices within the same sub-net

Side-channels were introduced in Wehe’s design to support multiple user-clients simultaneously. They also help identify the mapping between a user-client and a combination of IP addresses and ports, which is useful in networks using NATs. We validated Wehe’s support for multiple clients and NAT-enabled networks using two different tests. In both, we connected two user-clients from within the same subnet, i.e., clients sharing the same public IP address. In the first test, the Wehe tool tests the same service on both devices, e.g., the Wehe App on both devices tests YouTube. The Wehe test completed on only one device, while the Wehe App abruptly closed on the other. In the second test, we repeated the same scenario with Wehe testing different services, e.g., Wehe on one device testing YouTube while Wehe on the other tests Netflix. The Wehe test on one device completed properly, while Wehe on the other device displayed an error informing the user that another client is already performing the test, as shown in Fig. 8. These tests show that Wehe does not support multiple devices that share the same IP address.

Fig. 8: Wehe apps running on multiple devices within the same subnet and testing different services

While the side-channel is useful for identifying each replay from a user-client connected directly to the Wehe replay server, it is not useful in networks using NAT devices. Multiple users share the same IP address in the case of NAT. In such cases, the side-channel cannot uniquely map each replay run to a client. This limits the usage of Wehe to only one active client per replay server, ISP, and application. This limitation is documented by the Wehe developers as well.
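The observed behavior is consistent with a server that tracks active tests by the client’s public IP address. The sketch below models that assumption only; the class `SideChannelRegistry`, its method, and the addresses are hypothetical and are not taken from the Wehe server code.

```python
class SideChannelRegistry:
    """Toy model of a replay server that allows one active test per public IP."""

    def __init__(self):
        self._active = {}  # public IP address -> client identifier

    def register(self, public_ip, client_id):
        """Admit a client unless a different client already holds this IP."""
        holder = self._active.get(public_ip)
        if holder is not None and holder != client_id:
            return False
        self._active[public_ip] = client_id
        return True


registry = SideChannelRegistry()
# Two phones behind the same NAT appear with the same public address.
first = registry.register("203.0.113.7", "phone-A")
second = registry.register("203.0.113.7", "phone-B")
# A client on a different network is unaffected.
third = registry.register("198.51.100.4", "phone-C")
```

With such a mapping, the second device behind the NAT is rejected regardless of which service it tests, which matches the error reported in Fig. 8.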

V Wehe Validation summary

The validation results have surfaced scenarios in which the Wehe tool does not detect TD correctly, as well as other limitations. These result from specific design or implementation choices for traffic generation and TD detection. In this section, we study these choices.

V-A Traffic generation

Preserving the application data and its timing from the original application’s network trace in the replay script is a crucial design choice of the Wehe tool. Sometimes, it hinders TD detection, as explained in Sec. IV-A2. This design choice limits the TD detection capability, as the traffic shaping rate is not the same across different ISPs.

Wehe detects content-based TD. This requirement led Wehe’s designers to transmit the probing data on port number 80. However, a replay script that uses the original application trace as-is does not lead to the expected traffic classification by ISPs in all cases when sent on port 80, as described in Sec. IV-A3.

Wehe designs its probing traffic as a traffic stream carrying the original application data and traffic streams carrying a bit-reversed version of the same application data. This tends to provide unreliable throughput performance for the comparison used to detect TD, as explained in Sec. IV-A4.
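A minimal sketch of how such a control payload can be derived is given below. It interprets “bit-reversed” as bitwise inversion of every payload byte, which is an assumption; the function name `invert_bits` and the sample payload are likewise hypothetical.

```python
def invert_bits(payload: bytes) -> bytes:
    """Flip every bit of the payload while keeping its length unchanged."""
    return bytes(b ^ 0xFF for b in payload)


original_payload = b"GET /watch HTTP/1.1\r\nHost: www.youtube.com\r\n\r\n"
control_payload = invert_bits(original_payload)

# The control stream has the same size (and can keep the same timing), but no
# longer contains the keyword a content-based classifier could match.
same_length = len(control_payload) == len(original_payload)
keyword_hidden = b"youtube" not in control_payload
round_trip = invert_bits(control_payload) == original_payload
```

The transformation keeps packet sizes intact, so any throughput difference between the two streams is attributed to content-based treatment, which is exactly the attribution that background load can undermine.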

V-B TD detection

Wehe’s direct performance comparison requires that the performance of the probing traffic is affected only by network policies. Using the exact application data and timings from the original application provides this property as a side effect, since the replay does not exhaust the available bandwidth under specific scenarios. The design choice of back-to-back replays also tries to ensure it by minimizing the bandwidth Wehe requires to exchange probing traffic. Nevertheless, it adds more uncorrelated perturbations to the probing traffic’s performance under heavy time-varying load at the user-client side. The property required for direct performance comparison is therefore not guaranteed in specific traffic load scenarios, as explained in Sec. IV-A1, and is further disturbed by the back-to-back replay design.

Because the Wehe tool’s design does not consider the total network load at the user-client side and replays back-to-back, the tool can report TD that is actually caused by the effect of background traffic load on the probing traffic’s performance.

V-C Operation environment

Wehe implements the side-channel design to tackle various issues caused by intermediate network devices, such as proxies or NAT devices. Nevertheless, it complicates the HTTP-based client-server communication and makes it hard to manage in the long run.

VI Conclusion

The debate on “Net-neutrality” and events pointing towards its possible violations have led to the development of tools to detect deliberate traffic discrimination on the Internet. Given the complex nature of the Internet, neutrality violations are not easy to detect, and tools developed so far suffer from various limitations. In this paper, we studied many challenges in developing a tool for detecting violations and applied this study of challenges in TD detection systems to validation. As a case study, we focused on the Wehe tool and demonstrated a categorized analysis and validation of a traffic differentiation detection tool. The Wehe tool is one of the most recent tools to detect neutrality violations. Despite Wehe’s vast utility and possible influence over policy decisions, its mechanisms are not yet fully validated by researchers other than the original tool developers. Our validation uses the Wehe App, a client-server setup mimicking Wehe’s behavior, and theoretical arguments. We validated the Wehe app for its methodology, traffic discrimination detection, and operational environments.


  • [1] Anonymous (2017-11) (Website).
  • [2] Anonymous (2017) IEEE standard for system, software, and hardware verification and validation. IEEE Std 1012-2016 (Revision of IEEE Std 1012-2012/ Incorporates IEEE Std 1012-2016/Cor1-2017), pp. 1–260.
  • [3] E. Balestrieri, L. De Vito, F. Lamonaca, F. Picariello, S. Rapuano, and I. Tudosa (2019-01) Research challenges in measurement for internet of things systems. ACTA IMEKO 7, pp. 82.
  • [4] W. Cui, V. Paxson, N. Weaver, and R. Katz (2006-02) Protocol-independent adaptive replay of application dialog. In 13th Annual Network and Distributed System Security Symposium (NDSS), pp. 27–27.
  • [5] M. Dischinger, M. Marcon, S. Guha, K. Gummadi, R. Mahajan, and S. Saroiu (2010-04) Glasnost: enabling end users to detect traffic differentiation. In 7th USENIX Conf. Networked Systems Design and Implementation, pp. 27–27.
  • [6] M. Dischinger, A. Mislove, A. Haeberlen, and K. P. Gummadi (2008-10) Detecting bittorrent blocking. In Proceedings of the 8th ACM SIGCOMM Conference on Internet Measurement, pp. 3–8.
  • [7] Z. Hu, L. Zhu, J. Heidemann, A. Mankin, D. Wessels, and P. Hoffman (2018-08) The Transport Layer Security (TLS) Protocol Version 1.3. RFC Technical Report 7858, IETF. ISSN 2070-1721.
  • [8] J. Iyengar and M. Thomson (2020-08) (Website).
  • [9] T. Joe, L. Eliot, M. Allison, K. Markku, O. Kumiko, S. Martin, E. Lars, M. Alexey, E. Wes, Z. Alexander, T. Brian, I. Jana, M. Allison, T. Michael, K. Eddie, and N. Yoshifumi (2020-09) (Website).
  • [10] P. Kanuparthy and C. Dovrolis (2011-11) ShaperProbe: end-to-end detection of isp traffic shaping using active methods. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, pp. 473–482.
  • [11] E. Karoly (2000-02) (Website).
  • [12] V. S. Khandkar and M. K. Hanawal (2020) Detection of traffic discrimination in the internet. In 2020 International Conference on COMmunication Systems NETworkS (COMSNETS), pp. 677–679.
  • [13] F. Li, A. A. Niaki, D. Choffnes, P. Gill, and A. Mislove (2019) A large-scale analysis of deployed traffic differentiation practices. In Proceedings of the ACM Special Interest Group on Data Communication, SIGCOMM ’19, New York, NY, USA, pp. 130–144. ISBN 9781450359566.
  • [14] G. Lu, Y. Chen, S. Birrer, F. E. Bustamante, C. Y. Cheung, and X. Li (2007) End-to-end inference of router packet forwarding priority. In IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications, pp. 1784–1792.
  • [15] R. Mahajan, M. Zhang, L. Poole, and V. Pai (2008) Uncovering performance differences among backbone isps with netdiff. In Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, pp. 205–218.
  • [16] A. Molavi Kakhki, A. Razaghpanah, A. Li, H. Koo, R. Golani, D. Choffnes, P. Gill, and A. Mislove (2015-10) Identifying traffic differentiation in mobile networks. In Proceedings of the 2015 Internet Measurement Conference, pp. 239–251.
  • [17] S. Molnár, P. Megyesi, and G. Szabó (2013) How to validate traffic generators?. In 2013 IEEE International Conference on Communications Workshops (ICC), pp. 1340–1344.
  • [18] R. Narisetty and D. Gurkan (2014) Identification of network measurement challenges in openflow-based service chaining. In 39th Annual IEEE Conference on Local Computer Networks Workshops, pp. 663–670.
  • [19] (2008-06) (Website).
  • [20] Ed. R. Fielding (2014-06) Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. RFC Technical Report 7230, IETF.
  • [21] A. Rao, J. Sherry, A. Legout, A. Krishnamurthy, W. Dabbous, and D. Choffnes (2012-12) Meddle: middleboxes for increased transparency and control of mobile traffic. In CoNEXT Student Workshop, ACM, pp. 65–66.
  • [22] R. Ravaioli, G. Urvoy-Keller, and C. Barakat (2015-09) Towards a general solution for detecting traffic differentiation at the internet access. In 2015 27th International Teletraffic Congress, pp. 1–9.
  • [23] E. Rescorla (2000-05) HTTP Over TLS. RFC Technical Report 2818, IETF. ISSN 2070-1721.
  • [24] P. Srisuresh and K. Egevang (2001-01) Traditional IP Network Address Translator (Traditional NAT). RFC Technical Report 3022, IETF.
  • [25] M. B. Tariq, M. Motiwala, N. Feamster, and M. Ammar (2009) Detecting network neutrality violations with causal inference. In Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, CoNEXT ’09, New York, NY, USA, pp. 289–300. ISBN 9781605586366.
  • [26] (Website).
  • [27] N. Vallina-Rodriguez, S. Sundaresan, C. Kreibich, N. Weaver, and V. Paxson (2015) Beyond the radio: illuminating the higher layers of mobile networks. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys ’15, New York, NY, USA, pp. 375–387. ISBN 9781450334945.
  • [28] Y. Zhang, Z. M. Mao, and M. Zhang (2009) Detecting traffic differentiation in backbone isps with netpolice. In Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, IMC ’09, New York, NY, USA, pp. 103–115. ISBN 978-1-60558-771-4.