Proportional Fair RAT Aggregation in HetNets

by   Ehsan Aryafar, et al.

Heterogeneity in wireless network architectures (i.e., the coexistence of 3G, LTE, 5G, WiFi, etc.) has become a key component of current and future generation cellular networks. Simultaneous aggregation of each client's traffic across multiple such radio access technologies (RATs) / base stations (BSs) can significantly increase the system throughput, and has become an important feature of cellular standards on multi-RAT integration. Distributed algorithms that can realize the full potential of this aggregation are thus of great importance to operators. In this paper, we study the problem of resource allocation for multi-RAT traffic aggregation in HetNets (heterogeneous networks). Our goal is to ensure that the resources at each BS are allocated so that the aggregate throughput achieved by each client across its RATs satisfies a proportional fairness (PF) criterion. In particular, we provide a simple distributed algorithm for resource allocation at each BS that extends the PF allocation algorithm for a single BS. Despite its simplicity and lack of coordination across the BSs, we show that our algorithm converges to the desired PF solution and provide (tight) bounds on its convergence speed. We also study the characteristics of the optimal solution and use its properties to prove the optimality of our algorithm's outcomes.




I Introduction

The increasing demand for wireless data has led to denser and more heterogeneous wireless network deployments. This heterogeneity manifests itself in terms of network deployments across multiple radio access technologies (e.g., 3G, LTE, WiFi, 5G), cell sizes (e.g., macro, pico, femto), and frequency bands (e.g., TV bands, 1.8-2.4 GHz, mmWave). To realize the gains associated with such heterogeneous networks (HetNets), consumer (client) devices are also being equipped with an increasing number of radio access technologies (RATs), and some are already able to simultaneously aggregate the traffic across multiple RATs to increase throughput [1].

To support such traffic aggregation on the network side, the 3GPP (3rd generation partnership project) has been actively developing multi-RAT integration solutions. The introduction of LWA (LTE-WiFi Aggregation) as part of the 3GPP Release 13 [2] was a step in this direction. LWA allows using both LTE and WiFi links for a single traffic flow and is generally more efficient than transport layer aggregation protocols (e.g., MultiPath TCP), due to coordination at lower protocol stack layers. LWA’s design primarily follows the LTE Dual Connectivity (DC) architecture (defined in 3GPP Release 12 [3]), which allows a wireless device to connect to two LTE eNBs that are on different carrier frequencies, and utilize the radio resources that belong to both of them. Currently, the 3GPP is working on a solution to support below IP (layer 2) multi-RAT integration across any combination of RATs, including LTE, WiFi, 802.11ad/ay, and 5G New Radio (NR) [4]. The proposed architecture would allow for dynamic traffic splitting across RATs for each client, which can lead to a significant increase in the system performance (e.g., total throughput).

However, it is difficult to design resource allocation algorithms for each BS (we use “BS” generically to mean an LTE eNB, WiFi AP, etc.) that realize the performance benefits of such integrated HetNets. Specifically, (i) backhaul links from different BSs in HetNets show diverse capacity and latency characteristics that depend on the underlying backhauling technology. For example, cable and DSL have on average 28 and 62 ms roundtrip latencies, respectively [5, 6]. The latency can be even higher when a network operator uses a third party ISP to communicate with its BSs (e.g., a mobile operator that uses a wired ISP to control its WiFi BSs). Such latencies make it infeasible for BSs to communicate with each other or a central controller for real-time resource allocation at each BS. As a result, any practical resource allocation algorithm for multi-RAT HetNets should be fully distributed (i.e., autonomously executed by each BS). (ii) Resource allocation has many practical constraints. Conventional BS hardware allows only minor modifications to existing resource allocation algorithms through software updates, limiting the algorithm design space. New algorithms should also incur minimal signaling overhead and computational complexity. Distributed algorithms based on the traditional network utility maximization framework [7, 8] do not meet these requirements because, as we will show later through simulations, the resulting algorithms are radically different from how conventional BSs operate, have significant over-the-air signaling overhead, and increase the computational complexity on the client side. (iii) In HetNets, each client has access to a client-specific set of RATs, and receives packets at a different PHY rate on each RAT. These rates are naturally different across clients. This multi-rate property of HetNets makes it particularly challenging to design resource allocation algorithms with performance guarantees. As a result, existing solutions in the literature are all limited to simple setups, e.g., when each client has only two RATs as in the case of LWA [9] or LTE DC [10].

In this paper, we study the problem of resource allocation for traffic aggregation in multi-RAT HetNets. We focus on the proportional-fair (PF) fairness objective as it is widely used and implemented in BSs and provides a balance between fairness and throughput [11, 12]. We first consider PF resource allocation in a single BS, and then use our insights from this case to design a distributed algorithm that meets our three research challenges. We next show that our algorithm converges to an optimal PF resource allocation. The key contributions are as follows:

  • Algorithm Design: We study the basics of PF resource allocation in a single BS to gain intuition for the distributed algorithm design. We show that PF resource allocation in a single BS can be viewed as a special type of water-filling. We generalize this observation to a new fully distributed water-filling algorithm (named AFRA) that makes a minor modification to the conventional single BS algorithm and achieves PF in HetNets.

  • Convergence and Speed: We show that AFRA is guaranteed to converge to an equilibrium as BSs autonomously execute it [Theorem 1] and derive tight bounds on its convergence time (speed) [Theorem 2].

  • Optimality: We first show that at optimality, the sum of the inverse water-fill levels across all BSs is equal to the sum of the weights (numbers that show clients’ priorities) across all clients [Theorem 3]. Next, we use this property to prove that any equilibrium outcome of AFRA is globally optimal [Theorem 4]. Finally, we show that at equilibrium the vector of throughput rates across all clients is unique; however, there could be infinitely many resource allocations that realize this outcome [Theorem 5].

  • Practicality: We construct a testbed with programmable BS hardware, and show that we can successfully aggregate the throughput across multiple BSs at the MAC layer. We also show that replacing the conventional resource allocation algorithm on each BS with AFRA can substantially increase the system throughput and fairness.

  • Performance: We conduct extensive simulations to characterize AFRA’s convergence time properties as we scale the number of BSs and clients. We also introduce policies that reduce the convergence time by more than 30%. Finally, we compare the performance of AFRA against DDNUM, a dual decomposition algorithm that we derived from the NUM framework. We show that compared to DDNUM, AFRA is 2-3 times faster with 4-5 times less over-the-air overhead.

This paper is organized as follows. We discuss the related work in Section II. We present the system model and details of AFRA in Section III. In Sections IV and V we prove the convergence and optimality of AFRA. We present the results of our experiments, simulations, and comparisons against DDNUM in Section VI. We conclude the paper in Section VII.

II Related Work

We discuss the related work in the areas of multi-BS communication and distributed optimization, and highlight their differences from this paper.

Single-RAT Multi-BS Communication. Prior works have studied the problem of traffic aggregation when a client can simultaneously communicate with multiple same technology BSs. For example, [13] uses game theory to model selfish traffic splitting by each client in WLANs. In contrast, the resource allocation problem in HetNets is primarily addressed at the BS side. Similarly, [10] proposes an approximation algorithm to address the problem of client association and traffic splitting in LTE DC. Our algorithm (AFRA) goes beyond this and other related work by guaranteeing optimal resource allocation for any number of RATs and BSs. Other works have developed centralized client association algorithms to achieve max-min [14] and proportional fairness [15] in multi-rate WLANs. In contrast, the problem of resource allocation in HetNets needs to be solved in a fully distributed manner.

Multi-RAT Communication. Resource allocation algorithms that realize the capacity gains in HetNets are still in their early stages. The problem of PF resource allocation for LWA was studied in [9]. In the proposed setup, each client has one LTE and one WiFi RAT. Further, there is only a single LTE BS in the network, and each client’s throughput across its WiFi RAT is fixed. Next, the authors propose a water-filling based resource allocation algorithm at the LTE BS that achieves PF. Similarly, we show that the optimal PF resource allocation in a single BS can be interpreted as a form of water-filling. However, we use the observation to design an optimal algorithm for the generic problem with any number of BSs and client RATs, and explicitly model the impact of system dynamics on the throughput that each client gets from every BS. In our prior work [16], we addressed the problem of max-min fair resource allocation in HetNets. However, even with opportunistic centralized network supervision over autonomous resource allocation at each BS we could not optimally solve the problem. Here, we focus on the PF objective, which is commonly implemented in BSs, and show that we can optimally solve the problem in a purely distributed manner. Other works have built testbeds to evaluate the over-the-air performance of MAC-level cross-RAT throughput aggregation [17, 18, 19, 20]. All these works have relied on conventional scheduling algorithms on each BS and focused on higher layer transport and application performance. We experimentally show that replacing the conventional resource allocation algorithms with AFRA can substantially increase the system throughput and fairness.

Distributed Network Utility Maximization (NUM). There is a large body of general results on the mathematics of distributed computation, some of which are summarized in standard textbooks such as [21, 22]. More recently, the framework of NUM [7, 8, 23] has emerged as a mathematical tool to optimize layered network architectures. The framework allows for decomposition of a global optimization problem into subsets of local problems that are carried out distributedly and implicitly solve the global NUM problem. We have derived an alternative distributed algorithm (named DDNUM) by leveraging dual decomposition and the NUM framework. We will show through simulations that DDNUM is 2-3 times slower than AFRA (in terms of convergence time) and increases the over-the-air signaling overhead by 4-5 times. These disadvantages, coupled with the increased client side computational complexity and lack of compatibility with conventional BSs, make NUM-based algorithms impractical for multi-RAT traffic aggregation.

III System Model

We discuss the system model and the resource allocation algorithm that is autonomously executed by each BS.

III-A Network Model

We consider a HetNet composed of a set of BSs $\mathcal{M} = \{1, \dots, M\}$ and a set of clients $\mathcal{N} = \{1, \dots, N\}$. Each BS has a limited transmission range and can only serve clients within its range. Each client has a client-specific number of RATs, and therefore has access to a subset of BSs. We model clients that can aggregate traffic across BSs of the same technology (e.g., LTE DC) with multiple such RATs. Fig. 1 shows an example HetNet topology. We assume that clients split their traffic over the BSs and focus on the resource allocation problem at each BS. It is itself a challenging problem to determine which BS to associate with among same technology BSs (e.g., choosing the optimal LTE BS if a client has an LTE RAT). We assume there exists a rule to pre-determine client RAT to BS association. The pre-determination rule could for instance be any load balancing algorithm [24, 25], or based on the received signal strength. Similar to [13, 14, 15, 16, 24], we assume that the transmission in one BS does not interfere with an adjacent BS. This can be achieved through spectrum separation between BSs that belong to different access networks and frequency reuse among same technology BSs.

Fig. 1: A heterogeneous network with 4 access technologies. Each client is in the coverage area of a group of BSs (dotted lines) and can split or aggregate its traffic across the corresponding BSs (RATs). The 3GPP is actively developing several new RATs for both sub-6 GHz and mmWave bands, re-emphasizing the heterogeneity of future wireless networks.

III-B Throughput Model

We consider a multi-rate system and use $R_{i,j}$ to denote the PHY rate of client $i$ to BS $j$. Since each BS generally serves more than one client, clients of the same BS need to share resources such as time and frequency slots (e.g., in 3/4/5G) or transmission opportunities (e.g., in WiFi). The throughput achieved by client $i$ from BS $j$ thus depends on the load of the BS and will be a fraction of $R_{i,j}$. We assume that each BS employs a TDMA throughput sharing model (in Section VI-A, we discuss how we can extend our model and algorithm to capture practical implementation issues such as WiFi contention) and let $\lambda_{i,j}$ denote the fraction of time allocated to client $i$ by BS $j$. Hence, the throughput achieved by client $i$ from BS $j$ is equal to $\lambda_{i,j} R_{i,j}$ and its total throughput across all its RATs would be

$$r_i = \sum_{j=1}^{M} \lambda_{i,j} R_{i,j}. \tag{1}$$

The total amount of time fractions available to each BS cannot exceed 1. Thus, for the $\lambda_{i,j}$s to be feasible we have

$$\sum_{i=1}^{N} \lambda_{i,j} \le 1 \quad \forall j, \tag{2}$$

$$\lambda_{i,j} \ge 0 \quad \forall i, j. \tag{3}$$

$\mathcal{N}$ and $N$: Set and number of all clients in the network
$\mathcal{M}$ and $M$: Set and number of all BSs in the network
$R_{i,j}$: PHY rate of client $i$ to BS $j$
$R_{\max}$: Maximum PHY rate across all clients and BSs
$R_{\min}$: Non-zero minimum PHY rate across all clients and BSs
$\lambda_{i,j}$: Fraction of time allocated to client $i$ by BS $j$
$\boldsymbol{\lambda}$: Vector of $\lambda_{i,j}$s across all clients and BSs
$r_i$: Total throughput of client $i$ across all its RATs
$w_i$: A positive number that represents client $i$’s weight or priority
$\eta_j$: Water-fill level at BS $j$
TABLE I: Main Notation

III-C Background: Conventional PF Allocation in a Single BS

We first describe the basics of the PF resource allocation that is conventionally implemented in today’s BSs. Consider a network topology consisting of only a single BS $j$ and $N$ clients. Let $r_i$ denote the throughput of client $i$ and $w_i$ a positive number that denotes its weight (or priority). A widely used objective function for PF is to maximize $\sum_{i} w_i \log(r_i)$ [11, 12]. It represents a tradeoff between throughput and fairness among the clients. Let $\lambda_{i,j}$ denote the time fraction allocated to client $i$ by BS $j$. To maximize the PF objective function, the BS needs to solve the following problem

$$\max_{\lambda_{i,j} \ge 0} \; \sum_{i=1}^{N} w_i \log(\lambda_{i,j} R_{i,j}) \quad \text{s.t.} \quad \sum_{i=1}^{N} \lambda_{i,j} \le 1. \tag{4}$$

Problem (4) can be easily solved through a simple algorithm. The Lagrangian of Problem (4) can be expressed as

$$L(\boldsymbol{\lambda}, \mu) = \sum_{i=1}^{N} w_i \log(\lambda_{i,j} R_{i,j}) - \mu \left( \sum_{i=1}^{N} \lambda_{i,j} - 1 \right),$$

where $\mu$ is a constant number (Lagrange multiplier) chosen to meet the time resource constraint. Differentiating with respect to the time fraction $\lambda_{i,j}$ and setting the result to zero gives

$$\frac{w_i}{\lambda_{i,j}} = \mu \quad \Longrightarrow \quad \lambda_{i,j} = \frac{w_i}{\mu}. \tag{5}$$

Since the sum of time fractions at optimality is equal to 1, we can conclude from Eq. (5) that $\mu = \sum_{i} w_i$. With known $\mu$ and $w_i$, we can derive the $\lambda_{i,j}$s from Eq. (5).

Now, let $\eta_j$ be defined as $1/\mu$. Leveraging Eq. (5), we have

$$\frac{r_i}{w_i R_{i,j}} = \frac{\lambda_{i,j}}{w_i} = \eta_j \quad \forall i. \tag{6}$$

Eq. (6) has an interesting water-filling based interpretation: the time allocated to each client is such that the throughput of the client divided by its PHY rate times its weight is the same across all clients. We refer to this ratio (i.e., $r_i / (w_i R_{i,j})$) as the water-fill level of BS $j$. In the next section, we will turn this observation in a single BS into a distributed resource allocation algorithm in HetNets.
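As a quick numerical illustration of the single-BS result, the sketch below sets each client's time fraction to its weight divided by the sum of all weights, and then checks that throughput divided by (weight times PHY rate) comes out identical for every client. The function name, weights, and PHY rates are illustrative assumptions, not values taken from the paper.

```python
def single_bs_pf(weights, phy_rates):
    """PF allocation at one BS: time fraction = weight / sum of weights."""
    mu = sum(weights)                       # Lagrange multiplier at optimality
    fractions = [w / mu for w in weights]   # time fraction of each client
    throughputs = [lam * rate for lam, rate in zip(fractions, phy_rates)]
    # Water-fill level seen by each client: throughput / (weight * PHY rate).
    levels = [r / (w * rate)
              for r, w, rate in zip(throughputs, weights, phy_rates)]
    return fractions, throughputs, levels


# Hypothetical example: three clients with different weights and PHY rates.
weights = [1.0, 2.0, 1.0]
phy_rates = [10.0, 20.0, 40.0]
fractions, throughputs, levels = single_bs_pf(weights, phy_rates)
print(fractions)    # time fractions sum to 1
print(levels)       # all entries equal 1 / sum(weights)
```

Note that the time fractions depend only on the weights, while the PHY rates only scale the resulting throughputs.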

III-D Distributed Resource Allocation in HetNets

There are two approaches to designing a resource allocation algorithm for generic HetNets. One approach, as we show in the Appendix, is to extend the single-BS formulation of Section III-C to include multiple BSs and client RATs, and use dual decomposition to derive a distributed algorithm. This approach converges to the optimal solution; however, the Lagrange multipliers across BSs would no longer correspond to BSs’ water-fill levels. The second approach is to directly generalize the water-filling interpretation to derive an alternative algorithm, which still converges to the optimal solution (Section V) with far less overhead, convergence time, and complexity than the dual decomposition based algorithm (Section VI-C).

From Eq. (6), we observe that in a network with only a single BS, the BS allocates its time resources so that the clients who get the time resources reach the same water-fill level (i.e., throughput divided by $w_i R_{i,j}$). Thus, in generic HetNets, if each BS considers the total throughput of each client across all its RATs (i.e., $r_i$) divided by $w_i R_{i,j}$ in its water-fill definition, this should lead to a fair distributed algorithm. In other words, each BS $j$ should share its time resources across its clients such that: (1) all clients who get the time resources reach the same water-fill level at BS $j$ (i.e., $\eta_j$), and (2) if a client (e.g., $i'$) does not get any time resources from BS $j$, its $r_{i'} / (w_{i'} R_{i',j})$ is greater than $\eta_j$. Fig. 2 illustrates this operation.

Fig. 2: There are 4 clients with non-zero PHY rates to BS $j$. Blue boxes denote contributions to $r_i / (w_i R_{i,j})$ by BS $j$ (when it allocates time resources) and white boxes show contributions to it by other BSs. BS $j$ allocates its time resources so that all clients that get resources achieve the same water-fill level ($\eta_j$). Clients that do not get any resources from BS $j$ have a higher $r_i / (w_i R_{i,j})$ than $\eta_j$. One client in this example falls into this category.

We next turn this idea into a distributed resource allocation algorithm. Consider slotted time for now. Algorithm AFRA (Fig. 3) summarizes the steps that are autonomously executed by each BS $j$. There are three main steps in the algorithm: (i) clients are sorted based on the total throughput they receive from other BSs ($r_i - \lambda_{i,j} R_{i,j}$) divided by $w_i R_{i,j}$ (Line 3), (ii) BS $j$ finds the water-fill level ($\eta_j$) and allocates the time resources accordingly (Line 4), and (iii) finally we introduce a randomization parameter to limit concurrent resource adaptation of a single client by multiple BSs (Line 5).

Fig. 3: Resource allocation algorithm autonomously run by each BS $j$.

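We next elaborate on how each BS $j$ finds its water-fill level and its clients’ time resource fractions (Line 4). Let $N_j$ denote the number of clients $i$ such that $R_{i,j} > 0$. Let $r_{i,-j} = r_i - \lambda_{i,j} R_{i,j}$ denote the total throughput of client $i$ from all BSs other than $j$. Consider an ordering in clients’ $r_{i,-j} / (w_i R_{i,j})$ according to Line 3 of AFRA, i.e., $\frac{r_{1,-j}}{w_1 R_{1,j}} \le \frac{r_{2,-j}}{w_2 R_{2,j}} \le \dots \le \frac{r_{N_j,-j}}{w_{N_j} R_{N_j,j}}$. In order to solve the water-fill problem (i.e., Line 4 of AFRA), we need to find the water-fill level $\eta_j$, client index $k$, and time fractions $\lambda_{i,j}$s such that

$$\frac{r_{i,-j} + \lambda_{i,j} R_{i,j}}{w_i R_{i,j}} = \eta_j, \quad 1 \le i \le k, \tag{7}$$

$$\lambda_{i,j} = 0 \ \text{ and } \ \frac{r_{i,-j}}{w_i R_{i,j}} \ge \eta_j, \quad k < i \le N_j, \tag{8}$$

$$\sum_{i=1}^{k} \lambda_{i,j} = 1. \tag{9}$$

We can find these variables with a simple set of linear operations. First, we can find $k$ by checking a set of inequalities

$$\frac{1 + \sum_{i=1}^{m} r_{i,-j} / R_{i,j}}{\sum_{i=1}^{m} w_i} \le \frac{r_{m+1,-j}}{w_{m+1} R_{m+1,j}}, \quad m = 1, 2, \dots, N_j - 1,$$

where $k$ is the smallest $m$ for which the inequality holds (and $k = N_j$ if none holds).

In the first inequality, we first check if $\frac{r_{1,-j} + R_{1,j}}{w_1 R_{1,j}} \le \frac{r_{2,-j}}{w_2 R_{2,j}}$. If this is true, from Eq. (7) we conclude that client 2 would have a higher $r_2 / (w_2 R_{2,j})$ than $\eta_j$ even if BS $j$ allocated all its time resources to client 1 (i.e., to the client with minimum $r_{i,-j} / (w_i R_{i,j})$ across all clients). As a result $k$ should be equal to 1. This procedure (and logic) is continued until $k$ is found.

With known $k$, we can find $\eta_j$ by combining Eqs. (7) and (9) and solving the following linear equation

$$\sum_{i=1}^{k} \left( w_i \eta_j - \frac{r_{i,-j}}{R_{i,j}} \right) = 1. \tag{10}$$

With known $k$ and $\eta_j$, the $\lambda_{i,j}$s can be found from Eq. (7).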
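The water-fill step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `waterfill_step` and the example numbers are hypothetical, the client index is found with a simple linear scan instead of a binary search, and the randomization step of Line 5 is omitted. Each client is characterized by the throughput it already receives from other BSs, its non-zero PHY rate to this BS, and its weight.

```python
def waterfill_step(other_tput, phy_rates, weights):
    """One water-fill step at a single BS (illustrative sketch).

    other_tput[i]: throughput client i already gets from all other BSs
    phy_rates[i]:  non-zero PHY rate of client i to this BS
    weights[i]:    positive weight of client i
    Returns (water_fill_level, time_fractions).
    """
    n = len(weights)
    # Level each client would have if this BS gave it no time at all.
    base = [other_tput[i] / (weights[i] * phy_rates[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: base[i])

    level, sum_w, sum_ratio = 0.0, 0.0, 0.0
    served = []
    for pos, i in enumerate(order):
        served.append(i)
        sum_w += weights[i]
        sum_ratio += other_tput[i] / phy_rates[i]
        # Level at which the served clients use exactly one unit of time.
        level = (1.0 + sum_ratio) / sum_w
        is_last = pos + 1 == n
        if is_last or level <= base[order[pos + 1]]:
            break  # the next client is already above the water-fill level

    fractions = [0.0] * n
    for i in served:
        # Clamp guards against tiny negative values from rounding.
        fractions[i] = max(0.0, weights[i] * level - other_tput[i] / phy_rates[i])
    return level, fractions


# Hypothetical example: the third client already gets 30 units elsewhere.
level, fractions = waterfill_step(
    other_tput=[0.0, 0.0, 30.0],
    phy_rates=[10.0, 20.0, 10.0],
    weights=[1.0, 1.0, 1.0],
)
print(level, fractions)
```

In this example the third client's level from other BSs alone (3.0) already exceeds the level the first two clients can reach, so the BS splits its time between the first two clients only.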

AFRA’s Computational Complexity and Message Passing Overhead. We calculate AFRA’s computational complexity in finding the new time resource fractions ($\lambda_{i,j}$s) for a BS $j$. Let $N_j$ denote the number of clients with non-zero PHY rates to BS $j$. The complexity of sorting clients (Line 3) is $O(N_j \log N_j)$. The complexity of finding the water-fill level and the new time resource fractions (Line 4) is $O(N_j)$ (with a binary search to find $k$). Thus, the overall computational complexity is $O(N_j \log N_j)$. If we assume that each client has on average $K$ RATs, then on average $N_j$ would be equal to $NK/M$. Thus, the computational complexity would also be equal to $O\left(\frac{NK}{M} \log \frac{NK}{M}\right)$.

Each BS uses the total throughput of each client across all its RATs in its calculations to find the water-fill level and the new s. Each time a client’s time resource (and hence total throughput) is changed, the client needs to inform all BSs to which it is connected about its new total throughput. Thus, the total message passing overhead generated by clients of a single BS is at most equal to , or alternatively .

IV Convergence and Speed of AFRA

In this section, we investigate the convergence properties of AFRA. We first show that as BSs autonomously execute AFRA, the system converges to an equilibrium. Next, we investigate the convergence time properties of AFRA and provide tight bounds to quantify it.

IV-A Convergence to an Equilibrium

Before we discuss convergence, we present a formal definition of an equilibrium.


Equilibrium: The vector of time fractions across all the BSs and clients is an equilibrium outcome if none of the BSs can increase its water-fill level through unilateral change of its time resource allocations.

Our next theorem guarantees the convergence of AFRA.


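Theorem 1. Let each BS autonomously execute AFRA. Then, the system converges to an equilibrium, i.e., the time fractions $\lambda_{i,j}$, the throughputs $r_i$, and the water-fill levels $\eta_j$ converge for every client $i$ and BS $j$.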


Proof. Let $\boldsymbol{\lambda}$ denote the vector of time fractions ($\lambda_{i,j}$s) across all clients and BSs, and $F(\boldsymbol{\lambda}) = \sum_{i} w_i \log(r_i)$ be the potential function. A potential function [26] is a useful tool to analyze equilibrium properties, as it maps the payoff (e.g., throughput) of all clients into a single function.

Since the number of clients and BSs is finite, $F$ is bounded. The key step to prove convergence is to show that each time a BS adjusts its time fractions (i.e., $\lambda_{i,j}$s), the potential function ($F$) increases. This property, coupled with $F$’s boundedness, guarantees its convergence. We will show later in Eq. (15) that the change in the potential function is proportional to the product of the change in water-fill levels and the change in $\lambda_{i,j}$s. Since $F$ converges (i.e., its variations converge to 0), one or both of these terms should converge to 0. Either of these conditions guarantees the convergence of the $\lambda_{i,j}$s (and hence, $r_i$s and $\eta_j$s).

Next, we show that each time a BS runs AFRA, $F$ increases. When a BS $j$ runs AFRA, it takes some time resources from clients with high $r_i / (w_i R_{i,j})$ and distributes them across clients with lower values. To ease the proof presentation, we focus on two clients and follow the changes on $F$ as the BS adjusts the $\lambda_{i,j}$s dedicated to these clients.

Let, denote two clients who are currently receiving time resources from BS . Assume the following initial (old) order between these two clients


Therefore, as BS executes AFRA it changes the time resources from and to and , respectively. This, only changes the two corresponding terms in the potential function, i.e.


Let denote the variation in potential function, i.e.


Thus, to prove convergence, we need to prove that is always positive. We prove this by showing that first . This shows that is always non-decreasing. Second, we show that is positive for very small values of . Now


Here and are the new throughput values for clients and , respectively. It is clear that after BS adjusts the time resources, we still have . This is because after BS reduces , would be either equal to the new water-fill level or higher than it (if ). On the other hand, would be equal to the new water-fill level. As a result, the final term in Eq. (14) is non-negative. Finally, is greater than zero for small values of because


The last term in the above equation is due to Eq. (11).
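The monotone increase of the potential function can also be observed numerically. The sketch below is an illustration under several assumptions, not the authors' simulator: the potential is taken to be the weighted sum of log-throughputs, BSs update one at a time in a fixed order (so the randomization of Line 5 is not modeled), each update is the water-fill step of Section III-D, and the 3-client, 2-BS topology, PHY rates, and weights are made up. All function and variable names are hypothetical.

```python
import math


def waterfill(other, rates, w):
    """Water-fill at one BS; returns (level, time fractions)."""
    base = [o / (wi * r) for o, wi, r in zip(other, w, rates)]
    order = sorted(range(len(w)), key=lambda i: base[i])
    sum_w = sum_ratio = level = 0.0
    served = []
    for pos, i in enumerate(order):
        served.append(i)
        sum_w += w[i]
        sum_ratio += other[i] / rates[i]
        level = (1.0 + sum_ratio) / sum_w
        if pos + 1 == len(order) or level <= base[order[pos + 1]]:
            break
    frac = [0.0] * len(w)
    for i in served:
        frac[i] = max(0.0, w[i] * level - other[i] / rates[i])
    return level, frac


def run_afra(R, w, rounds):
    """Sequential BS updates; R[i][j] is the PHY rate (0 = not connected)."""
    n, m = len(R), len(R[0])
    clients_of = [[i for i in range(n) if R[i][j] > 0] for j in range(m)]
    # Start from an equal split so that every throughput is positive.
    lam = [[0.0] * m for _ in range(n)]
    for j in range(m):
        for i in clients_of[j]:
            lam[i][j] = 1.0 / len(clients_of[j])

    def tput(i):
        return sum(lam[i][j] * R[i][j] for j in range(m))

    def potential():
        return sum(w[i] * math.log(tput(i)) for i in range(n))

    history, levels = [potential()], [0.0] * m
    for _ in range(rounds):
        for j in range(m):
            ids = clients_of[j]
            other = [max(0.0, tput(i) - lam[i][j] * R[i][j]) for i in ids]
            levels[j], frac = waterfill(
                other, [R[i][j] for i in ids], [w[i] for i in ids])
            for i, f in zip(ids, frac):
                lam[i][j] = f
            history.append(potential())  # potential after this BS's update
    return lam, levels, history


# Hypothetical topology: client 0 sees BS 0, client 1 sees both, client 2 sees BS 1.
R = [[10.0, 0.0], [20.0, 30.0], [0.0, 15.0]]
weights = [1.0, 1.0, 2.0]
lam, levels, history = run_afra(R, weights, rounds=50)
print(history[0], history[-1])
print(sum(1.0 / x for x in levels), sum(weights))
```

In this run the recorded potential never decreases from one BS update to the next, and the sum of inverse water-fill levels ends up matching the sum of the weights, in line with the property stated in Section V.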

IV-B Convergence Time

Before we can derive a bound on convergence time, we need to define a discretization factor on the time fractions (i.e., $\lambda_{i,j}$s). This technicality is due to the fact that the $\lambda_{i,j}$s in our model are continuous variables, which can cause some BSs to continuously make infinitesimal adjustments to them. These adjustments converge to 0 as time goes to infinity.

In practice, operations always happen in discretized levels. For example, consider the following discretization policy:


Discretization Policy: During water-fill calculation by a BS j in AFRA, the time fraction allocated to the client with minimum should increase by at least . Otherwise, the BS would not update its time fractions.

Based on the above discretization policy, we can derive the following bound on the convergence time.


Theorem 2. Consider a HetNet with N clients and M BSs. Then, the number of steps that it takes for AFRA to converge is upper bounded by O().


Proof. Let $F$ be the potential function from the proof of Theorem 1. To compute a bound on the convergence time, we study the increments of $F$. The key step is to find a lower bound on $F$’s increments. Since $F$ increases whenever a BS makes adjustments to its $\lambda_{i,j}$s, the convergence time is then upper bounded by the difference between the maximum and minimum possible values of $F$ divided by the lower bound on $F$’s increments.

We take the following steps to find a lower bound on the potential function’s increments. Let denote the set of clients with non-zero PHY rates to BS and assume the following initial (old) order among the clients


When BS executes AFRA, it adjusts the time fractions in a way that increases the time resources allocated to client . Let denote the increase in client ’s time resources and its new throughput. Let denote the change in client ’s () time resources and its new throughput. Hence, we have


However, even after BS adjusts its time resources, would still have the minimum across all clients. This is due to the water-fill based operation in AFRA. As a result


Next, we find a lower bound on the potential function’s increments


Let and . Since the logarithm is a concave function, from Jensen’s inequality [27],


Leveraging Eq. (23), we conclude that Eq. (22) is


where and . Note that since we seek an upper bound on convergence time, we can choose a small enough so that . These assumptions increase the upper bound but allow us to use the Taylor series in Eq. (24). If we let and denote the minimum and maximum PHY rates across all the clients and BSs, then we have


V Optimality of AFRA

Beyond convergence, we study the optimality properties of AFRA’s equilibria. We first derive some useful properties of the equilibria that we leverage for optimality analysis. Next, we prove that the equilibria also maximize the global proportional fair resource allocation problem across all the BSs, and hence are globally optimal. Finally we discuss the uniqueness of the equilibria and prove that while the equilibrium throughput vector across all the clients is unique, there could be infinitely many resource allocations that realize this outcome. For simplicity, we do not consider discretization in this section.


Theorem 3. Consider an equilibrium outcome of AFRA. Let $r_i$ denote the throughput of client $i$, $\eta_j$ the water-fill level of BS $j$, and $\lambda_{i,j}$ the fraction of time allocated to client $i$ by BS $j$. Then





Part 1. From the water-fill definition we have



follows from Eqs. (26) and (27).

Part 2. Every BS can always increase its water-fill level by distributing its unused time resources across its clients. The property follows, since at equilibrium the water-fill levels cannot be further increased.

Part 3. We leverage


to derive property

as follows


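The property highlighted in the introduction for this theorem (the sum of the inverse water-fill levels equals the sum of the weights) can be checked with a short chain of equalities. The sketch below is a reconstruction under assumed notation ($\lambda_{i,j}$ for time fractions, $R_{i,j}$ for PHY rates, $r_i$ for total throughput, $w_i$ for weights, $\eta_j$ for water-fill levels) and is not necessarily the exact derivation used by the authors. It relies only on the water-fill condition that $r_i = w_i R_{i,j} \eta_j$ whenever $\lambda_{i,j} > 0$, on every client having positive throughput, and on each BS using all of its time at equilibrium.

```latex
\begin{align*}
\sum_{i=1}^{N} w_i
  &= \sum_{i=1}^{N} \frac{w_i}{r_i} \sum_{j=1}^{M} \lambda_{i,j} R_{i,j}
     && \text{(definition of } r_i \text{ in Eq. (1))} \\
  &= \sum_{j=1}^{M} \sum_{i=1}^{N} \lambda_{i,j} \, \frac{w_i R_{i,j}}{r_i}
     && \text{(swap the order of summation)} \\
  &= \sum_{j=1}^{M} \sum_{i:\,\lambda_{i,j} > 0} \lambda_{i,j} \, \frac{1}{\eta_j}
     && \text{(water-fill: } r_i = w_i R_{i,j} \eta_j \text{ if } \lambda_{i,j} > 0 \text{)} \\
  &= \sum_{j=1}^{M} \frac{1}{\eta_j}
     && \text{(each BS uses all its time: } \textstyle\sum_{i} \lambda_{i,j} = 1 \text{)}.
\end{align*}
```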
We next show that any equilibrium outcome of AFRA is globally optimal, i.e., it maximizes the global PF resource allocation problem.


Theorem 4. Consider an equilibrium outcome of AFRA. Then, the equilibrium outcome also maximizes the global PF resource allocation problem, i.e., it maximizes $\sum_{i} w_i \log(r_i)$ subject to the feasibility constraints in Eqs. (1)-(3).


Let and denote the throughput of client and water-fill level of BS at an equilibrium, respectively.

We prove that for any feasible selection of s (i.e., s that satisfy the feasibility conditions in Eqs. (2) and (3)) and the corresponding clients’ throughput values (i.e., s as defined in Eq. (1)) we have


Define . Eq. (29) can then be proved through the following inequalities by leveraging properties


from Theorem 3:


In our last theorem we prove that while the equilibrium throughput vector across all clients is unique, there could be infinitely many resource allocations that realize this outcome.


Theorem 5. Let $\mathbf{r} = (r_1, \dots, r_N)$ denote the vector of throughput rates across all clients at an equilibrium. Then, $\mathbf{r}$ is unique. However, there could be infinitely many resource allocations across the BSs that realize $\mathbf{r}$.


Proof. Part 1. We first prove that $\mathbf{r}$ is unique. Let $\mathbf{r}$ maximize the global proportional-fair resource allocation across all clients and assume $\mathbf{r}'$ is a different equilibrium. From Theorem 4, we know that every other equilibrium should also maximize the global PF resource allocation. This means that all inequalities in Eq. (30) should be equalities for any equilibrium, including $\mathbf{r}'$. Now, for the first inequality to be an equality (i.e., the Jensen inequality of Eq. (30)), the following condition needs to be satisfied [27]


Further, since , we conclude that


Part 2. To prove that there could be infinitely many resource allocations that realize , we provide an example. Consider a topology with two BSs (, ) and two clients (). Let , , and . Then, is maximized by the following time fractions for any .


Here, irrespective of , and .
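A two-BS, two-client instance of this non-uniqueness is easy to check numerically. The values below are illustrative assumptions and not necessarily those of the example above: all four PHY rates are set to the same rate and both weights are equal, in which case any split that gives client 1 a fraction x of BS 1 and 1 - x of BS 2 (and client 2 the complementary fractions) yields the same throughput vector. Because the total throughput cannot exceed twice the common rate and the weights are equal, the equal-throughput vector is the PF optimum in this assumed setup.

```python
def throughputs(x, rate=10.0):
    """Per-client throughput for a split x in a symmetric 2x2 topology.

    Assumed setup: two clients, two BSs, every PHY rate equal to `rate`.
    """
    lam = [
        [x, 1.0 - x],      # client 1: fractions of BS 1 and BS 2
        [1.0 - x, x],      # client 2: the complementary fractions
    ]
    return [sum(fraction * rate for fraction in row) for row in lam]


# Several different allocations, one throughput vector.
results = {x: throughputs(x) for x in (0.0, 0.25, 0.5, 0.9)}
for x, r in results.items():
    print(x, r)
```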

VI Performance Evaluation

Fig. 4: (a) We use two WARP boards to construct two BSs in our testbed. The BSs are connected to a server through Ethernet. The server runs a single fully-backlogged DL UDP iPerf session to each client. A sublayer implementation below the IP layer at the server selects the BS for each packet of every traffic flow. The clients (not shown in the photo) have access to both radios, and remain static and connected to both BSs throughout the experiments. (b) Cellular TDMA and WiFi MACs. The PHY header and ACKs are sent at a fixed transmission rate. Clients embed the throughput they receive from the other BS in their ACK packets. The MAC header and payload are transmitted at a variable transmission rate. We define the effective rate as the total number of payload bits divided by the total time it takes to successfully transmit a packet. We replace all PHY rates in AFRA with effective rates to derive the time fractions and determine the number of packets that should be served from each queue. (c) Total throughput across the two clients for four schemes: WiFi only (WiFi), Cellular only (Cellular), AGG-RR, and AFRA. AFRA achieves a higher average total throughput (29 Mbps vs 20 Mbps) and PF index (2.3 vs 1.97) compared to AGG-RR. (d) Per-client throughput values for both AFRA and AGG-RR.

In this section, we evaluate AFRA’s performance through experiments and simulations. First, we investigate the benefits of MAC-level traffic aggregation in a small testbed composed of four SDR (software-defined radio)-based nodes: two BSs and two clients. Next, we conduct simulations to evaluate AFRA’s equilibria properties as we scale the number of clients and BSs. Finally, we compare AFRA’s speed and over-the-air signaling overhead against DDNUM, a dual-decomposition-based algorithm that we derived from the NUM framework.

VI-A SDR-Based Implementation and Real-World Performance

Implementation. We construct a HetNet topology composed of a WiFi BS, a cellular BS, and two clients. The two BSs are physically separated from each other and are placed in an indoor lab environment (Fig. 4(a)). We use a WARP board [28] with 802.11a reference design as our WiFi BS. We use another WARP board with OFDM PHY (WARP OFDM reference design) and a custom TDMA (Time Division Multiple Access) MAC to mimic a cellular BS. We use two other WARP boards to construct our two clients. Each client has access to both WiFi and cellular radios, and remains static and connected to both BSs throughout the experiments.

A server running iPerf sessions is connected to both BSs through Ethernet. For each client, the server generates a single fully-backlogged UDP traffic flow with 500 byte packets. We implement a below-IP sublayer to split this traffic flow between the two BSs. This sublayer is responsible for selection of the BS to be used for each packet, and acts similar to the LWA Adaptation Protocol (LWAAP) in the LWA standard [2]. In our implementation, we sequentially iterate between the WiFi and cellular BSs to route the packets of each traffic flow.
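The per-packet BS selection described above can be sketched as follows. This is a minimal illustration and not the testbed's sublayer: the class name `RoundRobinSplitter`, the BS labels, and the `route` method are hypothetical, and the sketch models only the alternation between the two BSs for each traffic flow.

```python
from itertools import cycle


class RoundRobinSplitter:
    """Hypothetical below-IP splitter: alternates BSs per packet, per flow."""

    def __init__(self, base_stations=("wifi", "cellular")):
        self._base_stations = tuple(base_stations)
        self._next_bs = {}  # one independent iterator per client flow

    def route(self, client_id):
        """Return the BS that should carry the next packet of this flow."""
        if client_id not in self._next_bs:
            self._next_bs[client_id] = cycle(self._base_stations)
        return next(self._next_bs[client_id])


splitter = RoundRobinSplitter()
first_four = [splitter.route("client-1") for _ in range(4)]
print(first_four)
```

Keeping one iterator per flow means that packets of one client do not shift the alternation seen by the other client.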

AFRA, as presented in Section III-D, does not account for the various types of overhead (e.g., PHY/MAC headers, ACKs, idle slots, collisions) that exist in PHY/MAC protocols. To address this issue, we introduce the notion of effective rate and replace all rates in AFRA with effective rates. For a single packet, the effective rate can be calculated as the number of bits in the packet divided by the total time a BS takes to successfully transmit that packet (including all overhead). In our implementation, each BS keeps track of the total time spent in successfully transmitting the past 5 packets of each traffic flow (i.e., the past 5 packets of each client) to calculate its effective rate. Averaging over 5 packets accounts for channel fluctuations in our experiments, and the window can be adjusted based on client mobility.
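A minimal sketch of the effective-rate bookkeeping described above, assuming each BS records the payload size and the total transmission time (including overhead) of every successfully delivered packet. The class and method names are hypothetical; only the 5-packet window comes from the text.

```python
from collections import deque


class EffectiveRateEstimator:
    """Tracks the effective rate of one traffic flow over the last few packets."""

    def __init__(self, window=5):
        # Only the most recent `window` (bits, seconds) samples are kept.
        self._samples = deque(maxlen=window)

    def record(self, payload_bits, tx_seconds):
        """Store one successfully transmitted packet.

        tx_seconds is assumed to include all overhead (headers, ACK, retries).
        """
        self._samples.append((payload_bits, tx_seconds))

    def rate_bps(self):
        """Total payload bits divided by total transmission time in the window."""
        total_time = sum(t for _, t in self._samples)
        if total_time == 0:
            return 0.0
        total_bits = sum(b for b, _ in self._samples)
        return total_bits / total_time


estimator = EffectiveRateEstimator()
for _ in range(5):
    estimator.record(payload_bits=500 * 8, tx_seconds=0.001)  # 500-byte packets
print(estimator.rate_bps())
```

Dividing total bits by total time, rather than averaging per-packet rates, weights slow transmissions by the airtime they actually consumed.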

We implement the following mechanisms: (i) WiFi only: the cellular BS is off but the WiFi BS is active; (ii) Cellular only: the WiFi BS is off but the cellular BS is active; (iii) AGG-RR: this scheme uses aggregation but with a round robin (RR) scheduler at the WiFi BS and a conventional PF MAC at the cellular BS. With the RR scheduler, the WiFi BS maintains a different queue for each client and sequentially serves a single packet from each queue at every round. With the PF MAC, the cellular BS dedicates its time resources to each client according to Section III-C (single BS PF); (iv) AFRA: at every round, each BS uses its calculated time fractions to determine the number of packets that should be served from each queue in WiFi and the number of time slots that should be dedicated to each queue (client) in cellular. In our implementation, both clients’ are equal to 1 and the BSs update their time fractions every 5 ms.
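One way to turn per-client time fractions into a per-round schedule is sketched below. The text does not specify the mapping used in the testbed, so everything here is an assumption made for illustration: the function names, the largest-remainder rounding for TDMA slots, the `round_seconds` parameter, and the airtime-matching rule for WiFi packets.

```python
import math


def slots_per_round(time_fractions, total_slots):
    """Split TDMA slots among clients in proportion to their time fractions.

    Assumed rule: floor each share, then hand leftover slots to the clients
    with the largest fractional remainders so that all slots are used.
    """
    raw = [f * total_slots for f in time_fractions]
    slots = [math.floor(x) for x in raw]
    leftover = total_slots - sum(slots)
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - slots[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        slots[i] += 1
    return slots


def packets_per_round(time_fractions, effective_rates_bps, packet_bits,
                      round_seconds):
    """Packets each WiFi queue may send so its airtime fits its time fraction.

    A client with fraction f gets f * round_seconds of airtime, during which
    it can deliver floor(airtime * rate / packet_bits) packets.
    """
    return [math.floor(f * round_seconds * r / packet_bits)
            for f, r in zip(time_fractions, effective_rates_bps)]


print(slots_per_round([0.75, 0.25], total_slots=8))
print(packets_per_round([0.5, 0.5], [8e6, 4e6], packet_bits=4000,
                        round_seconds=0.005))
```

Under this assumed mapping, a client with a higher effective rate sends more packets for the same time fraction, which is the opposite of the one-packet-per-queue behavior of the RR scheduler.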

Fig. 5: AFRA’s performance evaluation results. Average number of steps to convergence as a function of the number of clients (a) and the number of BSs (b). Evolution of the potential function for two simulation scenarios, one with M=10, N=10 (c) and the other with M=10, N=20 (d). Each Run in these figures corresponds to a different simulation realization. In the priority curves (solid black curve with * markers), the BS with the highest local increase in the potential function gets priority in executing AFRA. Leveraging this policy reduces the average convergence time by more than 30%.

Performance Results. Fig. 4(c) shows the performance of the four schemes. In both the WiFi only and Cellular only options, only a single BS is active throughout the experiments. We observe that the Cellular only scheme provides a higher sum throughput than the WiFi only scheme. With careful evaluation of packet transmission traces, we discovered that this higher throughput is primarily due to the corresponding MAC protocols. In particular, the WiFi MAC provides the same transmission opportunity to each traffic flow (client). As a result, the client with the lower PHY rate occupies the channel for a longer duration than the other client. This decreases the throughput for both clients. In contrast, the cellular TDMA MAC provides the same transmission time to both clients (with 2 clients, single BS PF equally divides the time between the clients (Eq. 5)). As a result, the throughput of the client with the higher PHY rate does not drop because of the client with the lower PHY rate. This, along with other MAC issues such as WiFi contention, reduces the WiFi only throughput.
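The effect on sum throughput can be illustrated with an idealized two-client model. The PHY rates of 54 and 6 Mbps are hypothetical (they are not the testbed's rates), and the model ignores headers, ACKs, and contention; the function names are likewise illustrative.

```python
def equal_opportunity_throughput(phy_rates_mbps):
    """Each client sends one equal-size packet per round (WiFi-like sharing).

    Sending one bit to every client takes sum(1/R_i) time units, so every
    client receives the same throughput of 1 / sum(1/R_i).
    """
    time_per_bit_round = sum(1.0 / r for r in phy_rates_mbps)
    return [1.0 / time_per_bit_round for _ in phy_rates_mbps]


def equal_time_throughput(phy_rates_mbps):
    """Each client gets an equal share of airtime (TDMA with single-BS PF)."""
    share = 1.0 / len(phy_rates_mbps)
    return [share * r for r in phy_rates_mbps]


rates = [54.0, 6.0]  # hypothetical PHY rates in Mbps
print(sum(equal_opportunity_throughput(rates)))  # about 10.8 Mbps in total
print(sum(equal_time_throughput(rates)))         # about 30 Mbps in total
```

In this simplified model, the faster client drops from 27 Mbps under equal time to 5.4 Mbps under equal opportunity, and the sum throughput drops with it.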

Fig. 4(c) also shows that the two RAT aggregation schemes (AGG-RR and AFRA) successfully aggregate the WiFi and cellular capacities and provide a higher sum throughput than the WiFi only and Cellular only options. Further, AFRA increases the average total throughput by 45% over AGG-RR (from 20 to 29 Mbps), with per-client total throughput values of 18 and 11 Mbps (per-client throughput plots are shown in Fig. 4(d)). Let us define the proportional fairness index as $\mathrm{PF} = \sum_i \log_{10}(r_i)$, where $r_i$ is the total throughput of client $i$ across its RATs in Mbps. Then, the PF index of AFRA is 2.3. With AGG-RR, the per-client throughput rates drop to 12.5 and 7.5 Mbps, and the PF index reduces to 1.97. AGG-RR uses the conventional scheduling algorithm on each BS (i.e., RR in WiFi and single BS PF in cellular), which reduces both the sum throughput and the PF index.
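The reported numbers can be checked with a few lines of arithmetic. The sketch below assumes the PF index is the sum of base-10 logarithms of the per-client total throughputs in Mbps, which is the reading that reproduces the reported values of 2.3 and 1.97; the function name `pf_index` is illustrative.

```python
import math


def pf_index(throughputs_mbps):
    """Sum of base-10 logs of per-client total throughputs (in Mbps)."""
    return sum(math.log10(r) for r in throughputs_mbps)


afra = [18.0, 11.0]    # per-client totals reported for AFRA
agg_rr = [12.5, 7.5]   # per-client totals reported for AGG-RR

print(round(pf_index(afra), 1))     # 2.3
print(round(pf_index(agg_rr), 2))   # 1.97

# Relative gain in total throughput: (29 - 20) / 20 = 0.45, i.e., 45%.
gain = (sum(afra) - sum(agg_rr)) / sum(agg_rr)
print(gain)
```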

VI-B AFRA’s Equilibria Properties

Setup. We simulated network deployments with N clients and M BSs to evaluate AFRA’s equilibria properties as we scale the number of clients and BSs. All clients’ s are equal to 1. Half of the BSs are WiFi and the other half are cellular. Each client has access to 4 RATs, two WiFi and two cellular. The PHY rates for the WiFi and cellular RATs are randomly selected from the sets Mbps and Mbps, respectively. In each simulation realization, we randomly associate clients’ RATs with BSs. Next, we run AFRA until an equilibrium is reached. We set the discretization factor equal to 0.05, i.e., a BS adjusts its time fractions only if the increase in time fraction (i.e., ) at its client with minimum is greater than or equal to 0.05. For the initial allocation, each BS equally divides its time across its clients. Unless otherwise specified, each of our simulation points is an average of 100 simulation realizations.

AFRA’s Convergence Time. Figs. 5(a) and 5(b) depict the impact of the number of clients and BSs on AFRA’s convergence time. In each of these figures, we count the number of steps until convergence is reached. At each step, a single BS that needs to adjust its time fractions is randomly selected. In Fig. 5(a), we vary the number of clients from 10 to 100 and plot the corresponding convergence times for three different M values: 10, 20, and 50. We repeat this simulation with the roles of the N and M variables exchanged and plot the corresponding results in Fig. 5(b). From these two figures, we observe that the time to convergence is highest when the number of clients is between one and two times the number of BSs. As the ratio between the number of clients and BSs (i.e., N/M) leaves this range, the convergence time rapidly drops and then stabilizes. The results show that AFRA requires a small number of steps to reach an equilibrium.

Policies to Further Reduce AFRA’s Convergence Time. Our next goal is to design policies that can further reduce AFRA’s convergence time. To gain intuition on how to design such policies, we simulated a topology with 10 clients and 10 BSs and plotted the evolution of the potential function (i.e.,