Millimeter-wave (mmWave) communication fulfills the demand for multi-gigabit-per-second (Gbps) throughput and low-latency communication even in extremely dense networks, a demand that is hard to sustain with traditional communications operating at sub-6GHz frequencies. Despite its benefits, mmWave communication suffers from very high attenuation and dramatic penetration loss due to its high frequency. To compensate for this loss, directional transmissions are typically employed, which constrain the coverage of communication to a rather small area, e.g., to the line of sight in the extreme case. This limitation poses new challenges, in particular for guaranteeing efficient content dissemination in delay-sensitive multicast applications (e.g., raw sensory data broadcasting in vehicle-to-everything (V2X) communications to support autonomous driving, high-definition video broadcasting in a concert hall, and public-safety use cases).
Although multicast scheduling has been widely explored for networks operating at sub-6GHz frequencies, the specific benefits and challenges of mmWave multicast scheduling remain understudied. In particular, multicast scheduler designs for sub-6GHz communications assume the availability of omnidirectional transmission, so a source node can schedule a transmission to any arbitrary subset of receiving nodes within a certain range simultaneously. However, the restricted coverage of mmWave communication undermines this assumption and renders these designs inapplicable, opening a new research question.
One trivial design for mmWave multicast scheduling simply employs multiple directional unicast and/or multicast transmissions to sequentially serve all multicast nodes. The behavior of such a scheduler is illustrated in Fig. 0(a), where the source node (labeled as ) transmits sequentially in sectors to to serve multicast nodes , , , , and , respectively. This trivial design is extremely inefficient, and a straightforward improvement is to group beams based on adaptive beamforming [3, 4, 2]. As shown in Fig. 0(b), nodes that are closer to the source node (i.e., , , and ) are served together with a wider beam, while the farther nodes (i.e., , and ) and the nodes that are not in proximity to the other nodes (i.e., ) are served with narrower beams. Although the adaptive method provides higher flexibility in grouping the receiving nodes, it comes at the expense of more complex beamforming and a costly antenna architecture.
The above designs rely only on single-hop transmissions, which can be problematic in many practical scenarios. More specifically, there might exist nodes that are not reachable by the source, or nodes that cannot sustain high transmission rates due to large distance (i.e., node in sector ) or the presence of blockages [5, 6]. In such cases, relay-aided transmission is inevitable to ensure reachability and guarantee high-performance multicasting (in terms of throughput and delay). With relaying enabled, a node can serve as a transmitting node as soon as it receives the data from another node. As shown in Fig. 0(c), upon receiving data from node in the first time slot, nodes , and act as the relay nodes for node , and , respectively. With this flexibility, we can break down a low-rate multicast transmission into a combination of multiple high-rate unicast and/or multicast transmissions that can be scheduled separately. Interestingly, the limited coverage of directional mmWave transmissions then becomes an advantage: the interference among concurrent (unicast or multicast) transmissions is significantly reduced, which significantly increases the spatial sharing gain; in Fig. 0(c), links , , and occur simultaneously.
We believe that achieving the optimal performance of mmWave systems requires jointly exploiting both of these properties of mmWave communication, namely relaying and spatial sharing. Thus far, existing works have considered each aspect individually, but never jointly. This motivates us to design new mmWave multicast scheduling algorithms integrating both relaying and spatial sharing. Unsurprisingly, the joint optimization is complicated, and the specific challenge resides in designing efficient communication group composition and spatial sharing scheduling. With both spatial and temporal factors involved, the relay nodes have to be determined gradually, and the source and the relay nodes have to carefully select their target nodes depending on how the communication will affect the total completion time. The situation becomes even worse when only limited knowledge about the behavior of the other nodes with concurrent transmissions is available.
To address these challenges, we provide a comprehensive model and an integer linear program (ILP) to characterize the problem, with the objective of minimizing the multicast completion time (i.e., the time required for all nodes to receive the intended data). The ILP aims to find the optimal scheduling policy that determines the transmitting nodes and their corresponding receivers at each time slot. Specifically, it jointly minimizes the duration of each time slot accounting for all concurrent transmissions while selecting the optimal relay node. (In mmWave communication systems, spatial sharing is also commonly referred to as concurrent transmission; in this paper, these terms are used interchangeably.) Exploiting spatial sharing in the relay-aided multicast transmission requires careful scheduling, both spatially and temporally, which is usually not of concern in conventional multicast. Hence, the problem formulation for directional multicasting is significantly different from, and inherently more complicated than, that of conventional multicast scheduling in the literature. Ultimately, solving the ILP provides a tight lower bound for the multicast completion time in a mmWave network leveraging both relaying and spatial sharing gains.
To account for the deployment in real-world scenarios in equipment with computational power constraints and to ensure scalability, we further present a lightweight distributed algorithm, namely mmDiMu. The high-level idea is to exploit concurrency by allowing each transmitting node to autonomously decide and transmit to its target node(s), regardless of the other concurrent transmissions in the network. The set of target nodes for each transmitting node is determined based on the physical distance of nodes and is updated after every transmission time slot.
The following summarizes the contributions of this paper:
We identify the challenges and opportunities in mmWave multicast scheduling and provide an ILP formulation that finds the optimal scheduling policy by jointly leveraging relaying and spatial sharing gains.
Due to the exponential complexity of the ILP-based solution (namely ILP), we propose mmDiMu, a scalable distributed mmWave multicast scheduling heuristic. This lightweight algorithm has significantly lower complexity and is more practical than ILP.
We perform extensive simulations to validate the performance of our algorithms in both low- and high-density networks. As expected, ILP demonstrates a substantial gain in completion time compared to all other algorithms. While there is a slight gap between the mmDiMu and ILP solutions, mmDiMu significantly improves over the existing algorithms, i.e., FHOMB and OMS for sub-6GHz and the adaptive beamwidth algorithm (i.e., Adapt) for mmWave, which to the best of our knowledge represent the state of the art.
We evaluate interference imposed on unintended receivers by the proposed algorithm and show that the impact of interference is marginal even for high-density scenarios.
We also provide insights on the design of a mmWave multicast system and design guidelines that depend on the network’s density and system configuration.
The rest of this paper is organized as follows. In Section II, we present the state of the art for multicast scheduling algorithms. Section III includes a description of the system model and its problem formulation. The optimal solution (i.e., based on ILP) is presented in Section IV-A, and Section IV-B presents a lightweight heuristic. The performance evaluation is presented in Section V. In Section VI, we discuss other important aspects to design mmWave multicasting and Section VII concludes our paper.
II Related Work
As a key technology for beyond-5G networks, mmWave has been considered for many emerging applications (e.g., autonomous driving, public safety, and mobile video streaming) that typically require the distribution of large volumes of data with low latency. Unfortunately, directional mmWave links suffer from limited coverage, which complicates multicasting. Most existing works on mmWave focus on unicast transmissions, so the challenges and benefits of mmWave multicast remain understudied. In this section, we present the state of the art of multicast techniques for both sub-6GHz and mmWave networks and differentiate them from our proposed approach.
II-A Sub-6GHz multicasting
The most basic type of multicasting is broadcast, in which all nodes are served simultaneously. In this case, the transmit rate is limited by the node with the worst channel quality. Improving over this basic technique, many opportunistic multicast techniques are proposed in [9, 10, 11] and the references therein. These techniques exploit multiuser diversity by opportunistically transmitting to an arbitrary subset of the nodes with better instantaneous channel quality. As a result, they outperform the broadcast scheme and achieve higher throughput. However, this technique still suffers from poor performance when the network has nodes located at its edge. In the extreme case (i.e., when many nodes are located at the edge), it performs similarly to a broadcast scheme.
To overcome the above issue, the research community has explored multicast beamforming. Multicast beamforming focuses the transmit signal power in only one direction of interest by adjusting the antenna gains. As a result, it improves the signal-to-noise ratio (SNR) of the nodes in that direction. One of the first works on improving the system throughput with this technique first uses omnidirectional multicast to transmit to nodes with better channel quality and then uses directional multicast to transmit sequentially to the remaining nodes. To further improve the system performance, a better method applies beamforming weights at the antenna that maximize the worst SNR, at the expense of degrading the SNR of other nodes (i.e., the nodes that are located closer to the transmitter). Many research works demonstrate that this technique yields a high system throughput [8, 13, 14] and minimizes completion time [15, 16, 7].
The aforementioned works mainly focus on scheduling a subset of nodes in a system to achieve the intended goal, where neither coverage nor blockage is an issue. Specifically, a source node can simultaneously transmit to any arbitrary subset or even all nodes if desired. Nevertheless, operating at high frequency, mmWave communications are prone to extremely high attenuation and penetration loss. Furthermore, the use of directional transmission (which only covers a small angular area) makes it impossible to serve arbitrary nodes in the system simultaneously. As a result, the multicasting techniques designed for sub-6GHz communication yield suboptimal performance for mmWave communication. To shed light on this aspect, we specifically benchmark the performance of our proposed algorithms against two seminal multicast schedulers used in sub-6GHz systems (i.e., in [8, 7]) in Section V.
II-B mmWave multicasting
An initial work addressing the need to redesign mmWave multicast scheduling emphasizes the use of adaptive beamwidth to improve the grouping of the multicast nodes and achieve higher throughput. A similar work investigates the trade-off between transmission beamwidth and achievable SNR to ensure high throughput. These schedulers may require a high level of beamwidth adaptation to form arbitrary beams that cover the multicast nodes, which increases the complexity and the cost of the antenna design. In contrast, a practical IEEE 802.11ad-compliant approach with highly reduced complexity applies a codebook-based scheduler with one radio frequency (RF) chain.
All of the above-mentioned works consider only single-hop multicasting without spatial sharing, in which the multicast transmission rate remains limited by the nodes located farthest from the source node. Later works consider the benefits of relaying and spatial sharing separately, to improve the multicast rate and the spectral efficiency, respectively. The relay-based work exploits relaying only to overcome non-line-of-sight paths, but not for performance optimization. The spatial-sharing work enables the simultaneous transmission of single-hop unicast and multicast sessions to increase network efficiency.
To sum up, all the works mentioned above consider either multi-hop relaying or optimal spatial sharing, but not both jointly. To the best of our knowledge, we are the first to jointly consider both to minimize the data delivery time for mmWave multicast communications.
III System Model and Assumptions
We consider a mmWave network composed of randomly distributed nodes denoted by set , where node represents the source and the other nodes are interested in receiving data of size from the source. We assume relaying is enabled in the network, meaning that all the nodes, once receiving the data, can transmit the data to other nodes. We consider a time slotted system where the number of time slots for multicasting the data is denoted by variable , and the set of time slots is given by . The length of each time slot is not necessarily equal, but we ensure that transmissions happen only within one-hop at each time slot. To exploit spatial sharing, multiple concurrent transmissions can coexist at each time slot.
We call a node that transmits data to other node(s) a parent node (PN), and we denote by the set of PNs at time slot . Conversely, a node that receives data is called a child node (CN), and we denote by the set of CNs of PN at time slot . A node can serve as a PN in multiple time slots, and the data has to be completely delivered to all its CNs in each of these time slots. Therefore, we have for any and . Each node in the network has a fixed transmit power and equal-width orthogonal lobes numbered counterclockwise starting from , denoted by . For each node , we denote by the set of nodes that are within the coverage of lobe of the node. For example, we have and in Fig. 2. Note that, as the lobes are orthogonal, a node can activate more than one lobe simultaneously.
We adopt a path-loss model used in  (detailed in Section V-A), and the received rate is computed using the Shannon capacity model from . We denote by the SNR of the signal received at CN , transmitted from PN . A node is called a target node (TN) of a PN if its received signal has the lowest SNR among the CNs within the same lobe of the PN. Nodes with an SNR worse than that of the TN are assumed to be unable to decode the message transmitted by the PN. Note that there is at most one TN in each lobe for a PN. We denote by the set of all TNs of PN at time slot . Given , the set can be formally defined as
Note that means that node does not transmit at time slot and means that node steers its beam towards all directions, where gives the cardinality of a set. Since the TNs experience the worst channel conditions in comparison to other CNs within the same lobe, the maximum rate which determines the transmission time of a PN depends on the SNR of the set of its TNs . Given the TN set of a PN , finding the optimal transmitting rate for the PN is as discussed in . Our focus is on obtaining the optimal for each node at each time slot . Note that activating more lobes simultaneously results in lower transmission rate. Let be the optimal transmit rate. The time required for PN to complete the data transmissions to all its TNs (including its CNs) at time slot is given by
At each time slot, multiple PNs can transmit simultaneously, exploiting spatial sharing. As a result, the duration of a time slot is determined by the longest transmission in that time slot, that is,
Our objective in this work is to minimize the total duration of all the time slots in , namely multicast completion time, by jointly minimizing and , and to determine the set of PNs and their corresponding CNs in each time slot. The completion time can be expressed by
The following constraints should be considered. First, all nodes have to receive the data within time slots, i.e.,
Then, a node can only transmit data to other nodes if it has already received the data, i.e.,
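The relationship between per-transmission time, slot duration, and completion time described in this section can be illustrated with a minimal sketch. The schedule representation (a list of slots, each mapping a PN to a rate and a set of served nodes), the function name, and all numbers below are hypothetical and chosen only for illustration; the sketch assumes each transmission lasts the data size divided by the PN's rate, each slot lasts as long as its longest transmission, and the completion time is the sum of the slot durations. It also checks the two constraints stated above.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: one dict per time slot,
# mapping PN id -> (transmit rate in bit/s, set of nodes served by that PN).
Slot = Dict[int, Tuple[float, Set[int]]]


def completion_time(schedule: List[Slot], data_bits: float,
                    source: int, all_nodes: Set[int]) -> float:
    """Sum of slot durations; each slot lasts as long as its slowest PN."""
    have_data = {source}
    total = 0.0
    for slot in schedule:
        if not slot:
            continue  # an empty slot contributes no time
        # Constraint: a node may transmit only if it already holds the data.
        for pn in slot:
            if pn not in have_data:
                raise ValueError(f"node {pn} transmits before receiving the data")
        # Slot duration = longest concurrent transmission in this slot.
        total += max(data_bits / rate for rate, _ in slot.values())
        # Receivers become eligible PNs only from the next slot on (one hop per slot).
        for _, served in slot.values():
            have_data |= served
    # Constraint: every node must have received the data by the last slot.
    if have_data != all_nodes:
        raise ValueError("some nodes never receive the data")
    return total


# Illustrative numbers only: the source serves nodes 1 and 2, which then relay concurrently.
example = [
    {0: (2e9, {1, 2})},
    {1: (4e9, {3}), 2: (1e9, {4})},
]
total_time = completion_time(example, data_bits=1e9, source=0, all_nodes={0, 1, 2, 3, 4})
```

In this example the second slot lasts as long as the slower of the two concurrent relay transmissions, which is why choosing which PN serves which node affects the total completion time.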
IV Proposed Approaches
In this section, we describe our solutions to the min-time mmWave multicast scheduling problem. We first provide an ILP formulation that gives an optimal schedule, and then we propose a more scalable distributed algorithm.
IV-A Optimum Solution by ILP
(target vector of PN in time slot ): a binary vector in which is the transpose operator and if node is a TN of PN in time slot . For example, in Fig. 2, nodes and are the TNs of the source in the first time slot, and hence the target vector is . There are possible combinations for a target vector for each PN.
(target matrix): a binary matrix of size . Each of the columns of represents a possible choice for a target vector, where is a column of . In fact, is independent of the nodes, and it shows the state-space of the target vector . Precisely, where is a binary vector. We form by filling , , via the reverse (rev) of the -bit decimal-to-binary (dec2bin) conversion of the index . For instance, and in Fig. 2, based on the definition of TN in (7), we have .
(PN vector of PN in time slot ): a binary vector and . If node is a PN at time slot , then , otherwise, . Precisely, if PN chooses the -th column of as its target vector. Given , the TNs of PN are obtained by
(observation matrix): is a binary matrix of size , defined for every . For each node , indicates with which lobe can node cover the other nodes using a single-hop transmission. Precisely, if node is within lobe of node . For network in Fig. 2, we have
(CNs matrix): a binary matrix of size that shows if a PN transmits to its TNs, which of the other nodes fall within the coverage area of the PN. While represents the set of TNs of PN , defined in (7), corresponds to the set of CNs of PN , which can also be served given the TNs in . Let be the target vector of PN corresponding to the -th column of , then the elements of the -th column of , which are equal to 1, represent all the nodes which can be served by such a target vector. To clarify, let node be a target node of PN given corresponding to the -th column of . Based on the definition, since node as the TN of PN is always in , then, . Further, we have if , . Based on item (ii), for the source node in Fig. 2, we have which corresponds to . Given the PN vector , we denote all the CNs, covered by PN , by a binary vector where if node is covered by PN at time slot . is thus obtained by
(transmission duration): , a real-valued vector . If a PN chooses the -th column of as its target vector , then, shows the duration of transmission defined in (2).
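As an illustration of the target-matrix construction described above, the following minimal sketch enumerates all binary vectors of a given length as matrix columns using the reversed decimal-to-binary conversion of the column index, and then picks one column with a one-hot selection vector. The function names, the zero-based column indexing, and the vector length used in the example are assumptions made for illustration; the paper's exact dimensions and index conventions are not recoverable from this text.

```python
from typing import List


def target_matrix(n_bits: int) -> List[List[int]]:
    """Rows = vector entries, columns = all 2**n_bits candidate target vectors.

    Column k holds the reversed n_bits-bit binary expansion of k
    (zero-based k is an assumption of this sketch).
    """
    columns = []
    for k in range(2 ** n_bits):
        bits = format(k, f"0{n_bits}b")                    # dec2bin, MSB first
        columns.append([int(b) for b in reversed(bits)])   # rev(...): LSB first
    # Transpose the list of columns into row-major form: matrix[row][col].
    return [[col[r] for col in columns] for r in range(n_bits)]


def select_target_vector(matrix: List[List[int]], choice: List[int]) -> List[int]:
    """Matrix-vector product with a 0/1 selection vector.

    A one-hot `choice` returns the chosen column; an all-zero `choice`
    returns the zero vector, i.e. the node does not transmit.
    """
    if sum(choice) > 1:
        raise ValueError("a node may choose at most one target vector per slot")
    return [sum(a * x for a, x in zip(row, choice)) for row in matrix]


A = target_matrix(2)
chosen = select_target_vector(A, [0, 0, 1, 0])   # pick column index 2
silent = select_target_vector(A, [0, 0, 0, 0])   # node stays silent
```

Because the number of columns doubles with every additional entry of the target vector, the size of the selection vector, and with it the number of binary decision variables, grows exponentially.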
Matrices can be calculated given the distribution of nodes in the network, while are to be found by the ILP. Using these terms, the ILP formulation is provided as follows.
As mentioned, in (10) is the decision variable, which determines the TNs of PN as in (7). (10b) expresses that the source node must transmit at , but not the other nodes. In the following time slots, any of the nodes in could be a PN given that it has received the data in any previous time slot ; the constraint in (10c) indicates this. Finally, (10d) guarantees that the number of TNs in a lobe is at most one.
Regarding complexity, the ILP is NP-hard, since a special case of the problem has been shown to be NP-hard. In practice, its running time depends on the number of integer variables. Our proposed ILP has variables, and thus a complexity of , which increases exponentially with . Consequently, the ILP can only be solved for very small problem instances (i.e., small ). For this reason, in the next section, we design a practical, lower-complexity heuristic.
IV-B Distributed Multicast Scheduling
Our distributed multicast scheduling heuristic, namely mmDiMu, accounts for both relaying and spatial sharing. By having each PN decide autonomously which CNs to transmit to, mmDiMu is scalable and distributed in nature, as opposed to the centralized ILP solution. The pseudocode of the algorithm is shown in Algorithm 1. In what follows, we elaborate on the details of the algorithm.
We use to denote the set of waiting nodes that have not received the intended data. Initially, i.e., at the first time slot, node is the only PN in set , and we have . We use to denote the distance matrix, where represents the distance between nodes and . At each of the following time slots , we select for each node the PN in with the least distance . If a node is equidistant from two or more PNs, it randomly selects one of them. After this process, for each node we obtain its CN set at this time slot, and we apply the opportunistic multicast scheduling that maximizes the sum throughput to select the set of nodes from for PN to transmit to. The intuition is that maximizing the achievable rate of each transmission session minimizes the session transmission time and thus reduces the completion time. Once it receives the data, a node is removed from the set and added to the PN set . The above process is repeated until all nodes receive the data. In each time slot, the time for each transmission is recorded as , where is the optimal transmit rate. The multicast completion time can thus be calculated as .
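The procedure described above can be sketched as follows. This is a simplified illustration, not the paper's Algorithm 1: it assumes 2D node positions, ignores lobes, beamwidth gain, and interference, and takes a caller-supplied, strictly positive `rate_fn(distance)` as a stand-in for the SNR-based rate model. The opportunistic selection is approximated by serving the k nearest assigned nodes, with k chosen to maximize k times the rate of the farthest served node. All names are hypothetical.

```python
import math
import random
from typing import Callable, Dict, List, Optional, Tuple


def oms_prefix(dists: List[float], rate_fn: Callable[[float], float]) -> int:
    """Number k of nearest nodes to serve so that k * rate(worst node) is maximal."""
    best_k, best_tp = 1, -1.0
    for k in range(1, len(dists) + 1):
        throughput = k * rate_fn(dists[k - 1])  # rate is set by the farthest served node
        if throughput > best_tp:
            best_k, best_tp = k, throughput
    return best_k


def mmdimu_sketch(pos: Dict[int, Tuple[float, float]], source: int, data_bits: float,
                  rate_fn: Callable[[float], float],
                  rng: Optional[random.Random] = None) -> List[float]:
    """Return the list of slot durations; their sum is the completion time."""
    rng = rng or random.Random(0)
    parents = {source}
    waiting = set(pos) - parents
    slot_durations = []
    while waiting:
        # Every waiting node attaches to its nearest PN (random tie-break).
        assigned: Dict[int, List[Tuple[float, int]]] = {}
        for node in sorted(waiting):
            dist = {p: math.dist(pos[node], pos[p]) for p in parents}
            d_min = min(dist.values())
            pn = rng.choice(sorted(p for p in dist if dist[p] == d_min))
            assigned.setdefault(pn, []).append((d_min, node))
        served, durations = [], []
        # Each PN decides on its own, without looking at the other PNs.
        for pn, candidates in assigned.items():
            candidates.sort()
            k = oms_prefix([d for d, _ in candidates], rate_fn)
            durations.append(data_bits / rate_fn(candidates[k - 1][0]))
            served.extend(node for _, node in candidates[:k])
        slot_durations.append(max(durations))  # slot ends with its slowest PN
        waiting -= set(served)
        parents |= set(served)
    return slot_durations


# Illustrative setup: three collinear nodes and a two-level rate (8 bit/s near, 1 bit/s far).
durations = mmdimu_sketch({0: (0, 0), 1: (2, 0), 2: (4, 0)}, source=0, data_bits=8.0,
                          rate_fn=lambda d: 8.0 if d <= 2 else 1.0)
```

In this toy setup the far node is served through the near node in a second slot, so the total time is two short slots rather than one long low-rate transmission from the source.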
V Performance Evaluation
This section compares the performance of the baseline algorithms with that of our proposed multicast algorithms.
V-A Simulation Setup
We consider nodes that are uniformly and randomly distributed within a mm area, with the source node (i.e., PN ) located at the center. We adopt the mmWave path-loss model in , which is written as,
where is the distance between the PN and CN , is the carrier frequency, and in dB. The received rate is computed using the Shannon capacity model in . Table I summarizes the parameter values used in the simulator.
|Parameter|Value|
|Free space path loss ()||
|Carrier frequency ()|GHz|
|System bandwidth ()|GHz|
|Transmit power|dBm|
|Path loss exponent ()||
|Standard deviation ()||
|Shannon capacity ()||
|Maximum spectral efficiency||
|Frame size ()||
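To make the link between the parameters in Table I and the received rate concrete, the sketch below uses a common close-in reference-distance path-loss model (free-space loss at 1 m, plus a distance term scaled by the path-loss exponent, plus an optional shadowing term) and a Shannon rate capped by a maximum spectral efficiency. The exact form of the paper's model is not recoverable from this text, so the formulas, the thermal-noise floor of -174 dBm/Hz, the function names, and every numeric value in the example are assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def path_loss_db(distance_m: float, carrier_hz: float, exponent: float,
                 shadowing_db: float = 0.0) -> float:
    """Close-in reference model: FSPL at 1 m + 10*n*log10(d) + shadowing (all in dB)."""
    fspl_1m = 20 * math.log10(4 * math.pi * carrier_hz / SPEED_OF_LIGHT)
    return fspl_1m + 10 * exponent * math.log10(distance_m) + shadowing_db


def snr_db(tx_power_dbm: float, antenna_gain_db: float, loss_db: float,
           bandwidth_hz: float, noise_figure_db: float = 0.0) -> float:
    """Received SNR, assuming a thermal noise density of -174 dBm/Hz."""
    noise_dbm = -174 + 10 * math.log10(bandwidth_hz) + noise_figure_db
    return tx_power_dbm + antenna_gain_db - loss_db - noise_dbm


def rate_bps(snr_db_value: float, bandwidth_hz: float, max_spectral_eff: float) -> float:
    """Shannon rate, capped at a maximum spectral efficiency (bit/s/Hz)."""
    spectral_eff = math.log2(1 + 10 ** (snr_db_value / 10))
    return bandwidth_hz * min(spectral_eff, max_spectral_eff)


# Placeholder values, not the paper's settings.
loss = path_loss_db(distance_m=10.0, carrier_hz=60e9, exponent=2.0)
snr = snr_db(tx_power_dbm=20.0, antenna_gain_db=10.0, loss_db=loss, bandwidth_hz=1e9)
rate = rate_bps(snr, bandwidth_hz=1e9, max_spectral_eff=4.0)
```

The cap on spectral efficiency means that nodes very close to the transmitter all receive at the same peak rate, while the rate of distant nodes falls off with the path loss.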
V-B Benchmarked Algorithms
This subsection highlights the different algorithms used in the performance comparison.
ILP. This is based on solving the ILP presented in Section IV-A. It selects the transmissions at each time slot that globally maximize the spatial sharing gain while achieving the minimum completion time . Therefore, it provides the lower bound for .
We solve the ILP by employing Gurobi (http://www.gurobi.com/) along with CVX (http://cvxr.com/) in the MATLAB environment.
mmDiMu. This is our distributed algorithm that considers both relaying and spatial sharing. While suboptimal, mmDiMu scales well regardless of the network density. The detail of the algorithm is as presented in Section IV-B. Unlike ILP, mmDiMu uses a distributed approach, in which each PN makes the transmission decision autonomously.
OMS. This algorithm is a sub-category of multicast scheduling with adaptive beamwidth. It provides optimal performance for multicast applications in conventional networks, capitalizing on the opportunistic gain. Essentially, OMS sorts the nodes according to their channel SNR and serves the subset of nodes that maximizes the instantaneous sum throughput.
FHOMB. Finite horizon opportunistic multicast beamforming (FHOMB) is designed specifically to minimize the completion time when sending a finite number of packets to multicast receivers. At each time slot, a subset of nodes is selected such that the estimated completion time is minimized. The estimated completion time is obtained by maximizing the minimum rate using a multi-lobe beam; this beam multicasts (usually at a low broadcast rate) to the remaining receivers.
Adapt. This is a scalable heuristic that groups the multicast nodes into subgroups using a hierarchical structure to construct the multicast tree. An example schedule is depicted in Fig. 0(b). Once the subgroups/beams are determined, the source node serves each multicast subgroup sequentially through the beams; the transmit rate of each beam is thus limited by the node with the lowest SNR within that beam.
V-C Evaluation Settings
To evaluate the performance of each algorithm, we examine the impact of two main parameters: (1) the number of nodes and (2) the beamwidth at the transceivers. Due to the high complexity of ILP, i.e., , is restricted to in scenarios where ILP is involved for comparison. The rest of the algorithms are evaluated for up to . We evaluate the performance for transmitter beamwidth . Note that the transmit beamwidth has an impact on the transmission gain , which we account for in the computation of the receiving rate. Unless mentioned otherwise, we assume that the receiver uses a quasi-omnidirectional mode for receiving.
To ensure a fair performance comparison, all algorithms use the same simulation settings. The minimum beamwidth is determined by in each simulation scenario in Section V-D, and the beamwidth resolution is thus a multiple of for all the algorithms except Adapt. Since Adapt operates by adapting its beamwidth to the multicast group, it can freely adjust its beamwidth as long as the minimum beamwidth is . For instance, when the simulation has a setting of , Adapt could have any beamwidth between and , while the other algorithms could only have beamwidths that are a multiple of , i.e., .
We implemented all the algorithms in MATLAB and conducted the comparisons using the above settings. For each data point, we average the data over simulation runs and compute the corresponding confidence interval.
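The averaging step can be illustrated with a short sketch that computes the sample mean of the completion times from repeated runs together with the half-width of a confidence interval. The confidence level is not recoverable from this text, so the 95% level (normal approximation, z = 1.96), the function name, and the sample values are assumptions.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_with_ci(samples: Sequence[float], z: float = 1.96) -> Tuple[float, float]:
    """Return (mean, half-width of the confidence interval).

    Uses the normal approximation z * s / sqrt(n); z = 1.96 corresponds
    to an assumed 95% confidence level.
    """
    mean = statistics.mean(samples)
    if len(samples) < 2:
        return mean, 0.0  # no spread can be estimated from a single run
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


# Placeholder completion times from five hypothetical runs.
avg, half = mean_with_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```

For a small number of runs, a Student-t quantile would be a more conservative choice than the fixed z value used here.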
V-D Simulation Results
As defined in (4) in Section III, the completion time is the time required for all network nodes to receive the multicast data (obtained by summing the durations of all time slots). Specifically, it is the time at which the last multicast node receives its data.
V-D1 Impact of the number of nodes
Here, we evaluate the impact of different , , on the completion time by fixing the transceiver beamwidth .
As a general trend, Fig. 3 shows that increasing the number of nodes also increases the completion time . When is large, the number of multicast slots required to transmit to all the nodes increases as well. ILP performs best as it picks the best policy which results in minimum , as formulated in (4). It indeed only requires , , and of the multicast completion time required by OMS, FHOMB, and Adapt, respectively, for . Specifically, ILP achieves a reduction in completion time by up to as compared to the other algorithms. Our proposed algorithm mmDiMu also demonstrates a high gain in completion time. It achieves completion time reduction of up to , , and over OMS, FHOMB, and Adapt, respectively.
Interestingly, while OMS performs well in conventional single-hop systems, it performs slightly worse than Adapt as increases. As increases, so does the SNR diversity of the nodes. In such a case, OMS first opportunistically transmits to the nodes that have higher SNR, which initially excludes the nodes with low SNR. As a result, it suffers from a low transmit rate later on, because it still has to serve the remaining nodes that have lower SNR. Unlike OMS, Adapt groups the nodes based on their angular positions and then divides the group to minimize the transmission time, forming a binary tree structure. Therefore, it avoids the suboptimality that comes from greedily scheduling the nodes with better SNR. On the other hand, OMS performs better than Adapt for smaller because the probability of having nodes at the edge is much smaller. Furthermore, OMS may use more than one (disjoint) beam to serve all the nodes, while this option is unavailable in Adapt. Therefore, a sparse distribution of nodes, which mostly occurs when the node density is low (i.e., small ), harms the performance of Adapt.
Similarly, FHOMB, which performs well in single-hop multicasting, performs poorly here. In FHOMB, a node receives the complete frame over multiple fixed-length time slots. At each slot, the policy (i.e., the subset of nodes to transmit to) that gives the lowest estimated completion time (up to the time all nodes have received the frame) is chosen. As mentioned, to determine the estimated completion time, the remaining nodes are served with broadcast. In mmWave networks, broadcasting in all directions results in a very low transmission rate. Therefore, the estimated completion time is significantly longer than a slot time. Here, a lower estimated time is favored since it provides a lower total transmission time. In most cases, this comes at the expense of a long slot duration . As seen in Fig. 3, this results in a high completion time.
As expected, mmDiMu performs worse than ILP because each PN schedules its transmission autonomously, disregarding the decisions made by other PNs in the system. Consider the scenario in Fig. 2 and the corresponding schedule in Table II. The completion time of ILP is s lower than that of mmDiMu. Since mmDiMu sorts the nodes according to their SNR, the parent for and is , and is served first. This results in s. However, ILP is aware that scheduling node first results in the optimal completion time. As increases, the occurrence of this event increases as well. This is reflected in the higher gain of ILP for larger .
Remark: The low complexity mmDiMu only requires additional completion time, in the worst case , as compared to ILP. Nevertheless, this additional time is significantly lower than that required by other algorithms.
V-D2 The importance of joint relaying and spatial sharing
The substantial gain in completion time demonstrated by our proposed algorithms (i.e., ILP and mmDiMu) emphasizes the importance of leveraging the relaying and spatial sharing gains jointly in mmWave multicast networks. To shed light on this aspect, Fig. 4 and Fig. 5 depict the number and ratio, respectively, of the relay and concurrent transmissions for ILP and mmDiMu. A transmission is a relay transmission if the transmitter is not the source node. A transmission pair is defined as a concurrent transmission if there is more than one transmission within the same time slot. For instance, in Table II, the number of relay transmissions is (i.e., and ), and the number of concurrent transmissions is (i.e.,