The broadcast nature of the wireless medium makes multicasting an efficient point-to-multipoint communication mechanism for delivering the same content concurrently to multiple interested users or devices. Recently, multicast services have attracted increasing interest in cellular networks due to emerging applications such as live video streaming, venue casting, proactive multimedia content pushing, software updates, and public group communications. In conventional cellular networks, multicast services are allocated time or frequency resources different from those allocated to unicast services and adopt single-frequency network (SFN) transmission, as in the 3GPP specifications known as LTE-multicast. However, such an orthogonal resource sharing and transmission scheme has low spectrum efficiency and can significantly degrade the performance of existing unicast services. Techniques that allow cellular networks to carry multicast and unicast services jointly in a more spectrum-efficient way are therefore highly desirable. There are also many practical scenarios where a user needs to receive both multicast and unicast signals at the same time. For example, a network operator may want to offer multicast services such as proactive content pushing, automatic software updates, and public group announcements to its subscribers without interrupting their ongoing unicast services. Content providers can also embed personalized information (e.g., preferred subtitles and targeted advertisements) via unicast transmission alongside multicast-based video streaming.
I-A Related Works
To address the need for joint multicast and unicast transmission in cellular networks, several research efforts have been made. One possible way is to use MIMO spatial multiplexing, where all the multicast and unicast messages are transmitted with different beamformers and each is decoded at its desired receiver by treating all other signals as noise [4, 5, 6, 7]. The authors in  studied adaptive beamforming for the coexistence of multicast and unicast services in a multi-user multi-carrier system. The authors in  introduced a joint beamforming and broadcasting technique, which exploits the surplus of spatial degrees of freedom in massive MIMO systems. Its main idea is to broadcast a common message to users whose channel state information (CSI) is unavailable and to beamform unicast messages to users whose CSI is available. The authors in  introduced a content-centric beamforming design for content delivery in a cache-enabled radio access network. It includes the joint multicast and unicast beamforming problem as a special case in which some users request the same content and others request distinct contents. The authors in  studied energy-efficient joint transmit and receive beamforming in a multi-cell multi-user MIMO system, where the users can receive unicast messages in addition to the group-specific multicast messages at the same time. Different messages are separated in the spatial domain at the users, which are equipped with multiple receive antennas.
Instead of using spatial multiplexing, another way is to adopt superposition coding to deliver both multicast and unicast services simultaneously. Each receiver decodes its desired multicast and unicast messages successively by using successive interference cancellation (SIC)-based multi-user detection [8, 9, 10, 11]. More specifically, the scheduling and resource sharing problem for the superposition of broadcast and unicast in wireless cellular systems is studied in . The MIMO beamforming problem in a simple case with only two users (i.e., near and far) is studied in [9, 10]. The performance of joint multicast and unicast transmission with partial CSI is studied in . A more general scenario is considered in  for a multi-cell network, where each base station (BS) sends multiple independent multicast messages and each user can decode an arbitrary subset of these multicast messages from all BSs using successive group decoding.
Recently, layered-division multiplexing (LDM), a form of non-orthogonal multiplexing technology, has been introduced in cellular networks for joint multicast and unicast transmission [14, 15]. It is a key technology of the next-generation terrestrial digital television standard ATSC 3.0. LDM applies a layered transmission structure to transmit multiple signals with different power levels and robustness for different services and reception environments. A receiver decodes the most robust upper-layer signal first, cancels it from the received signal, and then decodes the next-layer signal. Using LDM, a joint beamforming design algorithm is proposed in  for minimizing the total transmit power under constraints on the user-specific unicast rate and the common multicast rate. Note that the work  only considered a fixed BS clustering scheme for both multicast and unicast beamformers without taking channel dynamics into account. The authors in  considered a similar problem but introduced a group-sparsity-encouraging penalty in the objective function to reduce signaling overhead among different BSs. However, neither of the above works explicitly considered backhaul constraints. In practice, each BS is usually connected to the core network and cooperates with other BSs via a backhaul link with finite capacity. Thus, joint transmission among multiple BSs needs to take the backhaul constraints into account explicitly.
In a different line of research on non-orthogonal multiplexing, power-domain non-orthogonal multiple access (NOMA) [17, 18] and rate splitting (RS) [19, 20] have been studied as promising technologies to increase system performance in wireless networks. In power-domain NOMA, two users with different channel conditions (i.e., poor and strong) are served on the same time/frequency/code resource with different power levels. The user with the strong channel condition decodes the message of the user with the poor channel condition first, cancels it, and then decodes its own message. Thus, the message of the user with the poor channel condition can be viewed as a common message intended for both users. In RS, each user's message is split into a common part and a private part. All common parts are packed into one common message, which is superimposed on and simultaneously transmitted with the unicast messages. It has been studied as a promising strategy for robust transmission with imperfect CSI at the transmitter . It is worth remarking that in power-domain NOMA with MIMO beamforming, multiple messages share the same beamforming vector but with different powers [17, 18]. On the other hand, the LDM-based non-orthogonal transmission assigns a dedicated beamforming vector to each message [14, 15], i.e., the messages are superposed with different beamformers. We also remark that while the RS signal model resembles the LDM-based non-orthogonal transmission, the role of the multicast message is fundamentally different. The multicast message in RS encapsulates parts of the unicast messages and is decoded by all users for interference mitigation, although it is not entirely required by each of them , while the multicast message in the LDM-based non-orthogonal transmission carries common information intended as a whole for all users.
In this paper, we propose a new LDM-based non-orthogonal transmission framework for multicast and unicast services in multi-cell cooperative cellular networks with backhaul constraints. As in [14, 15], we adopt a two-layer LDM structure where the first layer is intended for multicast services and the second layer is for unicast services. The two layers are superposed with different network-wide beamformers which are potentially (group) sparse due to the backhaul constraints. Each user decodes the multicast message first, subtracts it, and then decodes its unicast message. Different from [14, 15], we consider dynamic BS clustering for each message with respect to instantaneous channel conditions and the per-BS backhaul constraints. Under this non-orthogonal transmission framework, we seek the maximum achievable rates of both multicast and unicast services under the peak power and peak backhaul constraints on each individual BS by the joint design of BS clustering and beamforming.
The main contributions of this paper are summarized as follows:
New Problem Formulation: We formulate a mixed-integer non-linear programming (MINLP) problem for the joint design of BS clustering and beamforming to maximize the weighted sum of the multicast rate and the unicast rate under the per-BS power and backhaul constraints. By varying the weighting parameter, we can set different priorities on the multicast and unicast services and hence obtain different achievable multicast-unicast rate pairs. Note that this problem is challenging due to the combinatorial nature of the BS clustering variables and the coupling between the BS clustering variables and rate variables in the backhaul constraints.
Optimal Branch-and-Bound Algorithm: We design a branch-and-bound (BB) algorithm to find the global optimal solution of the above formulated problem with guaranteed convergence by using the convex relaxation techniques in  and . Although its computational complexity is (theoretically) high, the BB-based algorithm serves as a benchmark for evaluating the performance of other heuristic or local algorithms for the same problem.
High-Performance Low-Complexity Algorithm: Considering the practical implementation, we also design a low-complexity algorithm. Simulation results show that it can achieve high performance that is very close to the optimum. Specifically, we first reformulate the joint design problem as an equivalent sparse beamforming problem. The equivalent problem is still challenging because the per-BS backhaul constraints involve not only the discontinuous $\ell_0$-norm but also the product of two non-convex functions. We then use a concave smooth function to approximate the discontinuous $\ell_0$-norm and use a difference of squares to rewrite the product form. The problem is thereby transformed (with approximation) into a difference-of-convex (DC) programming problem, for which a stationary solution can be obtained efficiently by using the convex-concave procedure (CCP) with guaranteed convergence.
Promising Simulation Results: Simulation results show that our proposed low-complexity algorithm can achieve performance that is very close to the global optimum. The results also demonstrate that our proposed LDM-based non-orthogonal scheme can achieve a significantly larger multicast-unicast rate region than orthogonal schemes. This indicates that our proposed LDM-based non-orthogonal transmission can serve as an efficient scheme to incorporate multicast and unicast services in cellular networks.
I-C Organization and Notations
The rest of the paper is organized as follows. Section II introduces the system model and the problem formulation. Section III provides the details of the proposed optimal solution based on the BB method. A CCP-based low-complexity algorithm is developed in Section IV. Simulation results are provided in Section V. Finally, we conclude the paper in Section VI.
II System Model and Problem Formulation
II-A System Model
Consider the downlink transmission of a backhaul-constrained cooperative multi-cell cellular network, where BSs, each equipped with transmit antennas, collectively provide hybrid multicast and unicast services, as shown in Fig. 1. In each scheduling slot, there are active users, each with a single antenna. Each user has a dedicated unicast request and subscribes to a group-specific multicast service. In general, there can be multiple multicast groups according to different multicast service subscriptions. In this paper, for ease of the notation, we focus on one multicast group only, i.e., there is one common multicast message intended for all users. The results obtained in this paper can be easily extended to the multi-group scenario.
The backhaul link that connects each BS to the core network, which has access to all service or content providers, is subject to a peak capacity constraint of bits/s, for all . Due to such backhaul constraints, not every BS can participate in the transmission of every multicast and unicast message. Let the binary variable indicate that the -th BS belongs to the serving BS cluster of the multicast message and otherwise. Similarly, let indicate that the -th BS belongs to the serving BS cluster of the unicast message for user and otherwise.
Let denote the multicast message intended for all users and the unicast message intended for user , for all , all with normalized power of . We adopt a two-layer LDM structure where the first layer is intended for the multicast service, the second layer is for unicast services, and the two layers are superposed with different beamformers at each BS. Let denote the beamforming vector at BS for the multicast message and denote the beamforming vector at BS for the unicast message , respectively. The transmit signal of BS can be written as
The total transmit power of the multicast layer and the unicast layer on each BS is subject to a peak power constraint as
where is the peak transmit power of the -th BS. Note that () if (), which implies that BS does not participate in the transmission of message (). Thus, we have the following constraint:
where is the index set of all unicast messages and one multicast message.
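The per-BS power constraint and the coupling between cluster membership and beamformers can be illustrated with a minimal sketch. The function names, the list-based data layout, and the toy numbers below are assumptions introduced for illustration only; they are not notation from the paper.

```python
from typing import List, Sequence

Beam = Sequence[complex]


def bs_transmit_power(beams_at_bs: List[Beam]) -> float:
    """Total power at one BS: sum of squared norms of all its beamformers
    (one multicast beamformer plus one unicast beamformer per user)."""
    return sum(abs(x) ** 2 for w in beams_at_bs for x in w)


def cluster_consistent(beams_at_bs: List[Beam], serving: List[int]) -> bool:
    """A BS that is outside the serving cluster of a message (indicator 0)
    must use an all-zero beamformer for that message."""
    return all(s == 1 or all(x == 0 for x in w)
               for w, s in zip(beams_at_bs, serving))


# Hypothetical BS with 2 antennas serving the multicast message and user 2 only.
beams = [[1 + 0j, 0j], [0j, 0j], [0j, 1j]]
print(bs_transmit_power(beams) <= 2.0)        # peak power check with P = 2
print(cluster_consistent(beams, [1, 0, 1]))   # indicators match the beams
```

A feasibility checker of this kind is useful for validating any candidate solution returned by a solver before computing its rates.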
The received signal at the -th user is expressed as
where is the network-wide channel vector between all BSs and user , and are the network-wide beamforming vectors defined in a similar manner, and is the additive white Gaussian noise at user . Without loss of generality, we assume that all of the channel vectors are linearly independent. We also assume that perfect CSI is available at the core network for joint processing and that all BSs can precisely synchronize with each other, and we focus on the beamforming design to evaluate the advantages of the proposed non-orthogonal multicast and unicast transmission framework. Typically, CSI can be collected by estimating it at each user and feeding it back to the BS via a feedback channel in frequency-division-duplex (FDD) systems, or through uplink channel estimation in time-division-duplex (TDD) systems. Each BS collects its own CSI and sends it to the central controller in the core network via its backhaul link. For time synchronization, a combination of global positioning system (GPS) and a network synchronization protocol can be used for synchronizing the primary clock as well as the frame structure in distant BSs.
At each receiver, SIC is used to decode the multicast message and the desired unicast message successively while treating the unicast signals of all other users as interference. In general, the decoding order of the multicast and unicast messages at each receiver can be optimized according to the instantaneous channel condition. In this work, since the multicast message is intended for multiple users and should have a higher priority [8, 14], we assume that the multicast message is decoded and subtracted before decoding the unicast message. Thus, the signal-to-interference-plus-noise ratios (SINRs) of the multicast message and the unicast message at the -th user are respectively expressed as
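The decoding order described above determines which signals count as interference at each stage. The following is a minimal sketch under the stated model (single-antenna users, network-wide channel and beamforming vectors); the function names, argument layout, and toy values are assumptions for illustration. The multicast message sees all unicast signals as interference, whereas after cancellation each unicast message sees only the other users' unicast signals.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[complex]


def inner(h: Vector, w: Vector) -> complex:
    """Hermitian inner product h^H w."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))


def ldm_sinrs(h: List[Vector], w_mc: Vector, w_uc: List[Vector],
              noise_power: float) -> Tuple[List[float], List[float]]:
    """Return (multicast SINRs, unicast SINRs), one entry per user."""
    num_users = len(h)
    mc, uc = [], []
    for k in range(num_users):
        gains = [abs(inner(h[k], w_uc[j])) ** 2 for j in range(num_users)]
        # Multicast layer is decoded first: every unicast signal interferes.
        mc.append(abs(inner(h[k], w_mc)) ** 2 / (sum(gains) + noise_power))
        # After SIC the multicast signal is removed; only other users interfere.
        other = sum(g for j, g in enumerate(gains) if j != k)
        uc.append(gains[k] / (other + noise_power))
    return mc, uc


def ldm_rates(mc: List[float], uc: List[float]) -> Tuple[float, List[float]]:
    """Multicast rate is limited by the worst user; rates in bits/s/Hz."""
    return min(math.log2(1 + s) for s in mc), [math.log2(1 + s) for s in uc]


h = [[1 + 0j, 0j], [0j, 1 + 0j]]
mc, uc = ldm_sinrs(h, [1 + 0j, 1 + 0j], [[1 + 0j, 0j], [0j, 1 + 0j]], 1.0)
print(mc, uc, ldm_rates(mc, uc))
```

Taking the minimum over users for the multicast rate reflects that the common message must be decodable by every user in the group.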
II-B Problem Formulation
Our objective is to optimize the rate performance of both multicast and unicast services through joint design of the BS clustering scheme and the beamforming vectors subject to the peak power and peak backhaul constraints on each individual BS. This is a multi-objective optimization problem. We therefore formulate the following weighted sum rate maximization problem over the multicast rate and the unicast rates:
where and are auxiliary variables which represent the transmission rates in bits/s/Hz of the multicast message and the -th unicast message, respectively, is the available bandwidth of the wireless channel, and is a weighting parameter between the multicast rate and the unicast rate . For ease of notation, let , , and .
Note that besides the considered objective function, a more general form of weighted sum rate, e.g., , can be considered to account for possibly different priorities among all of the multicast and unicast services, where are weighting parameters that are determined by certain scheduling policy (e.g., proportional fair scheduler).
We also note that a minimum rate constraint for the multicast and each of the unicast services may be imposed to achieve certain quality of service, i.e., for all in practical systems. Such minimum rate constraints are all linear and hence do not change the structure of the problem (as well as the algorithm design). As such we do not consider the minimum rate constraints in problem in order to fully characterize the multicast-unicast rate tradeoff.
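The weighted objective and the per-BS backhaul constraint discussed above can be sketched as follows. This is an illustration under assumptions: the objective is taken to be a convex combination of the multicast rate and the sum unicast rate with a weight `eta` in [0, 1], and the backhaul load of a BS is taken to be the bandwidth times the sum of the rates of the messages it serves. The names `eta`, `s_mc`, `s_uc`, and the toy numbers are hypothetical.

```python
from typing import List


def weighted_sum_rate(eta: float, r_mc: float, r_uc: List[float]) -> float:
    """Assumed objective: eta * multicast rate + (1 - eta) * sum unicast rate."""
    return eta * r_mc + (1.0 - eta) * sum(r_uc)


def backhaul_load(n: int, s_mc: List[int], s_uc: List[List[int]],
                  r_mc: float, r_uc: List[float], bandwidth: float) -> float:
    """Bits/s that BS n must fetch: only messages it serves are counted.
    s_mc[n] and s_uc[k][n] are binary cluster indicators; rates in bits/s/Hz."""
    load = s_mc[n] * r_mc
    load += sum(s_uc[k][n] * r_uc[k] for k in range(len(r_uc)))
    return bandwidth * load


def backhaul_feasible(s_mc, s_uc, r_mc, r_uc, bandwidth, capacity) -> bool:
    """Check the peak backhaul constraint on every BS."""
    return all(backhaul_load(n, s_mc, s_uc, r_mc, r_uc, bandwidth) <= capacity[n]
               for n in range(len(capacity)))


# Two BSs, two users: both BSs serve the multicast message, each serves one user.
s_mc, s_uc = [1, 1], [[1, 0], [0, 1]]
print(weighted_sum_rate(0.5, 1.0, [2.0, 3.0]))
print(backhaul_feasible(s_mc, s_uc, 1.0, [2.0, 3.0], 10e6, [40e6, 40e6]))
```

The product of a binary indicator and a rate variable inside `backhaul_load` is exactly the kind of coupling that makes the constraint non-convex once both quantities are optimization variables.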
By varying , different priorities can be given to the multicast and the unicast services, and hence different achievable multicast-unicast rate pairs can be obtained. In the special case when , problem reduces to
which is equivalent to the sparse unicast beamforming design problem in , where the binary BS clustering variable is replaced by the indicator function . When , problem reduces to a pure multicast beamforming design problem:
Problem is a non-convex MINLP problem , which is NP-hard in general. Obtaining its optimal solution is challenging due to the non-convexity of the SINR constraints (7b) and (7c), the combinatorial nature of the BS clustering variable in (7g), and the coupling between the variables and in the backhaul constraint (7f). Even when the BS clustering scheme is given, is still non-convex and computationally difficult. In the following sections, we first develop a BB-based algorithm to find the global optimum of problem . We then propose a low-complexity algorithm to find a high-quality approximate solution. Both of the proposed algorithms can also be applied to problems and .
III BB-based Optimal Algorithm
In this section, we propose a globally optimal algorithm to solve problem based on the BB method.
III-A Overview of the BB Method
The BB method is a general framework for designing global optimization algorithms for non-convex problems. The BB method is non-heuristic in the sense that it generates a sequence of asymptotically tight upper and lower bounds on the optimal objective value; it terminates with a certificate proving that the found point is $\epsilon$-optimal.
A BB algorithm consists of a systematic enumeration procedure, which recursively partitions the feasible region of the original problem into smaller subregions and constructs subproblems over the partitioned subregions. For a maximization problem, an upper bound for each subproblem is often computed by solving a convex relaxation problem defined over the corresponding subregion; a lower bound is obtained from the best known feasible solution generated by the enumeration procedure or by some other heuristic or local algorithm. A subproblem is discarded if it cannot produce a better solution than the best one found so far by the algorithm. The performance of the BB algorithm depends on the efficient estimation of the lower and upper bounds of each subproblem. To ensure convergence, the bounds should become tight as the number of subregions in the partition grows.
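To make the select, split, bound, and prune cycle concrete, here is a generic toy sketch, not the algorithm proposed in this paper. It maximizes a one-dimensional non-concave function over an interval, and it swaps the convex relaxation for a simple Lipschitz upper bound, which is an assumption made only to keep the example self-contained. The function `toy` and its Lipschitz constant 3.5 are hypothetical.

```python
import heapq
import math
from typing import Callable, Tuple


def branch_and_bound_max(f: Callable[[float], float], lo: float, hi: float,
                         lipschitz: float, tol: float = 1e-4
                         ) -> Tuple[float, float, float]:
    """Return (best point, lower bound, upper bound) with upper - lower <= tol."""
    best_x = 0.5 * (lo + hi)
    best_val = f(best_x)                      # lower bound from a feasible point
    # Upper bound on [a, b]: f(mid) + L * (b - a) / 2 for an L-Lipschitz f.
    heap = [(-(best_val + lipschitz * 0.5 * (hi - lo)), lo, hi)]
    while heap:
        neg_ub, a, b = heapq.heappop(heap)    # box with the largest upper bound
        if -neg_ub - best_val <= tol:
            return best_x, best_val, max(-neg_ub, best_val)
        mid = 0.5 * (a + b)
        for c, d in ((a, mid), (mid, b)):     # split into two equal boxes
            x = 0.5 * (c + d)
            val = f(x)
            if val > best_val:                # improve the lower bound
                best_val, best_x = val, x
            ub = val + lipschitz * 0.5 * (d - c)
            if ub > best_val:                 # keep only boxes that may improve
                heapq.heappush(heap, (-ub, c, d))
    return best_x, best_val, best_val


def toy(x: float) -> float:
    return math.sin(3.0 * x) + 0.5 * x        # |derivative| <= 3.5


x_star, lower, upper = branch_and_bound_max(toy, 0.0, 4.0, lipschitz=3.5)
print(x_star, lower, upper)
```

The priority queue keeps the box with the largest upper bound at the front, so the gap between that bound and the incumbent value is a valid optimality certificate at termination.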
Recently, the BB method has been used for beamforming design in cellular networks. For example, a customized BB algorithm is proposed in  for single-group multicast beamforming and then extended in  for joint multicast and unicast beamforming. A monotonic optimization based branch-and-reduce-and-bound (BRB) algorithm is proposed in  to solve the energy efficiency maximization problem in a multiuser MISO downlink system. The BRB algorithm is then extended in  for joint remote radio head selection and beamforming design in cloud radio access networks.
III-B Convex Relaxations
In this subsection, we introduce some effective convex relaxations for the non-convex constraints of , which play an important role in finding the lower and upper bounds in the proposed BB-based algorithm for solving the problem.
Define , then without loss of optimality the unicast SINR constraint (7c) can be rewritten as
which is a convex second-order cone (SOC) constraint when is given. For the multicast SINR constraint (7b), since all users share the same multicast beamformer and the channel vectors are linearly independent, only one user's multicast SINR constraint (without loss of generality, that of the -th user) can be rewritten in convex SOC form when is given, i.e.,
The rest can be represented as
Proposition 1 (, Proposition 1)
Let be the argument of , where , , and denote the set of defined by the inequality for a given , for all . Suppose that , then the convex envelope of is given by
where and .
It is easy to verify that the smaller the width of the interval , the tighter the convex envelope. As goes to zero, the set becomes and the convex envelope becomes tight.
Proposition 2 (, Theorem 2)
Suppose that , then the convex envelope of function over is given by
Recall that the convex envelope of a function over a set is the pointwise supremum of all convex functions which underestimate over , i.e., is convex and over . It is easy to see that when the box region shrinks to a point, the convex envelope becomes tight.
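As an illustration of a convex envelope that becomes tight as the box shrinks, the sketch below evaluates the classical McCormick convex envelope of a bilinear product over a rectangle. It assumes that the function in Proposition 2 is the product of a relaxed clustering variable and a rate variable, as suggested by the coupling in the backhaul constraint (7f); the function name and test boxes are hypothetical.

```python
def bilinear_convex_envelope(x: float, y: float,
                             xl: float, xu: float,
                             yl: float, yu: float) -> float:
    """Convex envelope of x * y over the box [xl, xu] x [yl, yu].

    Both affine pieces follow from (x - xl)(y - yl) >= 0 and
    (xu - x)(yu - y) >= 0, so their maximum never exceeds x * y on the box.
    """
    return max(yl * x + xl * y - xl * yl,
               yu * x + xu * y - xu * yu)


# Wide box: the envelope underestimates the product at an interior point.
print(bilinear_convex_envelope(0.5, 2.0, 0.0, 1.0, 0.0, 4.0), 0.5 * 2.0)
# Box shrunk to a point in the first coordinate: the envelope is tight.
print(bilinear_convex_envelope(1.0, 2.0, 1.0, 1.0, 0.0, 4.0), 1.0 * 2.0)
```

Replacing each product in a constraint by this underestimator yields a convex relaxation whose gap vanishes once the corresponding box collapses, which is the property the BB convergence argument relies on.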
III-C Proposed BB-based Algorithm
For ease of presentation, let be the variable vector of interest where . Here the binary variable is relaxed to be continuous. Notice that belongs to the box , where the lower and upper vertices are given by
Here, is an upper bound of the rate , each element of which can be obtained by transmitting the total available power towards a single user and cannot exceed the maximum backhaul capacity of the BSs. Specifically, we have , for all , and .
Let , , and denote the box list, the upper bound, and the lower bound of the optimal objective value of the original problem at the -th iteration, respectively. Let and denote the upper bound and the lower bound of the objective value over a given box region . The proposed BB algorithm works as follows:
At the -th iteration, we select a box in and split it into two smaller ones. An effective method for selecting the candidate box is to choose the one with the largest upper bound, i.e., . The selected box is then split along the longest edge, e.g., , to create two boxes with equal size
where is the -th standard basis vector. Note that the above splitting rule takes the binary variable into account, which is adjusted to be in the Boolean set.
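The branching step can be sketched as below. This is an illustration under assumptions: a box is stored as a pair of lower and upper vertex lists, the longest edge is bisected at its midpoint, and a binary coordinate is handled by rounding the midpoint down and up so that the two children fix it to 0 and 1. The exact adjustment used in the paper's splitting rule may differ; `split_box` and `binary_idx` are hypothetical names.

```python
import math
from typing import List, Set, Tuple

Box = Tuple[List[float], List[float]]


def split_box(lower: List[float], upper: List[float],
              binary_idx: Set[int]) -> Tuple[Box, Box]:
    """Split a box along its longest edge into two children."""
    widths = [u - l for l, u in zip(lower, upper)]
    j = max(range(len(widths)), key=widths.__getitem__)   # longest edge
    mid = 0.5 * (lower[j] + upper[j])
    left_upper, right_lower = list(upper), list(lower)
    if j in binary_idx:
        # Relaxed binary variable: children keep only Boolean values.
        left_upper[j] = float(math.floor(mid))
        right_lower[j] = float(math.ceil(mid))
    else:
        left_upper[j] = mid
        right_lower[j] = mid
    return (list(lower), left_upper), (right_lower, list(upper))


# Coordinate 0 is a relaxed clustering variable, the rest are rate variables.
print(split_box([0.0, 0.0, 0.0], [1.0, 3.0, 2.0], {0}))
print(split_box([0.0, 0.0], [1.0, 0.5], {0}))
```

Fixing a binary coordinate in both children means it is never split again, so the depth of the search tree along those coordinates is bounded.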
The bounding operation is to compute the upper and lower bounds over the newly added box , , and update the upper bound and the lower bound .
Upper Bound: The upper bound is computed by solving a convex relaxation of problem over the box .
Note that the convex envelope only takes effect when . If there is any user such that , then can take any value in the complex plane, and we simply remove the multicast SINR constraint of user from (18).
In addition, since is restricted within the box , we have
Note that the current form of constraint (7e) may produce a loose relaxation when the binary variable is relaxed to be continuous, since can be much smaller than . To tighten the relaxation, we adopt the perspective reformulation in [29, 27] to rewrite constraints (7d) and (7e) in the following form:
which is an SOC constraint when is relaxed to be continuous.
Finally, we can obtain by solving the following relaxed problem:
Problem (24) is a convex problem, which can be equivalently reformulated as a second-order cone programming (SOCP) and efficiently solved using a general-purpose solver via interior-point methods . Note that problem (24) may be infeasible. If this happens, it indicates that the box does not contain the optimal solution and we just set and as .
After obtaining the upper bounds , for , we can form by removing from and adding and if their upper bounds are larger than or equal to the current best lower bound , i.e., . Note that the maximum of the upper bounds over all boxes in is an upper bound of the optimal objective value of the original problem. Therefore, we update the upper bound as .
Lower Bound: To obtain a lower bound, we need to find a feasible solution of the original problem . This can be done by gaining some insights from the optimal solution of problem (24).
After obtaining the beamforming vector of problem (24), we can turn off some data links with small transmit power and keep the others active, i.e., force if is small enough and set the remaining . A data link with a lower power gain contributes less to the weighted sum rate and should therefore have a higher priority to be turned off. Denote as the -th largest element of . Let
Then, we can calculate the multicast rate and unicast rate as
If the backhaul constraint (7f) is satisfied, i.e., , for all , then itself is a feasible solution of the original problem . Otherwise, we can scale to be feasible. Therefore, a feasible solution of problem is given by where
Note that for each , we can find such a feasible solution and its corresponding objective . The lower bound can be obtained by finding the best , which yields the largest objective among all feasible solutions, i.e.,
Finally, we can obtain a better lower bound of the optimal objective value of the original problem if the lower bounds of the newly added boxes can provide a larger lower bound than that of the previous iteration, i.e., .
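The lower-bound heuristic described above (keep the strongest links, compute rates, then scale the rates if a backhaul limit is violated) can be sketched as follows. The sketch rests on assumptions: links are ranked by a scalar power value, and infeasible rates are shrunk by one common factor equal to the smallest capacity-to-load ratio. The scaling rule in the paper may be different, and all names and numbers here are hypothetical.

```python
from typing import List


def keep_strongest_links(link_power: List[float], num_active: int) -> List[int]:
    """Binary indicators that keep the num_active links with the largest power."""
    order = sorted(range(len(link_power)), key=lambda i: link_power[i],
                   reverse=True)
    active = [0] * len(link_power)
    for i in order[:num_active]:
        active[i] = 1
    return active


def scale_rates_to_backhaul(rates: List[float], serving: List[List[int]],
                            capacity: List[float]) -> List[float]:
    """Shrink all rates by one factor so that every BS meets its backhaul limit.

    rates[m] is the rate of message m in bits/s, serving[m][n] is 1 if BS n
    serves message m, and capacity[n] is the backhaul capacity of BS n.
    """
    scale = 1.0
    for n, cap in enumerate(capacity):
        load = sum(serving[m][n] * rates[m] for m in range(len(rates)))
        if load > cap:
            scale = min(scale, cap / load)
    return [scale * r for r in rates]


print(keep_strongest_links([0.1, 2.0, 0.5, 1.0], 2))
print(scale_rates_to_backhaul([10.0, 20.0, 30.0],
                              [[1, 1], [1, 0], [0, 1]], [15.0, 100.0]))
```

Reducing a rate never violates an SINR constraint, so the scaled rates remain achievable with the same beamformers, which is why the result is a valid feasible point and hence a valid lower bound.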
The overall BB-based algorithm for solving problem is summarized in Alg. 1.
III-D Convergence and Complexity Analysis
III-D1 Convergence Analysis
One important condition for the convergence of the BB-based algorithm is that the upper and lower bounds over a box region become tight as the box shrinks to a point. More precisely, as the length of the longest edge of the box , denoted by , goes to zero, the gap between upper and lower bounds converges to zero. We formally summarize the result in the following lemma:
Lemma 1: For any given and , there exists a such that and when , we have .
Proof: See Appendix A.
Lemma 1 indicates that for any given tolerance , we can always find an -optimal solution when the size of the box is sufficiently small. Note that by adopting the splitting rule (15), the size of the selected at the iteration of Alg. 1 converges to zero, i.e., . The proof is provided in  and we omit it here for brevity.
III-D2 Complexity Analysis
In Alg. 1, the most computationally expensive part is calculating the upper and lower bounds in Step 2). Obtaining the upper bound requires solving an SOCP problem in the form of (24), and its worst-case computational complexity is approximately by adopting the interior-point methods . Obtaining the lower bound in (29) takes at most times; each has a complexity of , which mainly lies in the rate computation in (26) and (27). Therefore, the computational complexity of Alg. 1 at each iteration mainly comes from calculating the upper bound in Step 2). Regarding the maximum number of iterations of Alg. 1, we have the following lemma:
Lemma 2: For any given small constant and any instance of problem , the proposed BB-based algorithm will return an $\epsilon$-optimal solution within at most
iterations, where is the inverse function of .
Proof: See Appendix B.
Since Alg. 1 requires at most iterations to converge, the worst-case computational complexity of Alg. 1 is therefore . As we can see from Lemma 2, can be very large if the tolerance is small. Nevertheless, the proposed BB-based algorithm can be used as the network performance benchmark. Considering the practical implementation, in the next section, we will propose a low-complexity algorithm through sparse beamforming design. Simulation results show that it can achieve high performance that is very close to the optimum.
IV CCP-based Low-Complexity Algorithm
In this section, we reformulate the joint BS clustering and beamforming problem as an equivalent sparse beamforming design problem and propose a CCP-based low-complexity algorithm to solve it approximately.