Optimal Energy Tradeoff among Communication, Computation and Caching with QoI-Guarantee

12/10/2017 ∙ by Faheem Zafari, et al. ∙ University of Massachusetts Amherst, Imperial College London

Many applications must ingest and analyze data that are continuously generated over time from geographically distributed sources such as users, sensors and devices. This results in the need for efficient data analytics in geo-distributed systems. Energy efficiency is a fundamental requirement in these geo-distributed data communication systems, and its importance is reflected in much recent work on performance analysis of system energy consumption. However, most works have only focused on communication and computation costs, and do not account for caching costs. Given the increasing interest in cache networks, this is a serious deficiency. In this paper, we consider the energy consumption tradeoff among communication, computation, and caching (C3) for data analytics under a Quality of Information (QoI) guarantee in a geo-distributed system. To attain this goal, we formulate an optimization problem to capture the C3 costs, which turns out to be a non-convex Mixed Integer Non-Linear Programming (MINLP) problem. We then propose a variant of the spatial branch-and-bound algorithm (V-SBB) that can achieve an $\epsilon$-global optimal solution to the original MINLP. We show numerically that V-SBB is more stable and robust than other candidate MINLP solvers under different network scenarios. More importantly, we observe that the energy efficiency under our C3 optimization framework improves by as much as 88% compared to any C2 optimization between communication and computation or caching.




I Introduction

The rapid growth of smart environments and the advent of the Internet of Things (IoT) have led to the generation of large amounts of data. However, it is a daunting task to transmit enormous amounts of data through traditional networks because of bandwidth and energy limitations [1]. These data need to be efficiently compressed, transmitted, and cached to satisfy the Quality of Information (QoI) required by end users. In fact, many wireless components operate on limited battery power and are usually deployed in remote or inaccessible areas, which calls for designs that can enhance the energy efficiency of the system with a QoI guarantee.

A particular example of modern systems that require high energy efficiency is the wireless sensor network (WSN). Consider a WSN with various types of sensors, which can generate enormous amounts of data to serve end users. On one hand, data compression has been adopted to reduce transmission (communication) cost at the expense of computation cost. On the other hand, caches can be used as a means of reducing transmission costs and access latency, thus enhancing QoI, but at the expense of added caching cost. Hence, there exists a tradeoff in energy consumption due to data communication, computation and caching. This raises the question: what is the right balance between compression and caching so as to minimize the total energy consumption of the network?

In this paper, we formulate an optimization problem to find the optimal data compression rate and data placement to minimize the energy consumed due to data compression, communication and caching with QoI guarantee in a communication network. The formulated problem is a Mixed Integer Non-Linear Programming problem with non-convex functions, which is NP-hard in general. We propose a variant of the spatial branch-and-bound algorithm that guarantees $\epsilon$-global optimality, meaning that the obtained solution is within $\epsilon$ tolerance of the global optimal solution.

Each node has the ability to compress and cache the data with some finite storage capacity. We focus on wireless sensor networks as our motivating example. In particular, as shown in Figure 1, we assume that only edge sensors generate data, and there exists a single sink node that collects and serves the requests for the data generated in this network. The model can be extended to include any arbitrary node that produces data at the expense of added notational complexity.

Fig. 1: A general wireless sensor network.

Computation: Data aggregation [2, 3] is the process of gathering data from multiple generators (e.g., sensors), compressing them to eliminate redundant information and then providing the summarized information to end users. Since only part of the original data is transmitted, data aggregation can conserve a large amount of energy. A common assumption in previous works is that energy required to compress data is smaller than that needed to transmit data. Therefore, data compression was considered a viable technique for reducing energy consumption. However, it has been shown [4] that computational energy cost can be significant and may cause a net-energy increase if data are compressed beyond a certain threshold. Hence, it is necessary to consider both transmission and computation costs, and it is important to characterize the trade-off between them[1].

Caching: Caches have been widely used in networks and distributed systems to improve performance by storing information locally, which jointly reduces access latency and bandwidth requirements, and hence improves user experience. Content Distribution Networks (CDNs), Software Defined Networks (SDNs), Named Data Networks (NDNs) and Content Centric Networks (CCNs) are important examples of such systems. The fundamental idea behind caching is to make information available at a location closer to the end-user. Again, most previous work focused on designing caching algorithms to enhance system performance without considering the energy cost of caching. Caching can reduce the transmission energy by storing a local copy of the data at the requesting node (or close by), hence eliminating the need for multiple retransmission from the source node to the requesting node. However, caching itself can incur significant energy costs [5]. Therefore, analyzing the impact of caching on overall energy consumption in the network (along with data communication and compression) is critical for system design.

Quality of Information (QoI): The notion of QoI required by end users is affected by many factors. In particular, the degree of the data aggregation in a system is crucial for QoI. It has been shown that data aggregation can deteriorate QoI in some situations [6]. Thus an energy efficient design for appropriate data aggregation with a guaranteed QoI is desirable.

We focus on a tree-structured sensor network where each leaf node generates data, and compresses and transmits the data to the sink node in the network, which serves the requests for these data from devices outside this network. Examples of such a setting are military sites, wireless sensors or societal networks, where a large number of devices gather data, and desire to transmit the local information to any device outside this network that requires this information. The objective of our work is to obtain optimal data compression rate at each node, and an optimal data placement in the network for minimizing energy consumption with QoI guarantee.

I-A Organization and Main Results

Section I-B presents a review of relevant literature. In Section II, we describe our system model in which nodes are logically arranged as a tree. Each node receives and compresses data from its children node(s). The compressed data are transmitted and further compressed towards the sink node. Each node can also cache the compressed data locally. In Section III, we formulate the problem of energy-efficient data compression, communication and caching with QoI constraint as a MINLP problem with non-convex functions, which is NP-hard in general. We then show in Section IV that there exists an equivalent problem obtained through symbolic reformulation [7], and propose a variant of the Spatial Branch-and-Bound (V-SBB) algorithm to solve it. We show that our proposed algorithm can achieve $\epsilon$-global optimality.

In Section V, we evaluate the performance of our optimization framework and show that the use of caching along with data compression and communication can significantly improve the energy efficiency of a communication network. More importantly, we observe that with the joint optimization of data communication, computation and caching (C3), energy efficiency can be improved by as much as 88% compared to only optimizing communication and computation, or communication and caching (C2). The magnitude of the improvement depends on the parameter values and on the energy costs used in the model. Beyond the improvement itself, our framework helps in characterizing and analyzing the enhancement in energy efficiency for different network settings. We also evaluate the performance of the proposed V-SBB algorithm through extensive numerical studies. In particular, we make a thorough comparison with other MINLP solvers, namely Bonmin [8], NOMAD [9], Matlab's genetic algorithm (GA), Baron [10], SCIP [11] and Antigone [12], under different network scenarios. The results show that our algorithm can achieve $\epsilon$-global optimality, and that the achieved objective function value (lower is better, since this is a minimization problem) is mostly better than that of stochastic algorithms such as NOMAD and GA, while it is comparable to that of deterministic algorithms such as Baron, Bonmin, SCIP and Antigone. Furthermore, our algorithm provides a solution in varying network situations even when other solvers such as Bonmin and SCIP are not able to. We provide concluding remarks in Section VI.

I-B Related Work

To the best of our knowledge, there is no prior work that jointly considers communication, computation and caching costs in distributed networks with a QoI guarantee for end users.

Data Compression: Compression is a key operation in modern communication networks and has been supported by many data-parallel programming models [13]. For WSNs, data compression is usually performed over a hierarchical topology to improve communication energy efficiency [2], whereas we focus on energy tradeoff between communication, computation and caching.

Data Caching: Caching plays a significant role in many systems with hierarchical topologies, e.g., WSNs, microprocessors, CDNs etc. There is a rich literature on the performance of caching in terms of designing different caching algorithms, e.g., [14, 15], and we do not attempt to provide an overview here. However, none of these works considered the costs of caching, which may be significant in some systems [5]. The recent paper by Li et al. [16] is closest to the problem we tackle here. Our work differs from [16] in two main respects. First, the mathematical formulations are quite different: we consider energy tradeoffs among C3, while [16] focused on C2. Second, we provide an $\epsilon$-optimal solution to a MINLP problem, while [16] aimed at developing approximation algorithms.

Energy Costs: While optimizing energy costs in wireless sensor networks has been extensively studied [17], existing work is primarily concerned with routing [18], MAC protocols [17], and clustering [19]. With the growing deployment of smart sensors in modern systems [1], in-network data processing, such as data aggregation, has been widely used as a means of reducing system energy cost by lowering the data volume for transmission.

II Analytical Model

Fig. 2: Tree-Structured Network Model.

We represent the network as a directed graph $G=(V,E)$. For simplicity, we consider a tree with $N=|V|$ nodes, as shown in Figure 2. It is possible to generalize our framework to a general network topology with arbitrary source nodes, provided that the route between the source and the requesting node is known. Node $v\in V$ is capable of storing $S_v$ amount of data. Let $\mathcal{K}\subseteq V$ with $|\mathcal{K}|=K$ be the set of leaf nodes, i.e., $\mathcal{K}=\{1,2,\ldots,K\}$. Time is partitioned into periods of equal length $T>0$, and data generated in each period are independent. Without loss of generality (W.l.o.g.), we consider one particular period in the remainder of the paper. We assume that only leaf nodes can generate data, and all other nodes in the tree receive and compress data from their children nodes, and either cache or transmit the compressed data to their parent nodes during time $T$. Arbitrary source nodes can also be incorporated into the model at the cost of added notational and model complexity.

Let $y_k$ be the amount of data generated by leaf node $k\in\mathcal{K}$. The data generated at the leaf nodes are transmitted up the tree to the sink node $s$, which serves requests for data generated in the network. Let $h(k)$ be the depth of node $k$ in the tree. W.l.o.g., we assume that the sink node is located at level $h(s)=0$. We represent the path from node $k$ to the sink node as the unique path $\mathcal{H}^k$ of length $h(k)$, i.e., a sequence $\{h_0^k,h_1^k,\ldots,h_{h(k)}^k\}$ of nodes $h_j^k\in V$ such that $(h_{j+1}^k,h_j^k)\in E$, where $h_0^k\triangleq s$ (i.e., the sink node) and $h_{h(k)}^k\triangleq k$ (i.e., the node itself).

We denote the per-bit reception, transmission and compression cost of node $v\in V$ as $\varepsilon_{vR}$, $\varepsilon_{vT}$ and $\varepsilon_{vC}$, respectively. Each node $h_i^k$ along the path $\mathcal{H}^k$ can compress the data generated by leaf node $k$ with a data reduction rate $\delta_{k,i}$, where $0<\delta_{k,i}\le 1$ for all $i,k$. The reduction rate characterizes the degree to which a node can compress the received data, which plays an important role in determining the QoI.

The higher the value of $\delta_{k,i}$, the lower the compression will be, and vice versa. The higher the degree of data compression, the larger the amount of energy consumed by compression. Similarly, caching the data closer to the sink node may reduce the transmission cost for serving the request; however, each node only has finite storage capacity. We study the trade-off among the energy consumed at each node for transmitting, compressing and caching the data.

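Denote the total energy consumption at node $v$ as $E_v$, which consists of reception cost $E_{vR}$, transmission cost $E_{vT}$, computation cost $E_{vC}$ and storage (caching) cost $E_{vS}$; it takes the form

$$E_v = E_{vR}+E_{vT}+E_{vC}+E_{vS},\qquad E_{vR}=y_v\varepsilon_{vR},\quad E_{vT}=y_v\varepsilon_{vT}\delta_v,\quad E_{vC}=y_v\varepsilon_{vC}\,l_v(\delta_v),\quad E_{vS}=w_{ca}y_vT. \tag{1}$$
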
The above energy consumption models for data transmission, compression and caching have been used in the literature [1, 20, 5] and are suitable for highlighting the energy consumption in a communication network. However, our formulation can be extended to incorporate various other energy consumption models as well. In (1), $l_v(\delta_v)$ captures the computation energy. As computation energy increases with the degree of compression, we assume that $l_v(\delta_v)$ is a continuous, decreasing and differentiable function of the reduction rate. One candidate function is $l_v(\delta_v)=1/\delta_v-1$ [1, 20]. Moreover, we consider an energy-proportional model [5] for caching, i.e., $E_{vS}=w_{ca}y_vT$ if the received data $y_v$ is cached for a duration of $T$, where $w_{ca}$ represents the power efficiency of caching, which strongly depends on the storage hardware technology. W.l.o.g., $w_{ca}$ is assumed to be identical for all the nodes. For simplicity, denote $f(\delta_v)=\varepsilon_{vR}+\varepsilon_{vT}\delta_v+\varepsilon_{vC}\,l_v(\delta_v)$ as the sum of per-bit reception, transmission and compression cost at node $v$ per unit time.
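The per-node cost model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate compression cost $l(\delta)=1/\delta-1$ is taken as the computation model, the function and parameter names (`compression_cost`, `per_bit_cost`, `node_energy`, `caching_energy`, `eps_r`, `eps_t`, `eps_c`, `w_ca`) are hypothetical, and the numeric values in the usage note are made up.

```python
def compression_cost(delta):
    """Assumed candidate l(delta) = 1/delta - 1: zero at delta = 1, decreasing in delta."""
    if not 0.0 < delta <= 1.0:
        raise ValueError("reduction rate must lie in (0, 1]")
    return 1.0 / delta - 1.0


def per_bit_cost(delta, eps_r, eps_t, eps_c):
    """f(delta) = eps_R + eps_T * delta + eps_C * l(delta), the per-bit cost at one node."""
    return eps_r + eps_t * delta + eps_c * compression_cost(delta)


def node_energy(y, delta, eps_r, eps_t, eps_c):
    """Reception + transmission + compression energy for y input bits at one node."""
    return y * per_bit_cost(delta, eps_r, eps_t, eps_c)


def caching_energy(bits, w_ca, T):
    """Energy-proportional caching model: w_ca * bits * T."""
    return w_ca * bits * T
```

For instance, `node_energy(1000, 0.5, 1e-3, 2e-3, 5e-4)` evaluates the cost of halving 1000 bits at a node; lowering `delta` shrinks the transmission term but grows the compression term, which is the tradeoff the model is built to capture.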

During time period $T$, we assume that there are $R_k$ requests at sink node $s$ for data $y_k$ generated by leaf node $k$. For simplicity, we assume that the number of requests for the data of a node is constant. The boolean variable $b_{k,i}$ equals $1$ if the data from node $k$ is stored along the path $\mathcal{H}^k$ at node $h_i^k$; otherwise it equals $0$. We allow the data to be cached at only one node along the unique path between the leaf node and the root node. Let $C_v$ denote the set of leaf nodes that are descendants of node $v$. We also assume that the energy cost for searching for data at different nodes in the network is negligible [1, 15]. For convenience, let $\boldsymbol{\delta}=\{\delta_{k,i}\}$ and $\mathbf{b}=\{b_{k,i}\}$. For ease of exposition, the parameters used throughout this paper are summarized in Table I.

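| Notation | Description |
| --- | --- |
| $y_v$ | number of data (bits) generated at node $v$ |
| $\delta_v$ | reduction rate at node $v$, i.e., the ratio of the amount of output data to input data |
| $\gamma$ | the QoI threshold |
| $\varepsilon_{vR}$ | per-bit reception cost of node $v$ |
| $\varepsilon_{vT}$ | per-bit transmission cost of node $v$ |
| $\varepsilon_{vC}$ | per-bit compression cost of node $v$ |
| $b_{k,v}$ | $1$ if node $v$ caches the data from leaf node $k$; otherwise $0$ |
| $S_v$ | storage capacity of node $v$ |
| $w_{ca}$ | caching power efficiency |
| $R_k$ | request rate for data from node $k$ |
| $N$ | total number of nodes in the network |
| $C_v$ | set of leaf nodes that are descendants of node $v$ |
| $T$ | time length that data are cached |
| $\phi^{u}$ | upper bound of the objective function |
| $\mathcal{L}$ | list of regions |
| $\mathcal{R}$ | any sub-region in $\mathcal{L}$ |
| $\phi^{\mathcal{R},u}$ | upper bound on the objective function in subregion $\mathcal{R}$ |
| $\phi^{\mathcal{R},l}$ | lower bound on the objective function in subregion $\mathcal{R}$ |
| $\epsilon$ | difference between the upper and lower bound |
| | lower bound on auxiliary variable $w_i$ in subregion $\mathcal{R}$ |
| | upper bound on auxiliary variable $w_i$ in subregion $\mathcal{R}$ |
| | candidate variable for branching |
| | chosen branching variable |
| | value at which the variable is branched |
| bt | bilinear terms |
| lft | linear fractional terms |
| $\mathcal{T}_{bt}$ | set of bilinear terms (bt) |
| $\mathcal{T}_{lft}$ | set of linear fractional terms (lft) |

TABLE I: Summary of notations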

III Energy Optimization

In this section, we first define the cost function in our model and then formulate the optimization problem. Data produced by every leaf node $k$ is received, transmitted, and possibly compressed by all nodes in the path from the leaf node $k$ to the root node, consuming energy

$$E_k^{C}=\sum_{i=0}^{h(k)} y_k\, f(\delta_{k,i}) \prod_{m=i+1}^{h(k)}\delta_{k,m}, \tag{2}$$

where $\prod_{m=i}^{j}\delta_{k,m}:=1$ if $i>j$. Equation (2) captures the one-time energy cost of receiving, compressing and transmitting data $y_k$ from leaf node $k$ (level $h(k)$) to the sink node (level $0$); during every time period $T$, data is always pushed towards the sink upon the first request. The amount of data received by any node at level $i$ from leaf node $k$ is $y_k\prod_{m=i+1}^{h(k)}\delta_{k,m}$ due to the compression from level $h(k)$ to $i+1$. The term $f(\delta_{k,i})$ captures the reception, transmission and compression energy cost for node $h_i^k$ at level $i$ along the path from leaf node $k$ to the sink node.

Let $E_k^{R}$ be the total energy consumed in responding to the subsequent $(R_k-1)$ requests. We have

$$E_k^{R}=\sum_{i=0}^{h(k)} y_k (R_k-1)\left\{ f(\delta_{k,i})\prod_{m=i+1}^{h(k)}\delta_{k,m}\left(1-\sum_{j=0}^{i} b_{k,j}\right) + \prod_{m=i}^{h(k)}\delta_{k,m}\, b_{k,i}\left(\frac{w_{ca}T}{R_k-1}+\varepsilon_{kT}\right)\right\}. \tag{3}$$

Note that the remaining $(R_k-1)$ requests are either served by the leaf node or by a cached copy of data $y_k$ at level $i$ for $i\in\{0,\ldots,h(k)\}$. W.l.o.g., we consider node $h_i^k$ at level $i$. If data $y_k$ is not cached from $h_i^k$ up to the sink node (level $0$), i.e., $b_{k,j}=0$ for $j=0,\ldots,i$, the cost is incurred due to receiving, transmitting and compressing the data $(R_k-1)$ times, which is captured by the first term in Equation (3); the second term is $0$. Otherwise, the requests are served by the cached copy at $h_i^k$: the corresponding caching and transmission cost of serving from $h_i^k$ is captured by the second term in Equation (3), and the corresponding reception, transmission and compression cost from $h_{i-1}^k$ up to the sink node is captured by the first term. Note that the first-time cost of receiving, transmitting and compressing the data from leaf node $k$ to $h_i^k$ is already captured by Equation (2).

We present a simple but illustrative example to explain the above equations.

Example 1.

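We consider a network with one leaf node and one sink node, i.e., $K=1$ and $h(1)=1$. Then the cost in Equation (2) becomes $E_1^{C}=y_1 f(\delta_{1,0})\delta_{1,1}+y_1 f(\delta_{1,1})$, where the first and second terms capture the reception, transmission and compression cost for data $y_1$ at the sink node and the leaf node, respectively.

The cost in Equation (3) is

$$E_1^{R}=\underbrace{y_1(R_1-1)\left[f(\delta_{1,0})\delta_{1,1}(1-b_{1,0})+\delta_{1,0}\delta_{1,1}b_{1,0}\left(\tfrac{w_{ca}T}{R_1-1}+\varepsilon_{1T}\right)\right]}_{\text{Term 1}}+\underbrace{y_1(R_1-1)\left[f(\delta_{1,1})(1-b_{1,0}-b_{1,1})+\delta_{1,1}b_{1,1}\left(\tfrac{w_{ca}T}{R_1-1}+\varepsilon_{1T}\right)\right]}_{\text{Term 2}},$$

where Term 1 and Term 2 capture the costs at the sink node and the leaf node, respectively. To be more specific, there are three cases: (i) data is cached at the sink node, i.e., $b_{1,0}=1$ and $b_{1,1}=0$ (since we only cache one copy); (ii) data is cached at the leaf node, i.e., $b_{1,0}=0$ and $b_{1,1}=1$; and (iii) data is not cached, i.e., $b_{1,0}=b_{1,1}=0$. We consider these three cases in the following.

Case (i), i.e., $b_{1,0}=1$ and $b_{1,1}=0$: Term 2 becomes $0$ and Term 1 reduces to
$$y_1(R_1-1)\,\delta_{1,0}\delta_{1,1}\left(\tfrac{w_{ca}T}{R_1-1}+\varepsilon_{1T}\right),$$
since all the requests are served from the sink node. This indicates that the total energy cost is due to caching the data for time period $T$ and transmitting it $(R_1-1)$ times from the sink node to the users that request it.

Case (ii), i.e., $b_{1,0}=0$ and $b_{1,1}=1$: Term 1 becomes $y_1(R_1-1)f(\delta_{1,0})\delta_{1,1}$, which captures the reception, transmission and compression costs at the sink node for serving the $(R_1-1)$ requests. Term 2 becomes $y_1(R_1-1)\,\delta_{1,1}\left(\tfrac{w_{ca}T}{R_1-1}+\varepsilon_{1T}\right)$, which captures the cost of caching data at the leaf node and transmitting the data $(R_1-1)$ times from the cached copy to the sink node. Their sum is the total cost to serve the $(R_1-1)$ requests.

Case (iii), i.e., $b_{1,0}=b_{1,1}=0$: $E_1^{R}=y_1(R_1-1)\left[f(\delta_{1,0})\delta_{1,1}+f(\delta_{1,1})\right]$, which captures the reception, transmission and compression costs at the sink node and the leaf node for serving the $(R_1-1)$ requests, since there is no cached copy in the network.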
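The three cases of Example 1 can be checked numerically with a short Python sketch. It assumes the cost structure described in the prose (a first-delivery cost plus a cost for the remaining requests), the candidate compression cost $l(\delta)=1/\delta-1$, and identical per-bit costs at every node; the function names and all numeric values are hypothetical and chosen only for illustration. Index 0 of each list is the sink and the last index is the leaf.

```python
from math import prod


def f(delta, eps_r, eps_t, eps_c):
    """Per-bit reception + transmission + compression cost (assumed l = 1/delta - 1)."""
    return eps_r + eps_t * delta + eps_c * (1.0 / delta - 1.0)


def first_delivery_energy(y, deltas, eps):
    """One-time cost of pushing y bits from the leaf (last index) to the sink (index 0)."""
    return sum(y * f(deltas[i], *eps) * prod(deltas[i + 1:])
               for i in range(len(deltas)))


def repeat_request_energy(y, deltas, b, requests, eps, w_ca, T):
    """Cost of the remaining (requests - 1) requests for a 0/1 placement vector b."""
    eps_t = eps[1]
    extra = requests - 1
    total = 0.0
    for i in range(len(deltas)):
        # 1 if no copy is cached at level i or closer to the sink, else 0
        not_cached = 1 - sum(b[: i + 1])
        relay = f(deltas[i], *eps) * prod(deltas[i + 1:]) * not_cached
        # caching for duration T plus (requests - 1) transmissions of the cached copy
        serve = prod(deltas[i:]) * b[i] * (w_ca * T + extra * eps_t)
        total += y * (extra * relay + serve)
    return total


eps = (1.0, 2.0, 4.0)      # hypothetical per-bit costs (reception, transmission, compression)
deltas = [0.5, 0.8]        # reduction rates at the sink and at the leaf
for b in ([1, 0], [0, 1], [0, 0]):   # cases (i), (ii), (iii)
    print(b, repeat_request_energy(100, deltas, b, 3, eps, 0.1, 10))
```

With these made-up numbers, caching at the sink gives the smallest cost for the remaining requests, and the no-caching case equals $(R_1-1)$ times the first-delivery cost, as the case analysis predicts.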

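The total energy consumed in the network is $E^{\text{total}}$,

$$E^{\text{total}}(\boldsymbol{\delta},\mathbf{b})=\sum_{k\in\mathcal{K}}\left(E_k^{C}+E_k^{R}\right), \tag{4}$$

where $\boldsymbol{\delta}=\{\delta_{k,i}\}$ and $\mathbf{b}=\{b_{k,i}\}$ for $k\in\mathcal{K}$, $i=0,\ldots,h(k)$. Our objective is to minimize the total energy consumption of the network with a QoI constraint for end users by choosing the compression ratio vector $\boldsymbol{\delta}$ and caching decision vector $\mathbf{b}$ in the network. Therefore, the optimization problem is

$$\begin{aligned}
\min_{\boldsymbol{\delta},\mathbf{b}}\quad & E^{\text{total}}(\boldsymbol{\delta},\mathbf{b})\\
\text{s.t.}\quad & \sum_{k\in\mathcal{K}} y_k\prod_{i=0}^{h(k)}\delta_{k,i}\ \ge\ \gamma,\\
& b_{k,i}\in\{0,1\},\quad \forall k\in\mathcal{K},\ i=0,\ldots,h(k),\\
& \sum_{k\in C_v} b_{k,h(v)}\,y_k\prod_{j=h(v)}^{h(k)}\delta_{k,j}\ \le\ S_v,\quad \forall v\in V,\\
& \sum_{i=0}^{h(k)} b_{k,i}\ \le\ 1,\quad \forall k\in\mathcal{K},
\end{aligned} \tag{5}$$

where $h(v)$ is the depth of node $v$ in the tree.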

The first constraint is the QoI constraint, i.e., the total data available at the sink node [1]. The second constraint indicates that our decision (caching) variable is binary. The third constraint is on total amount of data that can be cached at each node. The fourth constraint is that at most one copy of the generated data should be cached on the path between the leaf node and the sink node.
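The four constraints can be sketched as a feasibility check in Python. This is an illustrative sketch only: the data layout (one dictionary per leaf with keys `y`, `path`, `deltas`, `b`, where index 0 is the sink) and the names `is_feasible`, `capacity` and `gamma` are assumptions made for the example, not part of the formulation.

```python
from math import prod


def is_feasible(leaves, capacity, gamma):
    """Check the QoI, binary, capacity and single-copy constraints.

    leaves: list of dicts with keys
        y      - bits generated by the leaf
        path   - node ids from the sink (index 0) down to the leaf
        deltas - reduction rate at each node on the path
        b      - 0/1 caching decision at each node on the path
    capacity: dict mapping node id -> storage capacity
    """
    # QoI: total data that reaches the sink must be at least gamma
    delivered = sum(leaf["y"] * prod(leaf["deltas"]) for leaf in leaves)
    if delivered < gamma:
        return False

    used = {}
    for leaf in leaves:
        # binary decisions, and at most one cached copy per path
        if any(x not in (0, 1) for x in leaf["b"]) or sum(leaf["b"]) > 1:
            return False
        for i, node in enumerate(leaf["path"]):
            if leaf["b"][i]:
                # data cached at level i has been compressed at levels i..leaf
                stored = leaf["y"] * prod(leaf["deltas"][i:])
                used[node] = used.get(node, 0.0) + stored

    # storage capacity at every node that caches something
    return all(used[node] <= capacity[node] for node in used)
```

A placement that stores two copies on one path, exceeds a node's capacity, or compresses so aggressively that less than `gamma` bits reach the sink is rejected.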

The optimization problem in (5) is a non-convex MINLP problem with $\sum_{k\in\mathcal{K}}(h(k)+1)$ continuous variables, the $\delta_{k,i}$'s, and the same number of binary variables, the $b_{k,i}$'s.

III-A Properties

We first analyze the complexity of the problem given in (5). There are two types of decision variables in (5), i.e., the compression ratios $\boldsymbol{\delta}$ and the caching decision variables $\mathbf{b}$. To analyze the impact of these variables on the complexity of the problem, we consider two cases: (i) given the caching decision variables $\mathbf{b}$, solving for the optimal compression rates $\boldsymbol{\delta}$, and (ii) given the compression ratios $\boldsymbol{\delta}$, solving for the optimal cache placement decisions $\mathbf{b}$.

III-A1 Given Caching Decisions

For given caching decision variables $\mathbf{b}$, the optimization problem in (5) turns into a constrained polynomial minimization over the positive quadrant (PMoP) [21] with respect to the compression ratios $\boldsymbol{\delta}$, which is an NP-hard problem [21].

Theorem 1.

Given fixed caching decisions $\mathbf{b}$, the optimization problem in (5) is NP-hard.


We prove the hardness by reduction from the classical job-shop problem, which is NP-hard [22].

We can reduce the job-shop problem to our problem in (5) with fixed caching decisions as follows. Consider each node in our model to be a machine . Denote the set of machines as . The compression rate constitutes the set of jobs , where indicates the compression rate at any machine . Let be the set of all sequential job assignments to different machines so that every machine performs every job only once. The elements can be written as matrices, where column order-wise lists the sequential jobs that the machine will perform. There is a cost function that captures the cost (energy) for any machine to perform a particular job (compression) along with data transmission, reception and caching. Our objective in the optimization problem (5) is to find assignments of job to minimize energy consumption, which is equivalent to the classical job-shop problem. Since job-shop problem is NP-hard [22], our problem in (5) with given cache placement decision is also NP-hard.

III-A2 Given Compression Ratios

Given compression ratios $\boldsymbol{\delta}$, the optimization problem in (5) is only over the caching decision variables $\mathbf{b}$. Hence, we obtain an integer programming problem, which is NP-hard.

Theorem 2.

Given fixed compression ratios $\boldsymbol{\delta}$, the optimization problem in (5) is NP-hard.


We prove the hardness by reduction from the classical job-shop problem, which is NP-hard [22]. The proof is the same as the proof for Theorem 1. However, the job in this case is whether to cache the data or not i.e., the caching decision constitutes the set of jobs , where means that the data is cached and means otherwise. The cost function in this case captures the cost (energy) for any machine to cache the data (along with data transmission, reception and compression). Our objective in the optimization problem (5) (with a fixed compression ratio) is to find assignments of job to minimize energy consumption, which is equivalent to the classical job-shop problem. ∎

Therefore, given the results in Theorems 1 and 2, we know that our optimization problem is NP-hard in general.

Corollary 1.

The optimization problem defined in (5) is NP-hard.

Remark 1.

The objective function defined in (5) is monotonically increasing in the number of requests $R_k$ for all $k\in\mathcal{K}$, provided that $\boldsymbol{\delta}$ and $\mathbf{b}$ are fixed.

Notice that (2) is independent of $R_k$ and (3) is linear in $R_k$, and its multipliers are positive. Hence, for any fixed $\boldsymbol{\delta}$ and $\mathbf{b}$, (4) increases monotonically with $R_k$.

Remark 2.

Given a fixed network scenario, if we increase the number of requests $R_k$ for the data generated by leaf node $k$, then these data will be cached closer to the sink node or at the sink node, if there exists enough cache capacity, to reduce the overall energy consumption.

For fixed $\boldsymbol{\delta}$, observe from (3) that energy consumption decreases if the cache is moved closer to the root, as the nodes deeper in the tree do not need to retransmit.

III-B Relaxation of Assumptions

In our model, we make several assumptions for the sake of simplicity. In the following, we discuss the relaxation of these assumptions.

While we assume that the network is structured as a tree, this assumption can be easily relaxed as long as there exists a simple fixed path from each leaf node to the sink node. The tree structure represents a simple topology that captures the key parameters in the optimization formulation without the complexity introduced by a general network topology. Furthermore, for simplicity, we assume that all parameters across the nodes are identical, which is not necessary as seen from the cost function. We also assume that only leaf nodes generate data. However, our model can be extended to allow intermediate nodes to generate data at the cost of added complexity.

IV Variant of Spatial Branch-and-Bound Algorithm

In this section, we present a variant of the Spatial Branch-and-Bound algorithm (V-SBB). Instead of solving the MINLP problem (5) directly, we use V-SBB to solve a standard form of the original MINLP. We first introduce the Symbolic Reformulation [7] method that reformulates the MINLP (5) into a standard form needed by V-SBB.

Definition 1.

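A MINLP problem is said to be in a standard form if it can be written as

$$\begin{aligned}
\min_{\mathbf{w}}\quad & w_{\text{obj}}\\
\text{s.t.}\quad & \mathbf{A}\mathbf{w}=\mathbf{b},\\
& \mathbf{w}^{l}\le\mathbf{w}\le\mathbf{w}^{u},\\
& w_k\equiv w_i w_j,\quad \forall (i,j,k)\in\mathcal{T}_{bt},\\
& w_k\equiv w_i/w_j,\quad \forall (i,j,k)\in\mathcal{T}_{lft},
\end{aligned} \tag{6}$$

where the vector of variables $\mathbf{w}$ consists of continuous and discrete variables in the original MINLP. The sets $\mathcal{T}_{bt}$ and $\mathcal{T}_{lft}$ contain all relationships that arise in the reformulation. $\mathbf{A}$ and $\mathbf{b}$ are a matrix and a vector of real coefficients, respectively. The index obj denotes the position of a single variable corresponding to the objective function value within the vector $\mathbf{w}$.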

Theorem 3.

The non-convex MINLP problem (5) can be transformed into a standard form.

Due to space constraints, we relegate detailed reformulations (see Appendix B for details of symbolic reformulation) and standard form of (5) to Appendix A.

Here, we give an example to illustrate the above reformulation process.

Example 2.

Consider the same network in Example 1, the non-convex MINLP problem becomes


is a bilinear term. Based on symbolic reformulation rules, a new bilinear auxiliary variable needs to be added. The first constraint in (2) is then transformed into which is linear in auxiliary variable . Similarly, we add for linear-fractional term that appears in in the third constraint of (2) is a tri-linear term. Since is replaced by , we obtain a bilinear term . Again, based on symbolic reformulation rules, is replaced by a new auxiliary variable . Similarly we add new auxiliary variables , and . The objective function in (2) can be then expressed as a function of these new auxiliary variables. Therefore, the standard form of (2) is


Through this reformulation, the non-convex and non-linear terms in the original problem are transformed into bilinear and linear fractional terms, which can be readily relaxed to compute the lower bound of each region in V-SBB, as discussed in detail later. This is the reason V-SBB requires reformulating the original problem into a standard form.
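To see why bilinear terms are convenient for lower bounding, the following Python sketch evaluates the standard McCormick envelope of a single bilinear term $w = xy$ over a box. It is a generic illustration of that relaxation rather than the paper's implementation; the function name `mccormick_bounds` and the box values are hypothetical. A linear fractional term $w = x/y$ can be handled the same way after rewriting it as the bilinear relation $x = wy$.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Bounds on w = x*y implied by the McCormick envelope on [xl, xu] x [yl, yu].

    The two under-estimators and two over-estimators are linear in (x, y),
    so replacing w = x*y by these four inequalities gives a linear relaxation.
    """
    lower = max(xl * y + x * yl - xl * yl,
                xu * y + x * yu - xu * yu)
    upper = min(xu * y + x * yl - xu * yl,
                xl * y + x * yu - xl * yu)
    return lower, upper


# At the centre of the unit box the true product is 0.25,
# while the envelope only guarantees 0 <= w <= 0.5.
print(mccormick_bounds(0.5, 0.5, 0.0, 1.0, 0.0, 1.0))

# Shrinking the box around the same point tightens the envelope,
# which is what branching achieves in a spatial branch-and-bound search.
print(mccormick_bounds(0.5, 0.5, 0.4, 0.6, 0.4, 0.6))
```

The envelope is exact at the corners of the box and loosest in its interior, so the gap between the relaxation and the exact problem shrinks as regions are partitioned.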

Theorem 4.

Reformulated problem and the original MINLP are equivalent.

Proof is available in Section (page ) [7].

Due to the reformulation, the number of variables in the reformulated problem is larger than in the original MINLP. In the following, we show that the number of auxiliary variables that arise from symbolic reformulation is bounded.

Remark 3.

The number of auxiliary variables in the symbolic reformulation is $O(n^2)$, where $n$ is the number of variables in the original formulation.

From [23], a way to transform a general form optimization problem into a standard form (6) is through basic arithmetic operations on the original variables. To be more specific, any algebraic expression results from basic operators, including the five basic binary operators, i.e., addition, subtraction, multiplication, division and exponentiation, and the unary operators, e.g., logarithms. Therefore, in order to construct a standard problem consisting of simple terms corresponding to these binary or unary operations, new variables need to be added corresponding to these operations. From the symbolic reformulation process [23, 24, 25], any added variable results from a basic operation between two (possibly identical) original or added variables. Hence, based on the basic operations, there are at most $n^2$ combinations of these variables, given that there are $n$ variables in the original problem (5). Therefore, the number of added variables in the symbolic reformulation is bounded as $O(n^2)$.

In the remainder of this section, we present the V-SBB algorithm to solve the equivalent problem.

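Step 1: Initialize $\phi^{u} := \infty$ and $\mathcal{L}$ to a single domain
Step 2: Choose a subregion $\mathcal{R} \in \mathcal{L}$ using least lower bound rule
    if $\mathcal{L} = \emptyset$ then Go to Step 6
    if for chosen region $\mathcal{R}$, $\phi^{\mathcal{R},l}$ is infeasible or $\phi^{\mathcal{R},l} \ge \phi^{u} - \epsilon$ then Go to Step 5
Step 3: Obtain the upper bound $\phi^{\mathcal{R},u}$
    if upper bound cannot be obtained or if $\phi^{\mathcal{R},u} > \phi^{u}$ then Go to Step 4
    else $\phi^{u} := \phi^{\mathcal{R},u}$ and, from the list $\mathcal{L}$, delete all subregions $\mathcal{S} \in \mathcal{L}$ such that $\phi^{\mathcal{S},l} \ge \phi^{u} - \epsilon$
    if $\phi^{\mathcal{R},u} - \phi^{\mathcal{R},l} \le \epsilon$ then Go to Step 5
Step 4: Partition $\mathcal{R}$ into new subregions $\mathcal{R}_{right}$ and $\mathcal{R}_{left}$
Step 5: Delete $\mathcal{R}$ from $\mathcal{L}$ and go to Step 2
Step 6: Terminate Search
    if $\phi^{u} = \infty$ then Problem is infeasible
    else $\phi^{u}$ is $\epsilon$-global optimal
Algorithm 1 Variant of Spatial Branch-and-Bound (V-SBB)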
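The control flow of Algorithm 1 can be illustrated on a toy problem with a short Python sketch. It is not the V-SBB implementation: the objective is a hypothetical one-dimensional non-convex polynomial, the lower bound comes from simple interval reasoning rather than a McCormick-based MILP, and the upper bound is obtained by evaluating the objective at the region midpoint rather than by a local MINLP solver. Only the least-lower-bound selection, fathoming, and bisection steps mirror the algorithm.

```python
import heapq


def f(x):
    """Toy non-convex objective with two local minima on [-2, 2]."""
    return x**4 - 3 * x**2 + x


def lower_bound(a, b):
    """Valid (deliberately loose) term-by-term lower bound of f on [a, b]."""
    x4_min = 0.0 if a <= 0.0 <= b else min(a**4, b**4)
    x2_max = max(a * a, b * b)
    return x4_min - 3.0 * x2_max + a


def spatial_bb(a, b, eps=1e-4):
    best_val, best_x = float("inf"), None          # Step 1
    regions = [(lower_bound(a, b), a, b)]
    while regions:                                 # empty list -> Step 6
        lb, lo, hi = heapq.heappop(regions)        # Step 2: least lower bound rule
        if lb >= best_val - eps:
            continue                               # Step 5: fathom the region
        mid = 0.5 * (lo + hi)                      # Step 3: any feasible point
        val = f(mid)                               #         gives an upper bound
        if val < best_val:
            best_val, best_x = val, mid
            # drop regions that cannot improve on the incumbent by more than eps
            regions = [r for r in regions if r[0] < best_val - eps]
            heapq.heapify(regions)
        if val - lb <= eps:
            continue                               # Step 5: gap closed
        for c, d in ((lo, mid), (mid, hi)):        # Step 4: partition
            heapq.heappush(regions, (lower_bound(c, d), c, d))
    return best_x, best_val                        # Step 6


x_star, f_star = spatial_bb(-2.0, 2.0)
print(x_star, f_star)
```

Every discarded region has a lower bound within `eps` of the incumbent, so the returned value is within `eps` of the global minimum of this toy objective, which is the sense in which the search is $\epsilon$-global optimal.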

IV-A Our Variant of Spatial Branch-and-Bound

The proposed spatial branch-and-bound method is a variant of the method proposed in [23] and is primarily tuned for solving our reformulated optimization problem (12), whose solution is also the solution of (5). Our algorithm differs from [23] in the following ways:

  • We do not use any bounds tightening steps as it does not always guarantee faster convergence [26] and in case of our problem slowed down the process.

  • By eliminating the bounds tightening step, we do not need to calculate the lower bound again separately; we reuse the lower bound obtained in Step 2 for the chosen region $\mathcal{R}$, hence reducing the computational complexity of the algorithm.

Algorithm 1 provides an overview of the steps involved in spatial branch-and-bound algorithm. We describe some of the steps in Algorithm 1 in detail below.

Step : There are a number of approaches that can be used to choose a subregion from [27]. Here we use the least lower bound rule, i.e., we choose a subregion that has the lowest lower bound among all the subregions, since it is a widely used and well researched method. The lower bound can be obtained by solving a convex relaxation of the problem in (12). As our optimization problem in (5) and (12

) contains only bilinear and linear fractional terms, we use McCormick linear over-estimators and under-estimators

[28] (see Appendix C) to obtain a convex relaxation of all such terms. The resulting problem is then a Mixed Integer Linear Programming (MILP) problem that we solve using the SCIP solver [11]. The SCIP solver is a faster and well known solver for MILP problems. The subregion with lowest lower bound is then used as the region to explore for an optimum. The chosen regions’ lower bound is used as . If the convex relaxation is infeasible or if the obtained lower bound is higher than the existing upper bound of the problem, we fathom or delete the current region by moving to step .
Step : In step , we calculate the upper bound for the subregion chosen in Step . This can be done in a number of ways (see [23]), here we use local MINLP solver such as Bonmin [8] to obtain a local minimum for the subregion as it performed better in terms of time than using local non-linear programming optimization with fixed discrete values or added discreteness constraints in our simulation settings. If the upper bound for the region cannot be obtained or if it is greater than then we move to Step to further divide the region and search further for a better solution. Otherwise we set it as the current best solution and delete all the subregions whose lower bound is greater than the obtained upper bound since all such regions cannot contain the -global optimal solution. If the difference between the upper and lower bound for the region is within the -tolerance, the current subregion need not to be searched further, then we delete the current subregion by going to step , otherwise we move to step for further searching in the space.
Step 4: This step, also known as the branching or partitioning step, divides a region to further refine the search for a solution. In the branching step, we select a variable to branch on as well as the value of that variable at which the region is to be divided. A number of different rules and techniques can be used for branching (see [27] for a detailed discussion). Here we use the variable selection and value selection rules specified in [24], since they have been found efficient for our problem.

We branch on the variable that causes the maximum reduction in the feasibility gap between the solution of the convex relaxation (the solution of Step 2) and the exact problem. To do so, the approximation error for the bilinear and linear fractional terms in (12) is calculated using (9a) and (9b), respectively, evaluated at the variable values obtained in Step 2. The term with the maximum approximation error is selected, since branching on it tightens the gap between the relaxation and the exact problem [24]. This results in two candidate variables for branching, namely the two variables that form the selected term. If one of the variables is discrete (binary in our case) and the other is continuous, we choose the discrete variable, since it results in only a finite number of branches. If both variables are of the same type (either binary or continuous), the branching variable is chosen using (10), i.e., we choose the variable whose value is closer to the midpoint of its range. Before doing so, we need to obtain the branching value for the candidate variables (the value at which to branch), which should lie between the lower and upper bounds of the variable in the region. The rules for the choice of the branch point have been set in [24]; we restate them here for the sake of completeness.

  • Set to the value obtained in Step , i.e.,

  • If any feasible upper bound has been obtained and , then and stop the search for the value.

  • If step provided an upper bound for the subregion , then .

After obtaining the branch point value, we have all the parameters required for (10) and can then choose the variable for branching.
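The selection rule described above can be sketched in code. This is a minimal illustration under stated assumptions: the approximation error is taken as the absolute difference between the auxiliary variable and the exact term (an assumed form of (9a) and (9b)), the midpoint rule is taken as the normalized distance to the midpoint of the range (an assumed form of (10)), and the branch value is simply the relaxation value. All names (`Term`, `select_branching_variable`, the variable labels) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    kind: str  # "bilinear" for w = x*y, "fractional" for w = x/y
    w: str     # name of the auxiliary variable replacing the term
    x: str
    y: str


def approximation_error(term, sol):
    """Gap between the auxiliary variable and the exact non-linear term."""
    x, y, w = sol[term.x], sol[term.y], sol[term.w]
    exact = x * y if term.kind == "bilinear" else x / y
    return abs(w - exact)


def midpoint_distance(var, value, bounds):
    """Normalized distance of a branch value from the midpoint of the range."""
    lo, hi = bounds[var]
    if hi == lo:  # a fixed variable cannot be branched on
        return float("inf")
    return abs(value - 0.5 * (lo + hi)) / (hi - lo)


def select_branching_variable(terms, sol, bounds, binary_vars):
    # 1) Pick the term whose relaxation is least accurate.
    worst = max(terms, key=lambda t: approximation_error(t, sol))
    a, b = worst.x, worst.y
    # 2) Prefer the binary variable when the two candidates differ in type.
    if (a in binary_vars) != (b in binary_vars):
        return a if a in binary_vars else b
    # 3) Otherwise take the candidate whose value is closer to its midpoint.
    return min((a, b), key=lambda v: midpoint_distance(v, sol[v], bounds))


# Illustrative relaxation solution (made-up values).
sol = {"x": 0.5, "y": 0.5, "w1": 0.0, "b": 0.5, "z": 2.0, "w2": 0.2}
bounds = {"x": (0.0, 1.0), "y": (0.0, 4.0), "b": (0.0, 1.0), "z": (0.0, 4.0)}
terms = [Term("bilinear", "w1", "x", "y"), Term("bilinear", "w2", "b", "z")]
print(select_branching_variable(terms, sol, bounds, binary_vars={"b"}))
```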


We partition the subregion into two new subregions and add both to our region list. Then we move to Step and delete the partitioned subregion from the list.
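The overall loop formed by these steps can be illustrated on a toy problem. The following is a minimal sketch and not the V-SBB implementation: it minimizes a hypothetical one-dimensional non-convex function, uses a simple interval-arithmetic lower bound in place of the McCormick/MILP relaxation, uses a midpoint evaluation in place of the local MINLP solve, and branches by bisection. What it shares with the description above is the least-lower-bound region selection, the pruning of regions whose lower bound cannot improve on the incumbent, and the ε-termination test.

```python
import heapq


def f(x):
    """Toy non-convex objective (an assumption, not from the paper)."""
    return x**4 - 3 * x**2 + x


def lower_bound(a, b):
    """Valid lower bound of f on [a, b] from interval arithmetic."""
    sq_lo = 0.0 if a <= 0.0 <= b else min(a * a, b * b)  # min of x^2
    sq_hi = max(a * a, b * b)                            # max of x^2
    return sq_lo**2 - 3.0 * sq_hi + a


def spatial_bb(a, b, eps=1e-4, max_iter=100_000):
    best_x = 0.5 * (a + b)
    best_ub = f(best_x)                        # incumbent (upper bound)
    regions = [(lower_bound(a, b), a, b)]      # heap ordered by lower bound
    for _ in range(max_iter):
        if not regions:
            break
        lb, lo, hi = heapq.heappop(regions)    # least-lower-bound rule
        if lb >= best_ub - eps:                # eps-global optimality reached
            break
        mid = 0.5 * (lo + hi)
        if f(mid) < best_ub:                   # cheap upper bound for region
            best_x, best_ub = mid, f(mid)
        for c_lo, c_hi in ((lo, mid), (mid, hi)):   # branch by bisection
            c_lb = lower_bound(c_lo, c_hi)
            if c_lb < best_ub - eps:           # keep only promising regions
                heapq.heappush(regions, (c_lb, c_lo, c_hi))
    return best_x, best_ub


x_star, f_star = spatial_bb(-2.0, 2.0)
print(x_star, f_star)
```

Keeping the regions in a heap keyed by lower bound makes the least-lower-bound rule a constant-time lookup, and the smallest key is always a valid global lower bound, so the gap to the incumbent can be checked at every iteration.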

IV-B Convergence of Spatial Branch-and-Bound

The spatial branch-and-bound method guarantees convergence to ε-global optimality, as proven in [26]. For the sake of completeness, we restate the proof in Appendix D.

V Evaluation

Fig. 3: Candidate network topologies used in the experiments: (a) one sink node and one leaf node; (b) one sink node and two leaf nodes; (c) one sink node, one intermediate node and two leaf nodes; and (d) one sink node, two intermediate nodes and four leaf nodes.

We evaluate the performance of our communication, compression and caching (C3) joint optimization framework through a series of experiments on several network topologies, as shown in Figure 3. Our goal is to analyze the performance of C3 and to assess the improvement in energy efficiency that can be achieved by jointly considering the C3 costs, compared with C2. While highlighting the performance gain is valuable, characterizing the performance of C3 under different settings and parameters, and obtaining the optimal caching location and data compression rate, are also of great significance. We also compare the performance of our V-SBB algorithm with other well-known solvers.

The highlights of the evaluation results are:

  • Our C3 joint optimization framework improves energy efficiency by as much as 88% compared to the C2 optimization over communication and computation, or communication and caching. This shows the significance of jointly considering the C3 energy costs.

  • The improvement in energy efficiency with the C3 framework grows with the number of requests and the network size. Furthermore, data of the nodes with the largest number of requests are cached at the sink node or closer to the sink node.

  • When comparing different MINLP solvers, the V-SBB algorithm obtains an ε-global optimal solution in most situations. We vary the network parameters and find that V-SBB is able to obtain a feasible solution in all settings. SCIP, Baron, Bonmin and Antigone are faster in obtaining solutions. However, they are either unable to obtain solutions in all settings, or they provide an objective value higher than that of our algorithm, particularly for lower values of the QoI threshold.

V-A Methodology

Our primary goal is to highlight the improvement in energy efficiency achieved by the C3 framework when compared with C2. We define the energy efficiency as:


where and are the optimal energy costs under the C3 optimization framework in (5) and the C2 optimization, respectively. reflects the reduction in energy cost of the C3 optimization relative to the C2 optimization. While the increase in energy efficiency using the C3 framework is noteworthy, it is also important to characterize the magnitude of the improvement and the parameters that significantly impact the energy efficiency. Such a characterization can help in identifying the operation regions for the network and in devising heuristic algorithms for specific operation regions. We also compare the performance of V-SBB with other MINLP solvers and show that it performs comparably with them for our C3 optimization.


Solver | Characteristics
Bonmin [8] | A deterministic approach based on the Branch-and-Cut method that solves the relaxation problem with the Interior Point Optimization tool (IPOPT), and the mixed integer problem with COIN-OR Branch and Cut (CBC).
NOMAD [9] | A stochastic approach based on the Mesh Adaptive Direct Search (MADS) algorithm that guarantees local optimality. It can be used to solve non-convex MINLPs and has relatively good performance.
GA [29] | A meta-heuristic stochastic approach that can be tuned to solve global optimization problems. We use the implementation in Matlab's Optimization Toolbox.
SCIP [11] | One of the fastest non-commercial, deterministic global optimization solvers; it uses a branch-and-bound algorithm for solving MINLP problems.
Baron [10] | A deterministic global solver for MINLP problems that relies on a Branch-and-Cut approach.
Antigone [12] | A deterministic global solver for MINLP problems that exploits the special structure of the problem and uses a Branch-and-Cut approach.
TABLE II: Characteristics of the solvers used in this paper

Setup: We implement V-SBB in Matlab on a Core i GHz CPU with GB RAM. The candidate MINLP solvers in this work include Bonmin, NOMAD and GA, which are implemented with the OPTI Toolbox [30]. We summarize the characteristics of these solvers in Table II. Note that these solvers can be applied directly to the original optimization problem in (5), while our V-SBB solves the equivalent problem. The required reformulations are executed by a Java-based module, and we derive the bounds on the auxiliary variables. We also relax the integer constraint in (5) to obtain a non-linear programming problem, which is solved by IPOPT [31] and used as a benchmark for comparison. V-SBB terminates when ε-optimality is obtained or a computation timer of seconds expires. We take in our study. If the timer expires, the last feasible solution is taken as the best solution. For cases where no solution is obtained within the specified timer, we increase the timer limit to seconds. Our simulation parameters are provided in Table III; these are typical values used in the literature [1, 17, 32].

Parameter Value Parameter Value (Joules)
1000 50 10
100 200 10
1.88 10 80 10
10s []
TABLE III: Parameters used in simulations
Fig. 4: Total Energy Costs vs. Number of Requests.
Solver | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s)
Bonmin | 0.010 | 0.076 | 0.018 | 0.07 | 0.026 | 0.071 | 0.032 | 0.077 | 0.039 | 0.102
NOMAD | 0.012 | 1.036 | 0.038 | 0.739 | 0.033 | 0.640 | 0.038 | 0.203 | 0.039 | 0.263
GA | 0.010 | 0.286 | 0.018 | 2.817 | 0.026 | 7.670 | 0.042 | 11.020 | 0.064 | 3.330
V-SBB | 0.010 | 18.231 | 0.018 | 17.389 | 0.026 | 12.278 | 0.032 | 7.327 | 0.039 | 19.437
SCIP | Inf | 0.07 | 0.0012 | 0.07 | 0.005 | 0.05 | 0.011 | 0.087 | 0.039 | 0.05
Baron | 0.01 | 0.91 | 0.018 | 0.79 | 0.026 | 0.77 | 0.032 | 0.87 | 0.039 | 0.49
Antigone | 0.01 | 0.195 | 0.018 | 0.18 | 0.026 | 0.175 | 0.032 | 0.19 | 0.039 | 0.2
TABLE IV: The Best Solution to the Objective Function (Obj.) and Convergence Time for the two-node network
Solver | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s)
Bonmin | 0.0002 | 0.214 | 0.039 | 0.164 | 0.078 | 0.593 | 0.117 | 0.167 | 0.156 | 0.212
NOMAD | 0.004 | 433.988 | 0.121 | 381.293 | 0.108 | 203.696 | 0.158 | 61.093 | 0.181 | 26.031
GA | 0.043 | 44.538 | 0.096 | 30.605 | 0.164 | 44.970 | 0.226 | 17.307 | 0.303 | 28.820
V-SBB | 0.0001 | 1871.403 | 0.039 | 25.101 | 0.078 | 30.425 | 0.117 | 23.706 | 0.156 | 19.125
SCIP | NC | 5901.7 | NC | 7200 | NC | 4829.4 | NC | 7200 | 0.156 | 1.37
Baron | 0.0002 | 0.74 | 0.039 | 1002.14 | 0.078 | 7200 | 0.117 | 3.41 | 0.156 | 0.15
Antigone | 0.0002 | 3.57 | 0.039 | 0.38 | 0.081 | 0.34 | 0.117 | 0.32 | 0.156 | 0.13
TABLE V: The Best Solution to the Objective Function (Obj.) and Convergence Time for the seven-node network

V-B Efficacy of the C3 Framework

Figure 4 shows the increase in energy consumption with increase in the number of requests in different network and compression settings. We observe that as the number of requests increases, the total energy cost increases, as reflected in Remark 1. An important observation is that the initial increase in the energy cost is large. However, when the data are cached (number of requests ), the slope decreases. This is because the transmission cost is usually much larger than the caching cost (using the energy proportional model for caching [5]) and once the data are cached, the cached copy is used to satisfy other requests.

To evaluate the energy efficiency, we compare the total energy costs under joint C3 optimization with those under C2 optimization. We consider two cases for the C2 optimization: (i) C2o (Communication and Computation), where we set for each node to avoid any data caching; (ii) C2a (Communication and Caching), where we set which is equivalent to , i.e., no computation. The comparison between C3, C2o and C2a is shown in Figure 5. For the parameters used in our simulation, the energy cost of the C3 joint optimization is lower than that of the C2o optimization under the same parameter setting. This highlights the improvement that can be achieved using the C3 framework. In other words, although C3 incurs caching costs, it may significantly reduce the communication and computation costs, which in turn brings down the total energy cost. Similarly, the C3 optimization outperforms C2a. Using Equation (11), energy efficiency improves by as much as for the C3 framework when compared with the C2 formulation. These trends are also observed in the other candidate network topologies. Figure 6 shows the improvement that C3 brings in comparison with C2 for the two-node network. Using Equation (11), energy efficiency improves by as much as for the C3 framework when compared with the C2 formulation. The results for the three-node and four-node networks are presented in Tables VIII and IX.

Remark 4.

Note that the above results are based on parameter values typically used in the literature, as shown in Table III. From our analysis, it is clear that the larger the ratio between and , , the larger will be the improvement provided by our C3 formulation.

V-C Comparison of Solvers

We compare the performance of our proposed V-SBB with other MINLP solvers in terms of:

V-C1 The Best Solution to the Objective Function

We compare the performance of V-SBB with the other candidate solvers for the networks in Figure 3. The results for the two-node and seven-node networks are presented in Tables IV and V. We observe that V-SBB, Bonmin, SCIP, Antigone, and Baron achieve comparable objective function values for larger values of the QoI threshold, while V-SBB outperforms the other algorithms for lower values of the QoI threshold (discussed in detail later). Furthermore, in some cases Bonmin and SCIP cannot generate a feasible solution even though one exists. For Bonmin in particular, there are a number of probable reasons for this: a) For MINLP problems with non-convex functions, Bonmin relies on heuristic options and does not guarantee ε-global optimality [33]; the heuristics can cause such problems. b) The Branch-and-Cut method used by Bonmin is based on the outer-approximation (OA) algorithm [34]. For an MINLP with non-convex functions, OA constraints do not necessarily result in valid inequalities for the problem. Hence Bonmin's Branch-and-Cut method sometimes cuts off regions where a lower value exists. NOMAD and GA in general yield a higher objective function value than V-SBB. This is because both NOMAD and GA are based on a stochastic approach, which cannot guarantee convergence to the ε-global optimum. Similar trends are observed for the three-node and four-node networks.
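The remark about outer-approximation cuts can be made concrete with a small sketch. It uses a hypothetical one-variable non-convex constraint (not taken from the paper or from Bonmin) and shows that the linearization of the constraint at one point excludes another point that is actually feasible, which is the failure mode described above.

```python
def g(x):
    """Non-convex constraint g(x) <= 0; feasible exactly when |x| >= 1."""
    return 1.0 - x * x


def g_prime(x):
    return -2.0 * x


def oa_cut_satisfied(x, x0):
    """Outer-approximation cut built at x0: g(x0) + g'(x0) * (x - x0) <= 0."""
    return g(x0) + g_prime(x0) * (x - x0) <= 0.0


x0 = 2.0          # linearization point (feasible)
candidate = -1.5  # another feasible point, on the other side of the gap

print(g(candidate) <= 0.0)              # the point satisfies the constraint
print(oa_cut_satisfied(candidate, x0))  # but the cut built at x0 rejects it
```

For a convex g the linearization never exceeds g, so every feasible point satisfies the cut; that property is lost here because g is concave.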

V-C2 Convergence Time

The time taken to obtain the best solution is important in practice. The time that each algorithm requires to obtain its best solution, as discussed in Section V-C1, is shown in Tables IV and V for the two-node and seven-node networks, respectively. It can be seen that Bonmin, Antigone, Baron and SCIP (when it is able to provide a solution) are the fastest methods. However, Bonmin, SCIP and Baron sometimes cannot find a solution although one exists. V-SBB takes longer to obtain a better solution, because our reformulation introduces auxiliary variables and additional linear constraints. Different applications can tolerate different algorithm speeds. For the sample networks and applications under consideration, the speed of V-SBB is considered to be acceptable [27].

V-C3 Stability

From the analysis in Sections V-C1 and V-C2, we know that Bonmin is faster but unstable in some situations. We further characterize the stability of Bonmin with respect to the threshold value of QoI as follows. Specifically, we fix all other parameters in Table III, and vary only the maximal possible value of the threshold in different networks. The results are shown in Table VI. For each maximal value, we test all integer values of the threshold between 1 and the maximal value itself. Hence, the number of tests equals the maximal value. We see that the number of instances where Bonmin fails to produce a feasible solution increases as the network size increases.

Furthermore, although Bonmin, Baron and Antigone can provide a feasible solution for smaller values of the QoI threshold in less time, we observe that the objective value of their solution is larger than that of V-SBB. We compare the performance of V-SBB with these algorithms for smaller values of the QoI threshold in Table VII. We see that V-SBB outperforms Bonmin, Antigone and Baron by as much as 52.45%, 52.72%, and 51.61%, respectively, when searching for an ε-global optimum, though it requires more time. The timer is set to s for results shown in Table VII. Results for the three-node and four-node networks are given in Tables VIII and IX, respectively. For certain instances of the three-node network, SCIP provides the lowest objective function value. However, for the majority of cases, we observe trends similar to those in Tables IV and V.

Networks | (a) | (b) | (c) | (d)
Number of test values | 1000 | 2000 | 2000 | 4000
Number of infeasible solutions | 0 | 0 | 1 | 216
Infeasibility (%) | 0 | 0 | 0.05 | 5.4
TABLE VI: Infeasibility of Bonmin for networks in Figure 3
Solver | =1 | | =3 | | =5 | | =8 | | =50 |
 | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s)
Bonmin | 0.0002 | 0.214 | 0.0003 | 0.211 | 0.0003 | 0.224 | 0.0005 | 0.23 | 0.0021 | 0.364
Antigone | 0.0002 | 3.57 | 0.000317 | 2.47 | 0.000395 | 6.53 | 0.000512 | 15.61 | 0.002153 | 2.71
Baron | 0.0002 | 0.74 | 0.00031 | 4846 | 0.00039 | 7200 | 0.0005 | 7200 | 0.0021 | 7200
V-SBB | 0.00011 | 1871 | 0.00015 | 2330 | 0.00019 | 1243 | 0.00047 | 1350 | 0.0020 | 3325
Improvement over Bonmin (%) | 52.45 | | 49.43 | | 50.30 | | 7.59 | | 4.62 |
Improvement over Antigone (%) | 50 | | 52.72 | | 51.92 | | 8.27 | | 7.08 |
Improvement over Baron (%) | 50 | | 51.61 | | 51.28 | | 6 | | 4.79 |
TABLE VII: Comparison of V-SBB with Bonmin, Antigone and Baron for smaller values of the QoI threshold in the seven-node network
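The improvement rows of Table VII can be approximately reproduced with a one-line computation. The sketch below assumes that the improvement is the relative reduction in objective value achieved by V-SBB with respect to the other solver; the function name is hypothetical. Because the table prints rounded objective values, the results only approximately match the published percentages, which were presumably computed from unrounded values.

```python
def improvement_pct(reference_obj, vsbb_obj):
    """Relative reduction (%) of the V-SBB objective w.r.t. another solver."""
    return 100.0 * (reference_obj - vsbb_obj) / reference_obj


# Printed objective values from Table VII (second and last column groups).
print(round(improvement_pct(0.000317, 0.00015), 2))  # Antigone vs. V-SBB
print(round(improvement_pct(0.0021, 0.0020), 2))     # Baron vs. V-SBB
```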
Fig. 5: Comparison of C3 and C2 optimization for the seven-node network in Figure 3.
Fig. 6: Comparison of C3 and C2 optimization for the two-node network in Figure 3.
Solver | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s)
Bonmin | 0.005 | 0.26 | 0.01 | 0.14 | 0.019 | 0.10 | 0.028 | 0.10 | 0.0383 | 5.56
NOMAD | 0.045 | 12.42 | 0.025 | 11.14 | 0.033 | 9.30 | 0.029 | 46.41 | 0.038 | 5.45
GA | 0.005 | 0.69 | 0.025 | 26.11 | 0.019 | 16.85 | 0.034 | 40.56 | 0.044 | 10.34
V-SBB | 0.005 | 46.1 | 0.019 | 45.34 | 0.019 | 8.1 | 0.028 | 56.3 | 0.0383 | 12.2
SCIP | 0.00005 | 4.96 | 0.000056 | 0.16 | 0.000054 | 0.18 | 0.028 | 0.07 | 0.038 | 0.05
Baron | 0.005 | 0.1 | 0.01 | 0.09 | 0.019 | 0.09 | 0.028 | 0.1 | 0.0383 | 0.1
Antigone | 0.005 | 0.11 | 0.01 | 0.09 | 0.019 | 0.08 | 0.028 | 0.21 | 0.038 | 1.51
TABLE VIII: The Value of Objective Function (Obj.) and Convergence Speed for the three-node network
Solver | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s) | Obj. | Time (s)
Bonmin | 0.002 | 0.36 | 0.02 | 0.11 | 0.039 | 0.11 | 0.06 | 0.10 | 0.08 | 0.16
NOMAD | 0.003 | 112.5 | 0.023 | 97.68 | 0.04 | 59.86 | 0.06 | 52.8 | 0.10 | 2.28
GA | 0.004 | 1.01 | 0.02 | 24.94 | 0.04 | 13.02 | 0.12 | 27.7 | 0.14 | 35.33
V-SBB | 0.02 | 400 | 0.02 | 400 | 0.039 | 400 | 0.071 | 400 | 0.078 | 400
SCIP | 0.002 | 7.05 | 0.02 | 1999.4 | 0.0004 | 2.00 | 0.009 | 0.43 | 0.04 | 0.16
Baron | 0.002 | 0.52 | 0.02 | 2.69 | 0.039 | 0.89 | 0.06 | 0.16 | 0.078 | 0.1
Antigone | 0.002 | 21.2 | 0.02 | 0.26 | 0.042 | 0.18 | 0.06 | 0.1 | 0.078 | 0.08
TABLE IX: The Value of Objective Function (Obj.) and Convergence Speed for the four-node network

VI Conclusion

We have investigated energy efficiency tradeoffs among communication, computation and caching with a QoI guarantee in distributed networks. We first formulated an optimization problem that characterizes these energy costs. This optimization problem belongs to the non-convex class of MINLP, which is hard to solve in general. We then proposed a variant of the spatial branch-and-bound (V-SBB) algorithm, which can solve the MINLP with an ε-optimality guarantee. Finally, we showed numerically that the newly proposed V-SBB algorithm outperforms the existing MINLP solvers Bonmin, NOMAD and GA. We also observed that the C3 optimization framework, which to the best of our knowledge has not been investigated in the literature, leads to an energy saving of as much as 88% when compared with either of the C2 optimizations, which have been widely studied.

Going further, we aim to extend our results in two ways. The first is to refine and improve the symbolic reformulation to reduce the number of needed auxiliary variables in order to shorten the algorithm execution time. Second, since many networking problems involve the optimization of both continuous and discrete variables as in this work, we plan to apply and extend the newly proposed V-SBB to solve those problems.


The material in this paper has been accepted for publication in part at IEEE Globecom, Abu Dhabi, United Arab Emirates, December 2018. This work was supported by the U.S. Army Research Laboratory and the U.K. Ministry of Defence under Agreement Number W911NF-16-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. Faheem Zafari also acknowledges the financial support by the EPSRC Centre for Doctoral Training in High Performance Embedded and Distributed Systems (HiPEDS, Grant Reference EP/L016796/1), and the Department of Electrical and Electronics Engineering, Imperial College London. The authors would also like to thank Dr. Ruth Misener and the Chemical Engineering Department at Imperial College London for providing access to the Baron and Antigone solvers.


  • [1] S. Nazemi, K. K. Leung, and A. Swami, “QoI-aware Tradeoff Between Communication and Computation in Wireless Ad-hoc Networks,” in Proc. IEEE PIMRC, 2016.
  • [2] R. Rajagopalan and P. K. Varshney, “Data Aggregation Techniques in Sensor Networks: A Survey,” IEEE Commun. Surveys Tuts., vol. 8, no. 4, pp. 48–63, 2006.
  • [3] E. Fasolo, M. Rossi, J. Widmer, and M. Zorzi, “In-network Aggregation Techniques for Wireless Sensor Networks: a Survey,” IEEE Wireless Communications, vol. 14, no. 2, 2007.
  • [4] K. C. Barr and K. Asanović, “Energy-aware Lossless Data Compression,” ACM Transactions on Computer Systems, 2006.
  • [5] N. Choi, K. Guan, D. C. Kilper, and G. Atkinson, “In-network Caching Effect on Optimal Energy Consumption in Content-Centric Networking,” in Proc. IEEE ICC, 2012.
  • [6] S. A. Ehikioya, “A Characterization of Information Quality Using Fuzzy Logic,” in NAFIPS, 1999.
  • [7] E. M. Smith and C. C. Pantelides, “Global Optimisation of General Process Models,” in Glo. Opt. Eng. Des. Springer, 1996, pp. 355–386.
  • [8] P. Bonami et al., “An Algorithmic Framework for Convex Mixed Integer Nonlinear Programs,” Disc. Opt., vol. 5, no. 2, pp. 186–204, 2008.
  • [9] S. Le Digabel, “Algorithm 909: NOMAD: Nonlinear Optimization with the MADS Algorithm,” ACM TOMS, vol. 37, no. 4, p. 44, 2011.
  • [10] M. Tawarmalani and N. V. Sahinidis, “A polyhedral branch-and-cut approach to global optimization,” Mathematical Programming, vol. 103, no. 2, pp. 225–249, 2005.
  • [11] T. Achterberg, “SCIP: Solving Constraint Integer Programs,” Mathematical Programming Computation, vol. 1, no. 1, pp. 1–41, 2009.
  • [12] R. Misener and C. A. Floudas, “Antigone: algorithms for continuous/integer global optimization of nonlinear equations,” Journal of Global Optimization, vol. 59, no. 2-3, pp. 503–526, 2014.
  • [13] O. Boykin, S. Ritchie, I. O’Connell, and J. Lin, “Summingbird: A Framework for Integrating Batch and Online Mapreduce Computations,” Proc. of VLDB, 2014.
  • [14] J. Li, S. Shakkottai, J. C. S. Lui, and V. Subramanian, “Accurate Learning or Fast Mixing? Dynamic Adaptability of Caching Algorithms,” IEEE Journal on Selected Areas in Communications, 2018.
  • [15] S. Ioannidis and E. Yeh, “Adaptive Caching Networks with Optimality Guarantees,” in Proc. of ACM SIGMETRICS, 2016.
  • [16] J. Li, F. Zafari, D. Towsley, K. K. Leung, and A. Swami, “Joint Data Compression and Caching: Approaching Optimality with Guarantees,” in Proc. of ACM/SPEC ICPE, 2018.
  • [17] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-Efficient Communication Protocol for Wireless Microsensor Networks,” in System sciences, 2000.
  • [18] A. Manjeshwar and D. P. Agrawal, “TEEN: a Routing Protocol for Enhanced Efficiency in Wireless Sensor Networks,” in IPDPS, 2001.
  • [19] M. Ye, C. Li, G. Chen, and J. Wu, “EECS: an Energy Efficient Clustering Scheme in Wireless Sensor Networks,” in Proc. of IEEE IPCCC, 2005.
  • [20] S. Eswaran, J. Edwards, A. Misra, and T. F. L. Porta, “Adaptive In-Network Processing for Bandwidth and Energy Constrained Mission-Oriented Multihop Wireless Networks,” IEEE Transactions on Mobile Computing, vol. 11, no. 9, pp. 1484–1498, Sept 2012.
  • [21] M. Chiang et al., “Geometric programming for communication systems,” Foundations and Trends® in Communications and Information Theory, vol. 2, no. 1–2, pp. 1–154, 2005.
  • [22] A. S. Jain and S. Meeran, “Deterministic Job-Shop Scheduling: Past, Present and Future,” European journal of operational research, vol. 113, no. 2, 1999.
  • [23] E. M. Smith and C. C. Pantelides, “A Symbolic Reformulation/Spatial Branch-and-Bound Algorithm for the Global Optimisation of Nonconvex MINLPs,” Comp. & Chem. Eng., vol. 23, no. 4, pp. 457–478, 1999.
  • [24] E. M. Smith, “On the Optimal Design of Continuous Processes,” Ph.D. dissertation, Imperial College London (University of London), 1996.
  • [25] L. Liberti, “Reformulation and Convex Relaxation Techniques for Global Optimization,” 4OR: A Quarterly Journal of Operations Research, vol. 2, no. 3, pp. 255–258, 2004.
  • [26] L. Liberti, “Reformulation and Convex Relaxation Techniques for Global Optimization,” Ph.D. dissertation, Imperial College London, 2004.
  • [27] C. A. Floudas, Deterministic Global Optimization: Theory, Methods and Applications. Springer Science & Business Media, 2013, vol. 37.
  • [28] G. P. McCormick, “Computability of Global Solutions to Factorable Nonconvex Programs: Part I—Convex Underestimating Problems,” Mathematical Programming, vol. 10, no. 1, pp. 147–175, 1976.
  • [29] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
  • [30] OPTI Toolbox, “A Free Matlab Toolbox for Optimization,” https://www.inverseproblem.co.nz/OPTI/index.php/Main/HomePage, [Online; accessed 28-Jun-2017].
  • [31] A. Wächter and L. T. Biegler, “On the Implementation of an Interior-point Filter Line-search Algorithm for Large-scale Nonlinear Programming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006.
  • [32] W. Ye, J. Heidemann, and D. Estrin, “An Energy-Efficient MAC Protocol for Wireless Sensor Networks,” in Proc. of IEEE INFOCOM, 2002.
  • [33] A. Fiat and P. Sanders, “Algorithms-esa 2009,” Lecture Notes in Computer Science, vol. 5757, 2009.
  • [34] P. Bonami and J. Lee, “BONMIN Users’ Manual,” https://projects.coin-or.org/Bonmin/browser/stable/1.5/Bonmin/doc/BONMIN_UsersManual.pdf?format=raw, 2011.

Appendix A