I Introduction
Among the disruptive changes introduced by 5G networks, a major one is the blurring of the distinction between forwarding equipment (e.g., switches) and computational facilities (e.g., servers). Indeed, backhaul and fronthaul nodes of 5G networks (hereinafter referred to as B/F nodes) will be endowed with computational, storage, and networking capabilities, allowing them to run any virtual network function (VNF), from switches to video transcoders. VNFs are subsequently combined into VNF graphs, which define the services made available to higher network layers or third parties (e.g., vertical industries operating in the automotive, e-health, or media domain).
In this context, the entities of the management and orchestration (MANO) framework are in charge of making and implementing a set of complex decisions, including (i) activation of B/F nodes, so as to minimize the energy they consume, hence the costs for the operator; (ii) which VNF instances each B/F node shall run, in order to honor the delay constraints associated with the supported services; (iii) how traffic should be routed through the links connecting the B/F nodes. In traditional networks, these decisions could be made separately, owing to the fact that they concern different sets of equipment. Network design problems took as an input a static traffic matrix and, similarly, server placement problems assumed a known and immutable network topology. In 5G, on the other hand, decisions – e.g., activating or deactivating a B/F node – affect both the forwarding and computational capabilities of the network. It follows that traditional approaches may be ineffective, and often not even viable.
The nature of 5G traffic further exacerbates this challenge. Indeed, as exemplified in Fig. 1, traffic flows in 5G need to traverse a logical graph whose vertices are VNFs; such graphs can have arbitrary complexity and are not restricted to being chains or directed acyclic graphs (DAGs). The task of the MANO entities can be described as matching such a logical graph with a physical graph whose vertices are B/F nodes and whose edges are the links, be they physical or virtual, that connect them. Such a matching must account for the fact that the quantity of traffic does not remain constant across processing steps (i.e., VNFs); in other words, the usual flow conservation laws do not hold.
Fig. 1, which shows the VNFs composing the virtual Evolved Packet Core (vEPC), is a typical example of this situation. Data-plane traffic flows from the remote radio head (RRH) to the eNodeB (eNB), and thence to the Packet/Service Gateway (P/SGW). However, such a flow generates additional control-plane flows, e.g., going from the eNB to the Home Subscriber Server (HSS) through the Mobility Management Entity (MME). Even data traffic may not remain constant: as an example, firewalls and deep packet inspection (DPI) VNFs can drop some flows, thereby decreasing the network traffic from one processing step to the next.
Along with these challenges, the hybrid nature of 5G network nodes and their ability to be programmed through software result in significant opportunities, including the possibility to optimize the management of the network. Indeed, optimization is traditionally used in network design, but it is regarded as too complex for real-time network management. In our work, we depart from this vision and integrate optimization within the MANO framework, thereby allowing its entities to make and implement high-quality, real-time decisions.
The main contributions of our paper are as follows:

a model, capturing the unique features of 5G network nodes (e.g., their hybrid nature) and of the traffic they serve (e.g., no flow conservation);

a problem formulation, allowing us to make joint decisions on (i) B/F node activation, (ii) number and placement of the VNF instances, and (iii) traffic routing;

a solution concept, named OptiLoop, predicated on integrating optimization in the loop of the decisions made by MANO entities, namely the NFV orchestrator (NFVO);

two implementations of OptiLoop, one within a real-world testbed based on OpenStack and OpenDaylight, and one within a larger-scale network emulated in Mininet.
The remainder of this paper is organized as follows. We review related work in Sec. II, and explain how our own work fits within the management and orchestration (MANO) framework proposed by ETSI in Sec. III. Next, we present our system model and problem formulation in Sec. IV, and detail the OptiLoop solution strategy in Sec. V. We then describe our testbeds’ architecture, reference scenario and benchmarks in Sec. VI, present numerical results in Sec. VII, and conclude the paper in Sec. VIII.
II Related work
Many works on VNF placement and traffic routing, including [1, 2, 3], take the approach of matching VNF and physical topology graphs, also proposing efficient solution strategies for the ensuing mixed-integer linear programming (MILP) problems. The optimization objectives are: minimizing network usage in [1], minimizing VNF deployment cost in [2], minimizing CAPEX and OPEX in [3]. The later work [4] takes an iterative approach, making VNF placement and routing decisions when a request arrives. [5] takes the VNF placement as given and focuses on scheduling and routing.
Other works focus on the interaction between mobile operators and third parties using their services. As an example, [6] considers a market where operators bid to serve incoming demands. Among energy-aware works, [7] seeks to optimize VNF placement and job scheduling in order to minimize energy consumption. However, the algorithm presented in [7] optimizes the server utilization but neglects the energy consumed by network elements such as B/F nodes.
Among the services that can be provided through SDN/NFV-based networks, a prominent example is the EPC. As suggested by the survey in [8], ILP and MILP are the most popular modeling tools, and heuristic algorithms the most popular solution strategy. A common theme [9, 10, 11] is splitting EPC elements, e.g., the Packet Gateway (PGW) and Service Gateway (SGW), into separate sub-elements, one dealing with control traffic and the other with user traffic. [12] finds that such an approach reduces the total cost of ownership. Interestingly, other works, e.g., [13, 14], take the opposite approach and merge PGW and SGW in a single entity (the P/SGW). [13] focuses on the MME and proposes to implement it through four separate VNFs, whose number can vary so as to accommodate traffic fluctuations. Closer to our own effort is the recent work in [15], which studies the problem of placing the VNFs implementing the main EPC network functions (SGW, PGW and MME) across the available physical machines, subject to limits on their power and link capacity. A preliminary version [16] of this work addressed the same problem, albeit in simpler scenarios and with a more limited scope.
II-A Novelty
Our approach is novel with respect to the above works in several important ways:

first and foremost, the scope of our work: we jointly account for (i) the number and placement of VNF instances, (ii) traffic routing, and (iii) network management, e.g., activating/deactivating B/F nodes and links;

at the modeling level: accounting for the complexity of 5G traffic, with requests that originate at a network endpoint and traverse multiple VNFs, triggering additional requests as they do so (hence the quantity of traffic changes across processing steps);

as far as objectives are concerned: adopting energy saving as our priority and using detailed and realistic energy models, instead of proxy metrics as in [7];

from a solution strategy viewpoint: optimizing in the loop, i.e., using optimization as a tool rather than a mere analysis technique;

at implementation level: validating and testing our approach through a testbed based on OpenDaylight and OpenStack.
III OptiLoop and the ETSI MANO framework
The management and orchestration (MANO) framework, standardized by ETSI in [17], includes a set of decision-making entities (functional blocks) in charge of managing NFV-based networks, along with the interfaces (reference points) between them. The high-level goal of the framework is to map the key performance indicators (KPIs) chosen by the verticals, e.g., maximum end-to-end latency, into decisions concerning the network resources, e.g., the activation/deactivation of (virtual) servers and the placement of VNFs therein. In the following, we present a short overview of the framework and then, in Sec. III-A, discuss the relationship between the NFV orchestrator, one of the most important MANO entities, and OptiLoop.
Fig. 2, taken from [17], shows the decision-making entities of the MANO framework (within the blue area), along with the non-MANO entities they interact with. OSS/BSS (Operations and Business Support Systems), at the top-left corner, are the contact point between verticals and mobile operators: they collect the vertical requirements, expressed through end-to-end KPIs, and convey them, through the Os-Ma-nfvo reference point, to the NFV Orchestrator (NFVO). The NFVO itself is arguably one of the most important entities of the MANO framework, and is in charge of the orchestration decisions. Specifically, given the vertical requirements and the state of the network infrastructure, the NFVO determines:

how many instances of each VNF to deploy;

where in the network infrastructure they shall run;

the features of the virtual network connecting the VNF instances, e.g., the bandwidth of the links to traverse between them.
Through the Or-Vnfm interface, these decisions are sent to the VNFM (VNF Manager), which takes care of instantiating the VNFs, requesting the needed resources, e.g., virtual machines or virtual links, from the VIM (Virtual Infrastructure Manager). The VIM, in turn, interacts with the NFVI (NFV Infrastructure), which includes the servers running the VNFs and the hypervisors managing them. The VNFM also communicates with a non-MANO entity called EM (Element Manager), in charge of FCAPS (Fault, Configuration, Accounting, Performance and Security) management, in order to configure the VNFs or collect/monitor KPIs from them.
III-A The NFVO: input, output, and decisions
The NFVO is in charge of most of the orchestration tasks in the MANO framework. Owing to its importance, in the following we detail the decisions it is in charge of, along with the input information it has access to; such pieces of information correspond, respectively, to the output and input of OptiLoop.
The main input data used by the NFVO is the NSD (Network Service Descriptor), a data structure defined in [17, Sec. 6.2.1]. NSDs contain a graph description of the VNFs needed by each service, called VNFFG (VNF Forwarding Graph) [17, Sec. 6.5.1], along with deployment flavor information, including the maximum latency acceptable for each service [17, Sec. 6.2.1.3]. Furthermore, the NFVO has access to information on the network infrastructure, e.g., the state and capabilities of network and computing resources available at the NFV infrastructure, including details about the connectivity among the servers where the VNFs will be allocated.
Using all the above, the NFVO makes decisions about:

the status of network infrastructure elements, e.g., servers;

VNF lifecycle management [17, Sec. 7.2], including the host each VNF runs on;

routing, accounting for the capacity and delay of virtual links.
Such decisions will correspond to decision variables in our system model, as detailed next.
IV System model
Our model is based on two graphs, a logical one and a physical one. For simplicity, we describe it with reference to unidirectional traffic; notice however that our model and our results also account for bidirectional traffic. Tab. I summarizes all the notation we introduce below.
IV-A The logical graph
The logical graph, exemplified in Fig. 1, describes where, i.e., which endpoint, the traffic comes from, and how it is processed. Its vertices are either endpoints or VNFs . With reference to Fig. 1, we have , and .
On the logical graph, we have logical flows representing data originating from endpoint and going from VNF to VNF . Additionally, with an abuse of notation, we indicate with flows that start from endpoint and are first processed at VNF , e.g., from the RRH to the eNB in Fig. 1. Note that keeping track of the endpoint at which flows originate, i.e., having an index in our variables, serves a manifold purpose. First, it allows our model to account for the fact that different types of traffic (i.e., originating from different endpoints) may need different processing, i.e., traverse different VNF graphs. Furthermore, such VNF graphs may overlap; in this case, keeping track of the origin of the flows makes it possible to distinguish them even if they traverse the same VNF. Finally, it allows routing each flow in a different way, in both the logical and the physical graph. Notice that different traffic flows coming from the same physical endpoint can be distinguished by associating them to different logical endpoints.
Another important aspect of the system is that there is no flow conservation in the logical graph. As an example, in Fig. 1 we see a user flow of traffic unit going from the RRH to eNB and thence to the gateway, which triggers some additional control traffic from the eNB and the gateway to the MME. Indeed, the following generalized flow conservation law holds for each endpoint and VNFs :
The above expression represents the logical flow originated at endpoint , outgoing from VNF and directed to VNF . Such a quantity is equal to the sum between logical flows entering , from either a VNF or the endpoint itself, multiplied by a factor . In particular, is used to quantify the amount of logical flow directed to that is generated when traffic coming from is processed at VNF . With reference to the eNB in Fig. 1, we have , while . Similarly, for the gateway, we have . At the MME we have flow conservation, i.e., . In , we abuse the notation and allow the first index of to be an endpoint instead of a VNF. We remark that values lower than one can also represent, e.g., a firewall dropping some of the incoming traffic. Also notice that values different from one can happen for both control traffic (e.g., the eNB in Fig. 1) and user traffic (as in the case of the firewall).
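As a minimal sketch of the generalized flow conservation law, the following Python snippet propagates one unit of user traffic through a vEPC-like logical graph. The multiplier values (1.0 for user traffic, 0.1 for control traffic), the table name `ALPHA`, and the function `logical_flows` are illustrative assumptions, not values read from Fig. 1. The sketch further assumes an acyclic graph whose VNFs are listed in topological order, whereas the model itself places no such restriction on the logical graph.

```python
from collections import defaultdict

# Hypothetical multipliers: ALPHA[(prev, cur, nxt)] is the traffic sent to
# `nxt` for each unit of traffic that reached `cur` coming from `prev`.
ALPHA = {
    ("RRH", "eNB", "PSGW"): 1.0,  # user traffic forwarded unchanged (assumed)
    ("RRH", "eNB", "MME"): 0.1,   # control traffic triggered at the eNB (assumed)
    ("eNB", "PSGW", "MME"): 0.1,  # control traffic triggered at the gateway (assumed)
    ("eNB", "MME", "HSS"): 1.0,   # flow conservation at the MME
    ("PSGW", "MME", "HSS"): 1.0,
}

def logical_flows(endpoint, first_vnf, demand, alpha, order):
    """Return {(src, dst): traffic} for all logical flows of one endpoint."""
    flows = defaultdict(float)
    flows[(endpoint, first_vnf)] = demand
    for cur in order:  # `order` must be a topological order of the VNFs
        incoming = [(src, amt) for (src, dst), amt in list(flows.items())
                    if dst == cur]
        for prev, amount in incoming:
            for (p, c, nxt), factor in alpha.items():
                if p == prev and c == cur:
                    # Outgoing flow = incoming flow times the multiplier.
                    flows[(cur, nxt)] += amount * factor
    return dict(flows)

flows = logical_flows("RRH", "eNB", 1.0, ALPHA, ["eNB", "PSGW", "MME", "HSS"])
```

With these assumed multipliers, the flow from the MME to the HSS collects the control traffic generated at both the eNB and the gateway, while a multiplier below one on a user-traffic edge would model a firewall dropping packets.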
IV-B The physical graph
In the physical graph, vertices correspond to the endpoints and the B/F nodes . In general, B/F nodes have computational capabilities ; B/F nodes that cannot host any VNF (e.g., switches) have . Fig. 3 presents a possible implementation of the logical VNF graph in Fig. 1, where VNFs are placed on each of the two B/F nodes with processing capabilities. For simplicity, we present our model with reference to the case where multiple VNF instances can be deployed across different nodes, but at most one instance of each VNF can be deployed at each B/F node.
Traffic traversing link is also subject to a network delay . Such a delay is static, i.e., every unit of traffic traversing link incurs a delay . Furthermore, links have a bandwidth , corresponding to the maximum amount of traffic that can go from B/F node to B/F node without generating congestion.
Our main variables are the physical flows , representing the amount of traffic that originated from endpoint , last visited VNF , will next visit VNF , and is now traveling on link . Recall that we have to keep track of the endpoint at which each flow originates, in order to model traffic routing. If the flow has never been processed, i.e., it is going from to its first VNF , we will conventionally set and write .
Given a B/F node , we denote by the amount of traffic that is just transiting by (i.e., it is not processed at ) and it was originated at , last visited VNF and will next visit VNF . Similarly, is the traffic that is processed at B/F node , it was originated at , and last visited VNF . Note that implies that an instance of VNF is deployed at .
Traffic being processed at VNF is subject to a delay . Normally, processing delay is linked to the amount of resources (e.g., CPU) allocated to each VNF, and such an amount depends on the other VNFs deployed at the same B/F node. In our case, however, energy is the main metric of interest, and we can therefore assume that no VNF will be allocated more resources than the minimum amount required by the VNF itself.
A first constraint we need to impose is that, given a generic VNF , the traffic originated at , that has been processed through VNF and is entering B/F node , is either (i) processed at an instance of located in , or (ii) transiting by while being routed toward an instance of . Thus, for any , we have:
(1) 
A similar constraint concerns the traffic outgoing from . For any , endpoint and VNFs , we have:
(2) 
where is the last VNF that traffic visited, either before arriving at (if traffic just transits by ) or at itself (if is deployed therein, i.e., ). instead is the VNF that traffic will visit next. In other words, (1)–(2) enforce ordinary flow conservation for the traffic that is transiting at , i.e., using as a traditional switch, and generalized flow conservation for the traffic that is processed at .
Next, we need to ensure that we only use active B/F nodes and links, and that their capacity is not exceeded. We define two sets of binary variables, and , indicating whether link and B/F node are active or not. For links, we need to impose:
(3) 
i.e., no link can be active if either of its ends is off, and
(4) 
Symbol  Type  Meaning 

Set  Set of network endpoints  
Set  Set of B/F nodes  
Set  Set of links  
Set  Set of VNFs  
Parameter  Bandwidth of link  
Parameter  Delay of link  
Parameter  How much traffic resulting from the processing at VNF , which was previously processed at VNF , is meant to be next processed at VNF  
Binary var.  Whether we deploy VNF at B/F node  
Function  Energy consumption due to placing a VNF at a B/F node  
Function  Energy consumption due to activating a B/F node  
Function  Traffic-dependent energy consumption due to processing  
,  Function  Traffic-dependent energy consumption at switches and links 
Parameter  Computational capability of B/F node  
Parameter  Logical flow originated at and going from VNF to VNF  
Parameter  Logical flow originating at and first being processed at VNF  
Continuous var.  How much traffic coming from users connected to endpoint for service that was last processed at VNF is processed by an instance of VNF deployed at B/F node  
Parameter  Computational capability required to process one traffic unit of VNF  
Parameter  Computational capability consumed by one unit of traffic transiting by B/F node (SW switch)  
Continuous var.  How much traffic coming from users connected to endpoint that was last processed at VNF and meant to be next processed at VNF goes through link  
Continuous var.  How much traffic originating from that was last processed at VNF and meant to be next processed at VNF transits (without processing) by B/F node  
Binary var.  Whether link is active  
Binary var.  Whether B/F node is active 
With regard to processing, inactive B/F nodes cannot host any VNF. We track this through a binary variable expressing whether an instance of VNF is deployed at B/F node , and impose:
(5) 
Additionally, no processing can be done for VNFs that are not deployed at a given B/F node:
(6) 
Finally, each traffic unit processed by VNF requires computational capability, and, assuming is a software switch, each unit of traffic switched by consumes CPU. Clearly, the computational capability of each B/F node must be sufficient for both, i.e., for any B/F node ,
(7) 
where multiplies the total traffic outgoing from .
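The capacity requirement in (7) can be illustrated with a short Python sketch. The function and parameter names (`node_cpu_load`, `cpu_per_unit`, `switch_cpu_per_unit`) and all numeric values are hypothetical; the sketch only mirrors the structure described above, i.e., a processing term per VNF plus a switching term proportional to the total traffic leaving the node.

```python
def node_cpu_load(processed, outgoing, cpu_per_unit, switch_cpu_per_unit):
    """CPU required at one B/F node.

    processed: {vnf: traffic processed by that VNF at this node}
    outgoing:  iterable with the traffic of each flow leaving the node
    """
    processing = sum(cpu_per_unit[vnf] * t for vnf, t in processed.items())
    switching = switch_cpu_per_unit * sum(outgoing)
    return processing + switching

def fits(capacity, processed, outgoing, cpu_per_unit, switch_cpu_per_unit):
    """True if the node capability covers both processing and switching."""
    load = node_cpu_load(processed, outgoing, cpu_per_unit, switch_cpu_per_unit)
    return load <= capacity

# Illustrative numbers only.
load = node_cpu_load({"eNB": 4, "MME": 1}, [4, 1], {"eNB": 2, "MME": 3}, 1)
```

A node with zero capability fails this check as soon as it processes or switches any traffic in software, which matches the treatment of nodes that cannot host VNFs.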
Next, we ensure that the delay of the traffic originated at any endpoint does not exceed a threshold :
(8) 
The two terms on the left hand side of (8) correspond to the network and processing delay, respectively. The first term of (8) is a summation of terms in the form , each representing the delay incurred by traversing link weighted by the fraction of traffic traversing it. Similarly, the second term of (8) is a summation of terms in the form , weighting the processing delay of VNF by the fraction of traffic processed by it.
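The left-hand side of (8) can be sketched in Python as a traffic-weighted sum of link and processing delays. The normalization by the demand injected at the endpoint, the name `weighted_delay`, and the numbers in the example are assumptions made for illustration.

```python
def weighted_delay(link_traffic, link_delay, vnf_traffic, vnf_delay, demand):
    """Average delay of the traffic originated at one endpoint.

    Each link (VNF) delay is weighted by the fraction of the endpoint's
    demand that traverses the link (is processed by the VNF).
    """
    network = sum(link_delay[l] * t / demand for l, t in link_traffic.items())
    processing = sum(vnf_delay[v] * t / demand for v, t in vnf_traffic.items())
    return network + processing

# Illustrative case: 60% of the traffic takes link "a", 40% takes link "b",
# and all of it is processed by a single VNF "fw".
delay = weighted_delay(
    link_traffic={"a": 6.0, "b": 4.0},
    link_delay={"a": 2.0, "b": 5.0},
    vnf_traffic={"fw": 10.0},
    vnf_delay={"fw": 1.0},
    demand=10.0,
)
meets_threshold = delay <= 5.0
```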
At last, logical and physical flows have to match. To this end, it is sufficient to impose that, for each logical flow going from endpoint to VNF , there are corresponding physical flows of the type , such that:
(9) 
Eq. (9) ensures that the traffic injected from endpoints to B/F nodes on the physical graph matches the logical traffic going from endpoints to VNFs. Thanks to the flow conservation constraints (1)–(2), this also implies that such traffic is processed and transformed as dictated by the parameters, i.e., that all physical flows match their logical counterpart.
IV-C Energy and objective
There are five contributions to the overall energy consumption we are interested in tracking:

activating a B/F node, resulting in a consumption of ;

placing a VNF on a B/F node, resulting in a consumption due to, e.g., virtual machines overhead;

using said VNF, resulting in a consumption of depending on the computational resources used;

switching traffic at a B/F node, resulting in a consumption of depending on the traffic switched by the node;

having traffic going through links, resulting in a consumption of depending on the traffic over each link.
The corresponding energy consumption is:
Given all this, our objective can be written as:
(10) 
subject to the following constraints:
(11) 
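To make the five energy contributions of this section concrete, the Python sketch below sums them under the simplifying assumption that every energy function is linear. The class `EnergyModel`, its coefficients, and the example values are hypothetical and are not the energy models used in our evaluation.

```python
from dataclasses import dataclass

@dataclass
class EnergyModel:
    node_on: float       # activating one B/F node
    vnf_placed: float    # placing one VNF instance (e.g., VM overhead)
    per_cpu: float       # per unit of computational resource used
    per_switched: float  # per unit of traffic switched at a node
    per_link: float      # per unit of traffic carried by a link

def total_energy(model, active_nodes, placements, cpu_used, switched, link_traffic):
    """Sum the five contributions, assuming linear energy functions."""
    return (model.node_on * len(active_nodes)
            + model.vnf_placed * len(placements)
            + model.per_cpu * sum(cpu_used.values())
            + model.per_switched * sum(switched.values())
            + model.per_link * sum(link_traffic.values()))

model = EnergyModel(node_on=20, vnf_placed=5, per_cpu=2, per_switched=1, per_link=0.5)
energy = total_energy(
    model,
    active_nodes={"n1", "n2"},
    placements=[("n1", "eNB"), ("n2", "MME")],
    cpu_used={"n1": 3, "n2": 1},
    switched={"n1": 4, "n2": 2},
    link_traffic={("n1", "n2"): 4},
)
```

The first two terms depend only on binary decisions, while the last three grow with traffic; this is why deactivating idle elements is the main lever for saving energy.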
IV-D Multiple VNF instances
So far, we have presented our problem formulation with reference to the case that at most one instance of each VNF can be placed at each B/F node. This is true in many cases; however, there are situations (e.g., coexisting services with isolation requirements) when we may need to place multiple instances of the same VNF at the same B/F node. In the following, we extend our model to describe such a case.
To begin with, we need to distinguish VNFs from VNF instances. To this end, we introduce a new set representing the VNF instances, and indicate as the type of instance , i.e., the VNF is an instance of.
Furthermore, we need to account for the fact that logical flows happen between VNFs, while physical flows happen between VNF instances and processing takes place at VNF instances. Therefore, we need to replace:

with , where ;

with ;

with .
In order to guarantee that physical and logical flows match, we also need to replace (9) with:
(12) 
where, in (12), the second summation accounts for all instances of VNF .
Finally, (2) needs to be changed in order to represent the fact that data can flow from any instance of a VNF to any instance of the next VNF in the logical graph:
(13) 
In (13), notice how the variable, which concerns logical flows, has as its indices VNFs in , while the  and variables have as indices VNF instances in .
V The OptiLoop strategy
The problem stated in Sec. IV falls into the MILP category, and is thus impractical to solve in real time. We can however solve its relaxed version, where binary variables are allowed to take any value in . Optimal solutions to the relaxed models cannot be directly used to manage (or plan) a network; however, they can provide useful guidelines.
Our basic idea is to leverage the software-defined nature of our network to make an optimizer interact with SDN controllers and NFVOs, i.e., to solve optimization problems as a part of our network management strategy. Our solution strategy is called OptiLoop (for Optimization in the Loop) and it includes the following steps, as outlined in Fig. 4: (i) we initialize the system with a feasible (albeit potentially suboptimal) solution, as detailed in Sec. V-A; (ii) after that, we periodically:

check that the network configuration is adequate to the current (and/or predicted) demand;

if not so, activate additional VNFs, B/F nodes, and/or links as needed;

check whether there are B/F nodes and/or links that can be deactivated in order to save energy;

if so, update the current network configuration accordingly.
Sec. VA explains how we obtain the initial solution, i.e., Item (i) above. Items (1)–(2) and (3)–(4) correspond to the fixProblems and saveEnergy procedures respectively, which are described in Sec. VB and Sec. VC.
It is worth pointing out that the fixProblems and saveEnergy procedures are designed to take no action if no action is warranted, and therefore there is no harm in cascading them. As an example, fixProblems will never take any action the first time it is executed after an initial solution is generated, as that solution is guaranteed to be feasible. Similarly, saveEnergy is unlikely to find elements to deactivate if fixProblems just had to activate some.
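The cascade of the two procedures can be sketched as a simple control loop. In the Python snippet below, a solution is reduced to the number of active nodes and each node is assumed to serve `CAP` traffic units; `toy_fix` and `toy_save` are hypothetical stand-ins for fixProblems and saveEnergy, which in OptiLoop rely on solving LP problems rather than on this arithmetic.

```python
CAP = 10  # assumed traffic units served by each active node

def optiloop(solution, demands, fix_problems, save_energy):
    """Run fixProblems and then saveEnergy once per observation period."""
    trace = []
    for demand in demands:
        solution = fix_problems(solution, demand)  # no-op if already adequate
        solution = save_energy(solution, demand)   # no-op if nothing is idle
        trace.append(solution)
    return trace

def toy_fix(active, demand):
    # Activate nodes until the demand can be served.
    while active * CAP < demand:
        active += 1
    return active

def toy_save(active, demand):
    # Deactivate nodes as long as the demand can still be served.
    while active > 0 and (active - 1) * CAP >= demand:
        active -= 1
    return active

# Start from an over-provisioned configuration with five active nodes.
trace = optiloop(5, [12, 35, 8], toy_fix, toy_save)
```

In the first period only the energy-saving step acts, in the second only the fixing step does, which illustrates why cascading the two procedures is harmless.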
V-A Initial solution
The initial solution used to initialize OptiLoop has to be feasible, but does not have to be optimal. It can come from one of the heuristics we reviewed in Sec. II, or it can be obtained by solving a version of our problem where:

all B/F nodes and links are active, i.e., and ;

there is an instance of all VNFs deployed at each B/F node, i.e., .
The resulting solution will be highly suboptimal, as we are likely to needlessly activate B/F nodes and/or links and to place useless VNF instances, all of which increase the power consumption. On the plus side, the problem is LP, as all binary variables are fixed; furthermore, the following property holds.
Property 1
If a problem instance is feasible, then there is at least one feasible solution where all binary variables are set to 1.
Proof:
Let us consider a feasible solution , where some of the binary variables are set to zero and others to one. By hypothesis, is feasible. What we need to prove is that changing all binary variables to one can never make us violate a constraint. This follows by inspection of (3), (4), (5), (6): if they hold for the variable values in , then they will also hold when all binary variables are set to one.
In other words, setting all binary variables to one is an easy way to obtain a feasible solution to our problem to start with. This solution can be vastly improved, as discussed next.
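A minimal Python sketch of this starting point is given below. The dictionary layout and the names `all_ones_assignment` and `activation_consistent` are assumptions; the consistency check only covers the activation rules stated in Sec. IV (no active link with an inactive end, no VNF on an inactive node), not the capacity or delay constraints.

```python
def all_ones_assignment(nodes, links, vnfs):
    """Activate every node and link and deploy every VNF at every node."""
    return {
        "node_on": {n: 1 for n in nodes},
        "link_on": {link: 1 for link in links},
        "deployed": {(n, v): 1 for n in nodes for v in vnfs},
    }

def activation_consistent(assignment):
    """Check that no link or VNF instance relies on an inactive node."""
    node_on = assignment["node_on"]
    links_ok = all(on <= node_on[i] and on <= node_on[j]
                   for (i, j), on in assignment["link_on"].items())
    vnfs_ok = all(on <= node_on[n]
                  for (n, _vnf), on in assignment["deployed"].items())
    return links_ok and vnfs_ok

start = all_ones_assignment(["a", "b"], [("a", "b")], ["eNB", "MME"])
start_ok = activation_consistent(start)

# Switching off node "b" alone leaves a link and two VNFs on an inactive node.
broken = all_ones_assignment(["a", "b"], [("a", "b")], ["eNB", "MME"])
broken["node_on"]["b"] = 0
broken_ok = activation_consistent(broken)
```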
V-B The fixProblems procedure
The high-level goal of the fixProblems procedure is to check whether the current network configuration can cope with the current (and projected) traffic demand. If this is not the case, then we take one or more of the following actions: (i) activating additional B/F nodes; (ii) activating additional links; (iii) deploying additional VNF instances.
Specifically, as detailed in Alg. 1, we take as an input the current solution . We then proceed, in Line 2–Line 5, to create a new instance of the problem, where all binary variables are fixed to their values in . In Line 6, we solve such a problem: if it is feasible, then no action is required and the algorithm exits (Line 8). Otherwise, we look at why the problem is infeasible, by inspecting its irreducible inconsistent subsystem (IIS), i.e., an infeasible subset of constraints that becomes feasible as soon as any one of them is removed. This set allows us to discriminate between the different reasons that can make the network unable to operate properly (hence, the problem infeasible).
If constraint (4) (mandating that no link is used for more than its capacity) is in the IIS, then we need to activate some more links and/or B/F nodes. To decide which ones, we relax all  and variables related to B/F nodes and links that were inactive in (Line 10–Line 11) and solve the new problem (Line 12). We then choose one link to activate, with a probability proportional to its relaxed value, and fix to 1 the corresponding value and the values of its endpoints (Line 13–Line 15). We then go back to Line 6 and test the new solution (Line 16). If it is still infeasible, we will activate further network elements until feasibility is achieved.
We proceed in a similar way if constraint (7) is in the IIS, i.e., if we have a computational capability issue. We relax variables and , allowing for more B/F nodes to be activated and VNFs to be deployed if needed, and solve the new problem obtaining the relaxed values (Line 18–Line 20). We then have to decide which VNF to place and where. We do so by selecting a B/F node and a VNF at random, with a probability proportional to the relaxed values , and fix the corresponding and variable to (Line 21–Line 23). Finally, we go back to testing the new solution (Line 24).
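The randomized choice made by fixProblems can be sketched in Python with the standard `random.choices` function. The helper names `pick_proportionally` and `activate_link`, the `state` dictionary, and the relaxed values are hypothetical; in the actual procedure the relaxed values come from solving the relaxed problem.

```python
import random

def pick_proportionally(relaxed, candidates, rng=random):
    """Pick one candidate with probability proportional to its relaxed value."""
    usable = [c for c in candidates if relaxed.get(c, 0.0) > 0.0]
    if not usable:
        return None  # the relaxed solution uses none of the candidates
    weights = [relaxed[c] for c in usable]
    return rng.choices(usable, weights=weights, k=1)[0]

def activate_link(state, link):
    """Fix to 1 the link variable and the variables of both its end nodes."""
    node_a, node_b = link
    state["link_on"][link] = 1
    state["node_on"][node_a] = 1
    state["node_on"][node_b] = 1
    return state

# Illustrative state: node "c" and the links touching it are inactive.
state = {
    "node_on": {"a": 1, "b": 1, "c": 0},
    "link_on": {("a", "b"): 1, ("b", "c"): 0, ("a", "c"): 0},
}
relaxed_links = {("b", "c"): 0.7, ("a", "c"): 0.0}  # assumed relaxed values
chosen = pick_proportionally(relaxed_links, [("b", "c"), ("a", "c")],
                             random.Random(0))
if chosen is not None:
    activate_link(state, chosen)
```

The same helper applies to the computational case by passing pairs of B/F node and VNF as candidates instead of links.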
V-C The saveEnergy procedure
We can think of the saveEnergy procedure as the dual of fixProblems. Our aim is to identify B/F nodes and/or links that can be deactivated, as well as VNF instances that can be removed from the B/F nodes they run on. The objective is to reduce our power consumption without impairing our ability to serve the traffic, i.e., without making the problem infeasible. As in the fixProblems procedure, we solve a sequence of LP problems with fixed or relaxed variables, obtaining guidance on the decisions we should make and their effects.
In Alg. 2, we take the current solution as an input. We then create an instance of the problem where the binary variables that in the current solution have value 0 are fixed to 0 (Line 3–Line 5), and those that have currently value 1 are relaxed (Line 6–Line 8). This is because we are not looking for new nodes/links to activate, but for elements to deactivate. We do so by solving the problem instance (Line 9); note that all binary variables therein are fixed or relaxed, so the problem is LP.
In Line 10–Line 12 we identify the link, B/F node, and pair of B/F node and VNF that are active in the current solution and have the lowest value of the associated relaxed variable (respectively , , and ). Intuitively, these are the elements that most likely can be deactivated without impairing network functionality. We check this by creating a copy of problem instance and fixing to the binary variable associated to the element with the lowest value of the relaxed variables (Line 13–Line 21). If that element is a B/F node, we also need to deactivate the links using it and the VNF instances it hosts (Line 18–Line 19).
The difference between and is that exactly one element that was active in is deactivated in , hence is also LP. In Line 22, we solve and check if it is feasible. If that is the case, then we use as our new solution, and try to further enhance it (Line 24–Line 25). Otherwise, the algorithm returns , the last feasible solution we tried.
In summary, Alg. 2 deactivates zero or more elements, i.e., B/F nodes, links, or VNF instances. The element to deactivate is chosen based on the value taken by the corresponding relaxed variable, and after each change we check that the resulting configuration can serve its load, i.e., the problem instance is feasible.
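The deactivation logic summarized above can be sketched in Python as a greedy loop. The callbacks `solve_relaxed` and `is_feasible` are hypothetical stand-ins for the LP solver, and the sketch omits the cascading deactivation of the links and VNF instances attached to a deactivated B/F node.

```python
def save_energy(active, solve_relaxed, is_feasible):
    """Deactivate elements one at a time while the configuration stays feasible.

    active:        iterable of currently active elements
    solve_relaxed: callable mapping the active set to {element: relaxed value}
    is_feasible:   callable telling whether a candidate active set can
                   serve the load
    """
    current = frozenset(active)
    while current:
        relaxed = solve_relaxed(current)
        # The element with the lowest relaxed value is the best candidate.
        candidate = min(current, key=lambda e: relaxed[e])
        trial = current - {candidate}
        if not is_feasible(trial):
            break  # keep the last feasible configuration
        current = trial
    return current

# Toy instance: nodes with capacities, feasibility means enough capacity.
capacity = {"n1": 10, "n2": 10, "n3": 5}
demand = 12
toy_relaxed = {"n1": 1.0, "n2": 0.2, "n3": 0.0}  # assumed relaxed values

kept = save_energy(
    capacity,
    solve_relaxed=lambda active: toy_relaxed,
    is_feasible=lambda active: sum(capacity[n] for n in active) >= demand,
)
```

In this toy instance the least-used node is removed first, and the loop stops when removing the next candidate would leave too little capacity.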
Condition  Switch 1  Switch 2  Switch 3  Switch 4  Switch 5  Switch 6  All switches 

All paths off  21.0299  21.0281  21.02183  20.9614  20.9678  21.0173  125.9841 
Path 1 on, no traffic  35.0349  20.9888  35.0096  21.0168  20.9670  34.9968  167.9023 
Path 1 on, with traffic  35.4876  21.0455  35.6104  20.9996  21.0180  35.4558  168.2835 
Paths 1–2 on, no traffic  35.0309  34.9947  34.9646  20.9988  20.9869  34.9846  181.9242 
Paths 1–2 on, with traffic  35.2771  35.2135  35.6386  21.0171  20.9685  35.2783  182.1381 
Paths 1–3 on, no traffic  34.9826  34.9894  34.9645  20.9861  20.9693  35.0037  181.9220 
Paths 1–3 on, with traffic  35.6249  35.7221  35.5753  21.0042  20.9898  35.5849  183.6007 
Tab. III: Delay components of service provisioning

| Time component | Maximum | Minimum | Average |
|---|---|---|---|
| OptiLoop | 15.117 | 6.144 | 9.729 |
| Server activation | 0.111 | 0.068 | 0.086 |
| Switch activation | 2.660 | 0.871 | 1.824 |
| Virtual links creation | 31.797 | 21.953 | 27.369 |
| Single VNF creation | 38.823 | 30.235 | 31.476 |
| Creation of all VNFs | 49.738 | 31.883 | 38.384 |
| Network path setup | 0.065 | 0.050 | 0.056 |
| Single VNF configuration | 187.199 | 53.854 | 105.860 |
| Configuration of all VNFs | 316.992 | 86.885 | 216.591 |
| Total NS instantiation | 404.868 | 164.036 | 299.522 |
V-D Computational complexity
The fixProblems and saveEnergy procedures are run in order to react to changes in the network load; therefore, it is important that the decisions they make are swift as well as effective. To this end, we can prove that both procedures have polynomial worst-case computational complexity, as stated by the following theorem:
Theorem 1
The fixProblems (Alg. 1) and saveEnergy (Alg. 2) procedures have polynomial worst-case computational complexity.
Proof:
The proof follows by inspection of Alg. 1 and Alg. 2. The algorithms contain no loops, i.e., each of the instructions therein is executed at most once. All instructions perform elementary operations, except:

finding the minimum of a set, which requires sorting and has complexity O(n log n), n being the set size;

solving convex optimization problems, which has polynomial, namely, cubic computational complexity [19].
Thus, the overall complexity of the fixProblems and saveEnergy procedures is polynomial, namely, cubic.
Theorem 1 ensures that the fixProblems and saveEnergy procedures can be used to make swift and effective decisions in reaction to traffic changes. Indeed, convex optimization problems are routinely [19] solved in embedded applications with real-time requirements.
VI Testbeds, scenario and benchmarks
We validate and evaluate OptiLoop through two testbeds. We study the interaction between OptiLoop, the SDN controller, and the NFVO in a small-scale testbed with real hardware, described in Sec. VI-A. For our performance evaluation we instead use a larger, emulated testbed based on the real-world topology of a mobile operator, as detailed in Sec. VI-B. In all experiments, the reference VNF graph is the vEPC service described in Fig. 1.
VI-A Real-world testbed
The architecture and topology of our real-world testbed are described in Fig. 5. OpenDaylight (Beryllium version) and OpenStack (Mitaka version) are used to control a network made of six Lagopus software switches (with DPDK support enabled for faster switching) and three physical servers, connected as shown in Fig. 5(b). The OpenDaylight SDN controller configures the data plane by activating/deactivating links and switches via the SNMP protocol and configuring the forwarding rules via the OpenFlow 1.3 protocol. A custom-built NFVO – integrated with the VNFM (VNF Manager) and VIM (Virtualized Infrastructure Manager) OpenStack modules – manages the VMs that run the VNFs. Specifically, the NFVO provides RESTful interfaces that allow the orchestration of network services. Services themselves are composed of multiple VNFs, which are interconnected through the specification of a VNF graph. A detailed description of the NFVO's architecture and implementation can be found in [20, Sec. 2.6]. We adopt the OpenAirInterface [14] vEPC implementation, including the four VNFs in Fig. 1.
OptiLoop is implemented as a standalone application, written in Java and including two main components, devoted to monitoring and decision-making. OptiLoop interacts with both OpenDaylight and the NFVO through their REST APIs, gathering up-to-date information on the status of switches, links, physical servers and VNFs. When a decision is made, it communicates it to OpenDaylight (if the decision concerns link activation/deactivation) or the NFVO (if the decision concerns VNF deployment or server activation/deactivation). The decision-making component essentially implements Alg. 1 and Alg. 2, using the Gurobi solver for optimization. Since Gurobi features Java bindings, using it within the OptiLoop application is as simple as importing a library.
VI-B Emulated testbed
Our performance evaluation is carried out through an emulated testbed based on Mininet, the de facto standard solution to study SDN-based networks. Its architecture is summarized in Fig. 6(a): similarly to the previous case, OptiLoop interacts with the OpenDaylight controller for network management, and directly with Mininet via its Python API to turn servers and switches on and off. Notice that the actual VNFs are not implemented in Mininet; the traffic they serve is emulated via iperf, and the energy consumption is estimated from our real-world testbed, as detailed in Sec. VII-A1 next. The switches and servers emulated by Mininet reproduce the real-world topology of a major mobile operator, as detailed in Sec. VI-B1. Links and servers are implemented through the TCLink and CPULimitedHost Mininet classes, which allow us to assign them bandwidth, delay and computational capability matching those of their real-world counterparts. All iperf-generated traffic is based on the real-world traffic figures we have access to.
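To give an idea of how the TCLink and CPULimitedHost classes mentioned above are typically used, the following is a minimal sketch based on Mininet's Python API. It is not the testbed code: the function names, the way servers and switches are passed in, and all numeric values are assumptions made for illustration. Actually building the network requires a Mininet installation and root privileges, which is why the Mininet imports are confined to `build_network`.

```python
def link_options(bandwidth_mbps, delay_ms):
    """Translate link figures into the keyword arguments accepted by TCLink."""
    return {"bw": bandwidth_mbps, "delay": f"{delay_ms}ms"}


def build_network(servers, switches, links):
    """Build (but do not start) an emulated network.

    servers  maps a host name to the fraction of CPU it may use
    switches is a list of switch names (Mininet expects names such as "s1")
    links    is a list of (node_a, node_b, bandwidth_mbps, delay_ms) tuples
    """
    # Imported lazily so that link_options() works without Mininet installed.
    from mininet.link import TCLink
    from mininet.net import Mininet
    from mininet.node import CPULimitedHost
    from mininet.topo import Topo

    class EmulatedTopo(Topo):
        def build(self):
            for name, cpu in servers.items():
                self.addHost(name, cpu=cpu)  # CPU share enforced by CPULimitedHost
            for name in switches:
                self.addSwitch(name)
            for a, b, bandwidth, delay in links:
                self.addLink(a, b, **link_options(bandwidth, delay))

    return Mininet(topo=EmulatedTopo(), host=CPULimitedHost, link=TCLink)


# Hypothetical link figures, for illustration only.
example = link_options(100, 2)
```

A call such as `build_network({"h1": 0.2, "h2": 0.2}, ["s1"], [("h1", "s1", 100, 2), ("h2", "s1", 100, 2)])` would return a network object on which `start()` and `stop()` can be invoked; attaching it to an external SDN controller is not shown here.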
VI-B1 Network topology and traffic
Our reference topology, displayed in Fig. 6(b), represents the real-world topology of a major mobile network operator. It includes 42 endpoints and 51 B/F nodes, with each endpoint connected to exactly two B/F nodes. A total of 1,497 antennas are connected to the endpoints. In the trace, per-endpoint traffic varies between and . In order to model future network conditions, we increase such values by accounting for the 22% annual growth rate foreseen by Cisco [21] for the next five years, thus obtaining per-endpoint traffic values varying between and , with an 82:18 downlink/uplink proportion. The dataset we use only represents a snapshot of the network conditions, i.e., traffic demand does not change over time.
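As a worked example of the scaling step above, the sketch below applies five years of 22% growth to a traffic value and splits the result according to the 82:18 proportion. It assumes that the growth compounds yearly, which the text does not state explicitly, and the input value of 100 is made up and expressed in arbitrary units.

```python
ANNUAL_GROWTH = 0.22   # yearly growth rate quoted from [21]
YEARS = 5              # projection horizon
DOWNLINK_SHARE = 0.82  # 82:18 downlink/uplink proportion


def project_traffic(current):
    """Scale a per-endpoint traffic value and split it into downlink and uplink."""
    factor = (1 + ANNUAL_GROWTH) ** YEARS  # assumption: yearly compounding
    total = current * factor
    downlink = total * DOWNLINK_SHARE
    return {
        "factor": factor,
        "total": total,
        "downlink": downlink,
        "uplink": total - downlink,
    }


projection = project_traffic(100.0)  # 100.0 is a hypothetical value
print(round(projection["factor"], 3))  # 2.703
```

Under the compounding assumption, traffic grows by a factor of roughly 2.7 over the five-year horizon.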
Based on the real-world vEPC implementation [14], we consider a total of four VNFs, namely eNB, MME, HSS, and a gateway implementing both the PGW and SGW functions. Notice that in [14] no VNF is split into user- and control-plane sub-entities. We set our values, expressing how traffic gets transformed as it travels between VNFs, leveraging the analysis in [10]; in particular, the fraction of control traffic going to the MME is given by .
Still based on [10], we set the link bandwidth to for endpoint-to-node links and for node-to-node ones. Based on [10] and [22], we assume that each B/F node can process of traffic every second. Since our scenario is constrained by B/F node and link capacity, we ignore network and processing delays.
VI-B2 Benchmark solutions
We compare OptiLoop with three alternatives:

what is done in real-world systems, i.e., keeping all network elements active regardless of traffic, indicated as All on in the plots;

the optimal solution obtained by brute force, i.e., trying all possible combinations of network elements to activate, indicated as Optimal in the plots;

a state-of-the-art consolidation approach, based on [7] and indicated as Consolidation in the plots.
The consolidation procedure used in [7] consists of a three-stage decision process. For every flow, it first looks for an already-deployed VNF to serve the flow; if none can be found, it deploys a new instance of the VNF at an already active B/F node. If no suitable node is found, it activates a new one. Also, the procedure activates any additional B/F nodes needed to ensure connectivity between endpoints and the serving B/F nodes. It is interesting to notice that all stages of the consolidation design process share the goals of our fixProblems procedure, namely, ensuring that there is enough computational capability (steps 1 and 2) and network capacity (step 3) to process the incoming traffic. There is no equivalent of the saveEnergy procedure, i.e., already-made decisions are never reconsidered.
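The per-flow decision stages described above can be sketched as follows. This is a simplified illustration under our own assumptions, not the procedure of [7]: capacity is modeled as a single number per B/F node, nodes are scanned in name order, and the additional activation of B/F nodes for connectivity is omitted. All names and values are hypothetical.

```python
from typing import Dict, Set, Tuple


def place_flow(
    vnf: str,
    load: float,
    capacity: Dict[str, float],
    used: Dict[str, float],
    active: Set[str],
    instances: Set[Tuple[str, str]],
) -> Tuple[str, int]:
    """Return the serving node and the stage (1, 2 or 3) that selected it."""

    def fits(node: str) -> bool:
        return used.get(node, 0.0) + load <= capacity[node]

    def assign(node: str, stage: int) -> Tuple[str, int]:
        active.add(node)
        instances.add((vnf, node))
        used[node] = used.get(node, 0.0) + load
        return node, stage

    # Stage 1: reuse an already-deployed instance of the VNF.
    for node in sorted(active):
        if (vnf, node) in instances and fits(node):
            return assign(node, 1)
    # Stage 2: deploy a new instance on an already active node.
    for node in sorted(active):
        if fits(node):
            return assign(node, 2)
    # Stage 3: activate a new node and deploy the instance there.
    for node in sorted(capacity):
        if node not in active and fits(node):
            return assign(node, 3)
    raise RuntimeError("no B/F node can serve the flow")


# Toy run with two B/F nodes of capacity 10 (hypothetical units).
capacity = {"a": 10.0, "b": 10.0}
used, active, instances = {}, set(), set()
decisions = [
    place_flow("MME", 6.0, capacity, used, active, instances),
    place_flow("MME", 3.0, capacity, used, active, instances),
    place_flow("HSS", 1.0, capacity, used, active, instances),
    place_flow("MME", 2.0, capacity, used, active, instances),
]
print(decisions)
```

Note that, once a flow is assigned, the sketch never revisits the decision, mirroring the lack of a saveEnergy-like step in the consolidation approach.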
VII Results
We start this section by summarizing, in Sec. VII-A, the power consumption and delay figures we obtain from the real-world testbed described in Sec. VI-A. We then present, in Sec. VII-B, a performance evaluation of OptiLoop carried out by emulating a real-world topology in Mininet, as described in Sec. VI-B.
VII-A Results from the real-world testbed
There are two main types of information we seek to obtain from the real-world testbed described in Sec. VI-A:

the power consumption associated with B/F nodes, broken down into idle and processing power;

the delay associated with changes to the network, e.g., activating a link or instantiating a new VM.
We measure the above quantities through two experiments, namely, a path instantiation experiment and a service provisioning one, as described next.
VII-A1 Path instantiation experiment
In this experiment, we start with all equipment – switches and servers – in sleeping mode. We then instantiate, one by one, the three paths shown in Fig. 5(b), activating additional switches as needed. Finally, we generate bidirectional flows of between each pair of endpoints, so as to ascertain the impact of traffic on the power consumption.
The evolution of the power consumption in our real-world testbed is exemplified in Fig. 7. In the beginning, when all network elements are in sleeping mode, the total power consumption is around . Activating new servers results in an increase in power consumption, as can be expected. More interestingly, instantiating a new path results in a power increase only if it requires activating a new switch, as is the case of path 1 and path 2. As we can see from Fig. 5(b), path 3 requires no extra switches with respect to path 1 and path 2, and therefore instantiating it results in no additional consumption.
Tab. II provides a more analytical view of the power consumed by the switches in different states. When all equipment is in sleeping mode (first row), each switch consumes roughly of power. Instantiating path 1 (second row) requires activating switches 1, 3, and 6, whose power consumption jumps to ; activating additional paths has the same effect on the other switches. We can also observe that sending traffic over the instantiated paths has a noticeable, but minor, effect: routing of traffic results in an additional consumption of around per switch. Finally, notice that the last column of Tab. II does not match the line in Fig. 7 since the latter also includes the consumption of the physical servers, i.e., in sleeping mode and roughly when active.
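The observations above can be checked directly against the first three rows of Tab. II. The short sketch below does so; the values are copied as printed in the table (in the table's own unit), while the threshold used to tell sleeping switches from active ones is an assumption chosen to separate the two clusters of values.

```python
from statistics import mean

# First three rows of Tab. II, switches 1 to 6, values as printed.
ALL_OFF = [21.0299, 21.0281, 21.02183, 20.9614, 20.9678, 21.0173]
PATH1_IDLE = [35.0349, 20.9888, 35.0096, 21.0168, 20.9670, 34.9968]
PATH1_TRAFFIC = [35.4876, 21.0455, 35.6104, 20.9996, 21.0180, 35.4558]

ACTIVE_THRESHOLD = 28.0  # hypothetical cut-off between sleeping and active

# Indices of the switches that path 1 wakes up.
woken = [i for i, power in enumerate(PATH1_IDLE) if power > ACTIVE_THRESHOLD]
active_switches = [i + 1 for i in woken]

# Average extra consumption due to activation and to traffic, per active switch.
activation_delta = mean(PATH1_IDLE[i] - ALL_OFF[i] for i in woken)
traffic_delta = mean(PATH1_TRAFFIC[i] - PATH1_IDLE[i] for i in woken)

print(active_switches)
print(round(activation_delta, 2), round(traffic_delta, 2))
```

Activation accounts for roughly 14 units per switch, while carrying traffic adds about half a unit, which is consistent with traffic having a noticeable but minor effect.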
VII-A2 Service provisioning experiment
In the service provisioning experiment, we are interested in measuring the delay associated with performing changes to the network, including path instantiation and service provisioning. To this end, we use the network described in Sec. VI-A to provide the virtual EPC (vEPC) service, consisting of the VNFs depicted in Fig. 1, as implemented in [14].
Doing so requires three main steps, namely (i) making VNF placement and traffic routing decisions, i.e., running OptiLoop; (ii) setting up the required paths, similar to the path instantiation experiment described in Sec. VII-A1; (iii) instantiating and configuring the VMs that run the VNFs. The aspect we are chiefly interested in is the relative importance of such delay components. The results are summarized in Tab. III. A first, important observation is that OptiLoop only accounts for a small fraction (roughly 3%) of the total delay; in other words, the energy savings it brings come at a modest price in terms of additional delay.
Among the other delay components, we can observe that VM configuration and, to a lesser extent, virtual link creation dominate the total delay. It is also interesting to notice the values labeled “Creation of all VNFs” and “Configuration of all VNFs”, which are substantially less than four times the creation (resp. configuration) time of a single VNF. This is because, once decisions are made by OptiLoop, they can be implemented in a parallel fashion.
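The two observations above follow from simple arithmetic on the average column of Tab. III, as the sketch below shows. The values are copied from the table; the "speed-up" is merely our name for the ratio between four times the single-VNF time and the measured time for all four VNFs.

```python
# Average values from Tab. III (same unit as in the table).
AVERAGE = {
    "OptiLoop": 9.729,
    "Single VNF creation": 31.476,
    "Creation of all VNFs": 38.384,
    "Single VNF configuration": 105.860,
    "Configuration of all VNFs": 216.591,
    "Total NS instantiation": 299.522,
}
NUM_VNFS = 4  # eNB, MME, HSS and the gateway

# Share of the total instantiation delay due to OptiLoop.
optiloop_share = AVERAGE["OptiLoop"] / AVERAGE["Total NS instantiation"]

# How much faster than a purely sequential execution the VNFs are handled.
creation_speedup = NUM_VNFS * AVERAGE["Single VNF creation"] / AVERAGE["Creation of all VNFs"]
configuration_speedup = (
    NUM_VNFS * AVERAGE["Single VNF configuration"] / AVERAGE["Configuration of all VNFs"]
)

print(f"OptiLoop share: {optiloop_share:.1%}")
print(f"creation speed-up: {creation_speedup:.2f}")
print(f"configuration speed-up: {configuration_speedup:.2f}")
```

On average, OptiLoop accounts for about 3.2% of the total, while creation and configuration are roughly 3.3 and 2 times faster than they would be if the four VNFs were handled one after the other.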
VII-B Emulation-based performance evaluation
The first answer we seek from the performance evaluation carried out through the emulated testbed concerns the magnitude of possible energy savings. In Fig. 8(left), we vary the traffic demand between 0.5 and 3 times the real-world amount, and study how much energy we can save compared to what is done today, i.e., leaving all B/F nodes and links active. We can observe that OptiLoop yields dramatic savings, consistently very close to the optimum, while consolidation does not perform as well. An intuitive reason is that OptiLoop accounts for all three main contributions to energy consumption (processing, idle power, and networking), while the consolidation-based approach focuses on keeping the number of active B/F nodes low.
Fig. 8(center) shows the spare computational capability of the active topology (CCAT); intuitively, this is a measure of how much power is being wasted, i.e., how inefficient the network management strategy is. The consolidation algorithm has the highest spare CCAT, because of the higher number of B/F nodes that have to be activated in order to guarantee connectivity. The spare CCAT yielded by OptiLoop is much lower, and very close to the optimum. It is interesting to remark that even the optimum leaves substantial spare CCAT. This is due to the fact that some B/F nodes have to be active in order to keep the topology connected, even if they do not have to host any VNF. Fig. 8(right) depicts how many hops data travels across the network. OptiLoop again matches the optimum, while the consolidation strategy results in substantially longer paths, due to the fact that VNF placement decisions are made without accounting for connectivity.
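To make the notion of spare CCAT concrete, the sketch below computes it as the capacity of the active B/F nodes minus the load they process. This is our reading of the metric, not a formula taken from the text, and the node names, capacities and loads are hypothetical.

```python
def spare_ccat(capacity, load, active):
    """Unused computational capability summed over the active B/F nodes."""
    return sum(capacity[node] - load.get(node, 0.0) for node in active)


# Hypothetical example: node "relay" is active only to keep the topology
# connected, hosts no VNF, and thus contributes its whole capacity as spare.
capacity = {"core": 100.0, "relay": 100.0, "edge": 100.0}
load = {"core": 80.0, "edge": 60.0}

with_relay = spare_ccat(capacity, load, {"core", "relay", "edge"})
without_relay = spare_ccat(capacity, load, {"core", "edge"})
print(with_relay, without_relay)
```

The example mirrors the remark above: nodes that must stay active for connectivity alone inflate the spare CCAT even under an optimal strategy.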
We now use the power consumption we measure from our real-world testbed (Sec. VII-A) to extrapolate the total power that the emulated network would consume. Fig. 9 breaks such a consumption into its main components, namely, processing, networking, and idle power. Note that these components have comparable magnitude, i.e., none of them dominates the overall consumption. It follows that network management strategies have to account for them all. We can also see that the processing component never changes across strategies, since the amount of traffic to process is always the same. The difference between the strategies lies mostly in the networking component (longer paths in Fig. 8(right) correspond to higher consumption) and, to a lesser extent, in the idle energy. In other words, it is important to place VNFs close to the traffic they have to serve, while at the same time activating as few B/F nodes as possible.
Dropping the “all on” strategy to keep plots easy to read, Fig. 10(left) and Fig. 10(center) show that placing VNFs close to the traffic they serve also means placing many of them. This goes against the traditional concept of activating only the strictly required number of elements, and it is a direct consequence of the features of modern, software-based networks. Indeed, there is little or no penalty for placing an underutilized VNF instance on an already active B/F node, while there is a significant energy cost for transferring even modest amounts of data between B/F nodes. We can therefore say that OptiLoop outperforms state-of-the-art alternatives because it properly accounts for the unique features of 5G, and is consequently more aggressive in deploying VNFs.
Comparing Fig. 10(left) to Fig. 10(center), we can see that OptiLoop deploys more VNFs than the optimum, but the number of VNFs per B/F node is similar. This is because OptiLoop activates slightly more B/F nodes than the optimum, as confirmed by Fig. 10(right) showing that the average amount of traffic processed per B/F node is slightly lower in OptiLoop.
VII-C Scaled-up network topology
In the following, we investigate the performance of OptiLoop when used on larger-scale network topologies. To this end, based on indications from the mobile operator that provided us with the original topology described in Sec. VI-B1, we generate a scaled-up version thereof. Specifically, we operate as follows:

we replace each B/F node of the original topology with a ring of five B/F nodes;

we place an additional 160 endpoints connected to 6,000 additional antennas;

we connect each additional endpoint to two randomly chosen B/F nodes;

we set the traffic requested by the additional antennas in such a way that the traffic distribution matches the original one, scaled up by a factor of five.
The resulting topology, depicted in Fig. 11, has over 200 B/F nodes serving traffic coming from 7,500 antennas. The results yielded by OptiLoop and the consolidation algorithm are reported in Fig. 12. Notice that there are no “optimal” curves, as computing the optimum for the scaled-up topology proved utterly impractical.
Fig. 12(left) shows that, as the topology gets larger, OptiLoop – and, to a lesser extent, consolidation – yields more savings, almost reaching 50%. Intuitively, this is connected to the fact that in larger topologies it is easier to maintain connectivity while deactivating a substantial fraction of B/F nodes. This is confirmed by Fig. 12(center), showing that the spare CCAT, i.e., the unused computational power in the active network, is proportionally lower than in the original topology. Indeed, as we can see from Fig. 8(center), the spare CCAT with the original topology reaches 2,500 units under OptiLoop, while in Fig. 12(center) it is below 10,000 units in spite of the topology being five times larger.
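The first three construction steps listed above can be sketched as follows. This is an illustration under our own assumptions rather than the actual generation procedure: in particular, the text does not say how the original inter-node links map onto the rings, so we arbitrarily attach each original link to the first member of the two rings; the 51-node chain used as input is also made up.

```python
import random


def scale_up(nodes, links, ring_size=5, extra_endpoints=160, seed=0):
    """Replace each B/F node with a ring and attach extra endpoints at random."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    ring = {n: [f"{n}-{i}" for i in range(ring_size)] for n in nodes}

    new_links = set()
    for members in ring.values():
        for i, member in enumerate(members):
            new_links.add((member, members[(i + 1) % ring_size]))  # close the ring
    for a, b in links:
        # Assumption: an original link joins the first members of the two rings.
        new_links.add((ring[a][0], ring[b][0]))

    bf_nodes = [member for members in ring.values() for member in members]
    # Each additional endpoint is connected to two distinct random B/F nodes.
    attachments = {f"ep{k}": rng.sample(bf_nodes, 2) for k in range(extra_endpoints)}
    return bf_nodes, new_links, attachments


# Hypothetical input: 51 B/F nodes connected in a chain.
original_nodes = [f"n{i}" for i in range(51)]
original_links = [(f"n{i}", f"n{i + 1}") for i in range(50)]
bf_nodes, new_links, attachments = scale_up(original_nodes, original_links)
print(len(bf_nodes), len(new_links), len(attachments))
```

Starting from 51 B/F nodes, the ring replacement alone yields 255 B/F nodes, in line with the "over 200" figure quoted above.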
Finally, Fig. 12(right) breaks the total power consumption into its main components. By comparing it with Fig. 9, we can observe that:

the processing power is exactly five times larger than in the original topology, as that component is strictly proportional to the traffic to serve;

the idle power is proportionally lower since, as observed earlier, there are fewer B/F nodes activated only for the sake of connectivity;

the networking power is proportionally larger, as data are more likely to travel a longer path to the serving B/F node.
The latter two items suggest that networking power and idle power are, to a certain extent, antithetical, and it can be hard to minimize both at the same time.
VIII Conclusion
We considered two of the unique features of 5G networks, namely, the hybrid nature of their nodes (which have both forwarding and computational capabilities) and the fact that the traffic to serve changes across processing steps. Such features require the entities in the MANO layer, and especially the NFVO, to make joint decisions about (i) which B/F nodes to activate, (ii) the VNF instances they run, and (iii) how to route traffic between VNFs and the nodes running them. We formulated a system model and an optimization problem that enable us to make all such decisions with the objective of minimizing the energy consumption of the network. We further proposed OptiLoop, a solution concept based on integrating optimization within the MANO entities, allowing them to make decisions by repeatedly solving relaxed optimization problems.
We validated OptiLoop through a real-world testbed based on OpenDaylight and OpenStack, and further evaluated its performance through a large-scale emulated network whose topology and traffic are based on those of a major network operator. OptiLoop was shown to outperform state-of-the-art approaches and closely track the optimum, while representing only a minor contribution to the total network delay.
References
 [1] N. Gazit, F. Malandrino, and D. Hay, “Coopetition between network operators and content providers in SDN/NFV core networks,” in IEEE INFOCOM SWFAN Workshop, 2016.
 [2] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz, “Near optimal placement of virtual network functions,” in IEEE INFOCOM, 2015.
 [3] L. Wang, Z. Lu, X. Wen, R. Knopp, and R. Gupta, “Joint Optimization of Service Function Chaining and Resource Allocation in Network Function Virtualization,” IEEE Access, 2016.
 [4] T.-W. Kuo, B.-H. Liou, K. C.-J. Lin, and M.-J. Tsai, “Deploying chains of virtual network functions: On the relation between link and server usage,” in IEEE INFOCOM, 2016.
 [5] L. Qu, C. Assi, and K. Shaban, “Delay-aware scheduling and resource optimization with network function virtualization,” IEEE Trans. on Communications, 2016.
 [6] X. Zhang, Z. Huang, C. Wu, Z. Li, and F. C. Lau, “An Online Stochastic Buy-Sell Mechanism for VNF chains in the NFV market,” IEEE Journal on Selected Areas in Communications, 2017.
 [7] N. El Khoury, S. Ayoubi, and C. Assi, “Energy-Aware Placement and Scheduling of Network Traffic Flows with Deadlines on Virtual Network Functions,” in IEEE CloudNet, 2016.
 [8] V. G. Nguyen, A. Brunstrom, K. J. Grinnemo, and J. Taheri, “SDN/NFV-based Mobile Packet Core Network Architectures: A Survey,” IEEE Communications Surveys & Tutorials, 2017.
 [9] A. Baumgartner, V. S. Reddy, and T. Bauschert, “Mobile core network virtualization: A model for combined virtual core network function placement and topology optimization,” in IEEE NetSoft, 2015.
 [10] G. Hasegawa and M. Murata, “Joint Bearer Aggregation and Control-Data Plane Separation in LTE EPC for Increasing M2M Communication Capacity,” in IEEE GLOBECOM, 2015.
 [11] S. Khairi, M. Bellafkih, and B. Raouyane, “QoS management SDN-based for LTE/EPC with QoE evaluation: IMS use case,” in SDS, 2017.
 [12] X. An, W. Kiess, J. Varga, J. Prade, H.-J. Morper, and K. Hoffmann, “SDN-based vs. software-only EPC gateways: A cost analysis,” in IEEE NetSoft, 2016.
 [13] J. Prados-Garzon, J. J. Ramos-Munoz, P. Ameigeiras, P. Andres-Maldonado, and J. M. Lopez-Soler, “Modeling and Dimensioning of a Virtualized MME for 5G Mobile Networks,” IEEE Trans. on Veh. Tech., 2017.
 [14] OpenAirInterface: 5G software alliance for democratising wireless innovation. http://www.openairinterface.org.
 [15] D. Dietrich, C. Papagianni, P. Papadimitriou, and J. S. Baras, “Network function placement on virtualized cellular cores,” in COMSNETS, 2017.
 [16] F. Malandrino, C. F. Chiasserini, C. E. Casetti, and G. Landi, “Optimization-in-the-Loop for Energy-Efficient 5G,” in IEEE WoWMoM, 2018.
 [17] ETSI. (2017) Network Functions Virtualisation (NFV); Management and Orchestration. http://www.etsi.org/deliver/etsi_gs/NFVMAN/001_099/001/01.01.01_60/gs_NFVMAN001v010101p.pdf.
 [18] J. Mattingley and S. Boyd, “Cvxgen: A code generator for embedded convex optimization,” Optimization and Engineering, 2012.
 [19] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge university press, 2004.
 [20] 5G Crosshaul Project. Deliverable D3.2: Final XFE/XCI design and specification of southbound and northbound interfaces. http://5gcrosshaul.eu/wpcontent/uploads/2018/01/5GCROSSHAUL_D3.2.pdf.
 [21] Cisco, “Cisco Visual Networking Index,” 2017.
 [22] Lagopus Project. It’s kind of fun to do the impossible with DPDK. https://www.slideshare.net/lagopus/dpdksummit2015itskindoffuntodotheimpossiblewithdpdk.