Fog-supported delay-constrained energy-saving live migration of VMs over MultiPath TCP/IP 5G connections

The upcoming era of the Fifth-Generation Fog Computing-supported Radio Access Networks (shortly, 5G FOGRANs) aims at exploiting computing/networking resource virtualization, in order to augment the limited resources of wireless devices through the seamless live migration of Virtual Machines (VMs) towards nearby Fog data centers. For this purpose, the bandwidths of the multiple Wireless Network Interface Cards (WNICs) of the wireless devices may be aggregated under the control of the emerging MultiPathTCP (MPTCP) protocol. However, due to fading and mobility-induced phenomena, the energy consumptions of current state-of-the-art VM migration techniques may still offset their expected benefits. Motivated by these considerations, in this paper, we analytically characterize, implement in software and numerically test the optimal minimum-energy Settable-Complexity Bandwidth Manager (SCBM) for the live migration of VMs over 5G FOGRAN MPTCP connections. The key features of the proposed SCBM are that: (i) its implementation complexity is settable on-line on the basis of the target energy consumption-vs.-implementation complexity tradeoff; (ii) it minimizes the network energy consumed by the wireless device for sustaining the migration process under hard constraints on the tolerated migration times and downtimes; and, (iii) by leveraging a suitably designed adaptive mechanism, it is capable of quickly reacting to (possibly, unpredicted) fading and/or mobility-induced abrupt changes of the wireless environment without requiring forecasting. The actual effectiveness of the proposed SCBM is supported by extensive energy-vs.-delay performance comparisons, which cover: (i) a number of heterogeneous 3G/4G/WiFi FOGRAN scenarios; (ii) synthetic and real-world workloads; and, (iii) MPTCP and SinglePathTCP (SPTCP) wireless connections.


1 Introduction

Smartphones are already our ubiquitous technological assistants. Since 2011, worldwide smartphone penetration has overtaken that of wired PCs, reaching 80% in the US and Europe. Cisco envisions that the average number of connected mobile devices per person will reach 6.6 in 2020, mainly fostered by the expected pervasive use of Internet of Everything (IoE) applications 1. This will also be due to the fact that future smartphones will be increasingly able to exploit the multiple heterogeneous Wireless Network Interface Cards (WNICs) that already equip them through an emerging bandwidth-pooling transport technology, generally referred to as MultiPath TCP (MPTCP) 2.

However, since smartphones are (and will continue to be) resource and energy limited, Mobile Cloud Computing (MCC) is expected to effectively support them by providing device augmentation 3. MCC is, indeed, a quite recent technology that relies on the synergic cooperation of three different paradigms, namely, cloud computing, mobile computing and the wireless Internet. Its main goal is to cope with the inherent resource limitations of mobile devices by allowing them to offload computation and/or memory-intensive applications (such as, for example, image processing, voice recognition, online gaming and social networking) towards virtualized data centers 3. However, performing application offloading from mobile devices to remote large-scale cloud-based data centers (e.g., Amazon and Google, to name a few) suffers from the large communication delay and limited bandwidth typically offered by multi-hop cellular Wide Area Networks (WANs) 3. The dream is, indeed, to allow mobile devices to: (i) exploit their multiple WNICs to perform bandwidth aggregation through MPTCP 2; and, then, (ii) leverage the sub-millisecond access latencies promised by the emerging Fifth Generation (5G) multi Radio Access Networks (RANs) 4, in order to perform seamless application offloading towards proximate virtualized data centers, generally referred to as Fog nodes 1. For this purpose, the application to be offloaded from the mobile device is shipped in the form of a Virtual Machine (VM) and, then, the offloading process takes place as a live (i.e., seamless) VM migration from the mobile device to the serving Fog node 5.

1.1 Convergence of Fog, 5G and MPTCP: the reference 5G FOGRAN technological platform

The reason why the convergence of the (aforementioned) three pillar paradigms of Fog Computing, 5G Communication and MPTCP is expected to enable fast and seamless migration of VMs stems from their native features, which are, indeed, complementary (see Table 1 for a synoptic win-win overview).

Fog
  • Edge location and context awareness: located one hop away from the served devices, Fog nodes may exploit context awareness to support delay-sensitive VM migration.
  • Pervasive deployment: Fog nodes are capable of supporting pervasive services, such as community services.
  • Virtualized environment: software clones of the served wireless devices are dynamically bootstrapped onto the Fog servers as VMs.
  • Mobility support: Fog nodes may exploit single-hop short-range IEEE802.11x links for enabling VM migrations from the mobile devices.
  • Support for live VM migration: wireless devices may seamlessly offload their running applications to the serving Fog nodes in the form of VMs.
  • Energy efficiency: wireless devices may save energy by exploiting nearby Fog nodes as surrogate servers.

5G
  • Ultra-low access latency: sub-millisecond access latencies are obtained by combining massive MIMO, millimeter-wave, micro-cell and interference-suppression technologies.
  • On-demand resource provisioning: wireless bandwidth is dynamically provided on an on-demand and per-device basis.
  • Resource multiplexing and isolation: processing of the received flows is performed by a centralized Network Processor, which performs resource multiplexing/isolation on a per-flow basis through NFV.
  • Multi-RAT support: multiple short/long-range WiFi/cellular networking technologies are simultaneously sustained, in order to support multi-NIC wireless devices.

MPTCP
  • Bandwidth aggregation: by utilizing in parallel the available bandwidth of each path, MPTCP connections may provide larger throughput than SPTCP connections.
  • Robustness: path failures are recovered by shifting the traffic from the failed path to the remaining active ones.
  • Backward compatibility: MPTCP connections utilize the same sockets as the standard SPTCP ones; hence, SPTCP-based applications may continue to run unchanged under MPTCP.
  • Load balancing: by a balanced splitting of the overall traffic over multiple paths, energy savings may be attained.

Table 1: Native features of the pillar Fog, 5G and MPTCP paradigms and their synergic interplay.

Fog Computing (FC) is formally defined as a model for enabling pervasive local access to a centralized pool of configurable computing/networking resources that can be rapidly provisioned and released on an elastic (i.e., on-demand) basis. Proximate resource-limited mobile devices access these facilities over a single-hop wireless access network, in order to reduce the communication delay and, then, support delay and delay-jitter sensitive VM migrations. As pointed out in the first column of Table 1, the main native features retained by the Fog paradigm are 1: (i) edge location and context awareness; (ii) pervasive spatial deployment; and, (iii) support for mobile energy-saving live migration of VMs.

Thanks to its sub-millisecond network latencies, the 5G Communication paradigm provides the “right” means to support the wireless access to the virtualized resources hosted by Fog nodes. In fact, the 5G paradigm retains, by design, the following main native features (see the second column of Table 1) 4: (i) ultra-low access latencies; (ii) on-demand provisioning of the wireless bandwidth; (iii) multiplexing, isolation and elastic scaling of the available physical communication resources; and, (iv) support for heterogeneous Radio Access Technologies (RATs) and multi-NIC mobile devices.

Thanks to its support of multiple RATs, the 5G paradigm is the ideal partner of MPTCP, whose native goal is to provide multi-homing (i.e., radio bundling) to mobile devices equipped with multiple heterogeneous WNICs. This means that, by design, MPTCP supports simultaneous data paths over multiple radio interfaces under the control of a single Transport-layer connection. In doing so, MPTCP attains (see the third column of Table 1) 2: (i) bandwidth aggregation; (ii) robustness against mobility and/or fading-induced connection failures; (iii) backward compatibility with the legacy Single-Path TCP (SPTCP); and, (iv) load balancing of the migrated traffic.

The convergence of these three paradigms leads to the 5G FOGRAN technological platform of Fig. 1, that constitutes the reference scenario of this paper.

Figure 1: The considered virtualized scenario over a 5G access network. VHW = Virtualized HardWare; NIC = Network Interface Card; VM = Virtual Machine; BS = Base Station; HMD = Heterogeneous Mo-Demodulators; MUX = MUltipleXer; DEMUX = DEMUltipleXer; HNI = Hardware Network Infrastructure; HSI = Hardware Server Infrastructure; HOS = Host Operating System; RAN = Radio Access Network; VLY = Virtualization LaYer; VNIC = Virtual Network Interface Card; VCL = Virtual Clone; GOS = Guest Operating System; = Wired Connection.

According to the emerging FOGRAN paradigm 6, 7, 8, in this scenario, a mobile device that is equipped with the MPTCP protocol stack establishes multiple connections to the serving Fog node by turning ON and using in parallel its native WNICs (see the device in the top part of Fig. 1). Since MPTCP is backward compatible with SPTCP, a mobile device equipped with a single WNIC may continue to use legacy SPTCP connections (see the device in the bottom part of Fig. 1). The serving Fog node employs hypervisor-based virtualization technology, in order to deliver computational resources in the form of a VM that runs atop a cluster of networked physical servers hosted by the Fog node 3. The hosted VM and the hosting Fog node communicate through a Virtual NIC (VNIC), which typically emulates an Ethernet switch in software 5. According to the FOGRAN paradigm, the receive antennas of Fig. 1 (also referred to as Remote Radio Heads (RRHs)) perform only the Radio Frequency (RF) processing of the received data streams (e.g., power amplification, A/D and D/A conversions and sampling), while: (i) mo/demodulation and co/decoding base-band operations; (ii) MAC multiplexing/de-multiplexing and forwarding of the data flows; (iii) Network-layer routing; and, (iv) elastic resource allocation of the available communication bandwidth and resources, are collectively performed by the 5G Network Processor of Fig. 1. Specifically, according to the 5G paradigm (see Table 1), the 5G Network Processor multiplexes the available physical network resources among the migrating VMs by: (i) performing Network Function Virtualization (NFV); and, then, (ii) instantiating Virtual Base Stations (VBSs) atop the underlying 5G physical network (see Section VIII of 8). The connection between the distributed RRHs and the centralized 5G Network Processor is provided by the front-haul section of Fig. 1, which is typically built from ultra-broadband high-capacity optical-fiber channels 8.

In order to be robust against the connection failures (possibly) induced by wireless fading and/or device mobility, pre-copy migration 9 is the technique adopted by the wireless devices of Fig. 1 for performing VM offloading over the underlying 5G FOGRAN.

1.2 Tackled problem, main contributions and organization of the paper

By referring to the 5G FOGRAN scenario of Fig. 1, the topic of this contribution is the settable-complexity energy-efficient dynamic management of the MPTCP bandwidth for the QoS-constrained pre-copy live migration of VMs. The target is the minimization of the network energy wasted by each wireless device for the migration of its own VMs under six hard constraints, which are typically induced by the desired implementation complexity-vs.-performance tradeoff.

Specifically, the first constraint regards the settable implementation complexity of the proposed bandwidth manager. In order to (shortly) introduce it, we point out that the pre-copy migration technique requires, by design, that the memory footprint of the VM is migrated over a number of time rounds 9, whose transmission rates may be optimized at a finer or coarser granularity, in order to save networking energy (see Fig. 4 in the sequel). A per-round optimization of the migration rates leads to the maximum saving of the network energy, but also entails the maximum implementation complexity of the resulting bandwidth manager. Hence, the first constraint concerns the fact that, in order to limit the resulting implementation complexity, only a subset of the available migration rates is allowed to be dynamically updated.

The second and third constraints limit in a hard way the total migration time and downtime (i.e., the service interruption time) of the migrating VM. These constraints make it possible to support delay and delay-jitter sensitive applications run by the migrating VM. The fourth constraint limits the maximum bandwidth available for the VM migration and is enforced by the bandwidth allocation policy implemented by the 5G Network Processor of Fig. 1. The fifth constraint upper bounds the maximum slowdown (i.e., the maximum stretching of the execution time) that is tolerated by the migrated application. Finally, the last constraint enforces the convergence of the iteration-based migration process by introducing a hard bound on the feasible speed-up factor, i.e., the minimum ratio between the volumes of data migrated over two consecutive rounds (see also Fig. 4 in the sequel).
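To fix ideas, the following sketch summarizes the structure of the resulting constrained problem; all symbols (the updated-rate set U, the bounds on migration time, downtime, bandwidth, slowdown and speed-up) are placeholder names introduced here for illustration only, not the paper's own notation, which is developed in Sections 3-6:

```latex
\min_{\{R_i:\; i \in \mathcal{U}\}} \;\; \mathcal{E}_{\mathrm{mig}}\bigl(\{R_i\}\bigr)
\quad \text{subject to} \quad
\begin{cases}
T_{\mathrm{TOT}} \le \Delta_{\mathrm{TOT}} & \text{(total migration time)}\\
T_{\mathrm{DT}} \le \Delta_{\mathrm{DT}} & \text{(downtime)}\\
R_i \le R_{\max}, \;\; \forall i & \text{(allotted migration bandwidth)}\\
\text{slowdown} \le \delta_{\max} & \text{(tolerated application slowdown)}\\
V_{i-1}/V_i \ge \sigma_{\min}, \;\; \forall i \ge 1 & \text{(per-round speed-up, i.e., convergence)}\\
|\mathcal{U}| = Q & \text{(settable implementation complexity)}
\end{cases}
```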

In light of these facts, we anticipate that the major contributions of this paper are the following ones:

  1. Since MPTCP is a novel paradigm that is not yet fully standardized, various Congestion Control (CC) algorithms are currently proposed in the literature, in order to meet various application-dependent tradeoffs among the contrasting targets of fairness, quick responsiveness and stable behavior 10. Hence, since the energy-vs.-transport rate profiles of the device-to-fog connections of Fig. 1 may also depend on the specifically adopted CC algorithm, we carry out a formal analysis of the power and energy consumptions of a number of MPTCP CC algorithms under the 5G FOGRAN scenario of Fig. 1. This is done for both cases of unbalanced and balanced MPTCP connections. The key result of this analysis is a unified parametric formula for the consumed dynamic power-vs.-transport rate profile that applies to all the considered MPTCP CC algorithms, as well as to the NewReno CC algorithm of the standard SPTCP.

  2. On the basis of the obtained power profile of the wireless MPTCP connections, we develop a related delay-vs.-energy analysis, in order to formally characterize the operating conditions under which VM migration over the 5G FOGRAN of Fig. 1 actually allows the mobile device to save energy. Interestingly, the carried out analysis jointly accounts for the CPU and WNIC energies, as well as for the migration time, computing time and migration bandwidth.

  3. By leveraging the obtained formulae for the energy-vs.-transport rate profile of the MPTCP wireless connections, we formalize the problem of the minimum-energy live migration of VMs over the 5G FOGRAN of Fig. 1. For this purpose, we formulate the problem of the minimum energy consumption of the wireless device under limited migration time and downtime in the form of a settable-complexity (non-convex) optimization problem, i.e., the minimum-energy Settable Complexity Bandwidth Manager (SCBM) optimization problem. Its formulation is general enough to embrace both MPTCP and SPTCP-based 5G FOGRAN scenarios, as well as in-band and out-band live migrations of VMs.

  4. The state of the utilized MPTCP connection of Fig. 1 may be subject to fast (and, typically, unpredictable) fluctuations, which may be induced by: (i) channel fading; (ii) device mobility; and, (iii) the time-varying behavior of the migrating application. Hence, motivated by this consideration, we develop an adaptive version of the resulting SCBM, which is capable of quickly reacting to the fluctuations of the state of the underlying MPTCP connection without requiring any form of (typically, unreliable and error-prone) forecasting.

  5. We present the results of extensive numerical tests, in order to check and compare the actual energy-vs.-migration delay performance of the proposed adaptive MPTCP-based SCBM. Overall, we anticipate that the carried out tests lead to two main insights. First, the implementation complexity of the SCBM increases linearly with the (aforementioned) number of updated transmission rates. At the same time, both the energy consumption and the time of convergence to the steady state decrease for increasing values of this number and approach their global minima at quite low values of it. This supports the conclusion that the proposed adaptive SCBM is capable of attaining a good performance-vs.-implementation complexity tradeoff. Second, in the carried out tests, the energy savings of the proposed SCBM over the state-of-the-art solution in 9, currently implemented by legacy Xen and KVM-based hypervisors 11, 12, are typically over 25%, and reach 70% under strict limits on the allowed downtimes.

A last contribution concerns the MPTCP-vs.-SPTCP energy performance under the considered hard constraints on the migration times and downtimes. In principle, due to the simultaneous powering of multiple WNICs, the network power required to sustain an MPTCP connection is (obviously) larger than the one needed by the corresponding SPTCP connection. However, due to the bandwidth aggregation effect, the opposite conclusion holds for the resulting migration time. Hence, since the migration energy equals the power-by-migration time product, an interesting and still open question 13, 14 concerns the actual energy reductions possibly attained by the MPTCP paradigm over the SPTCP one under the migration scenario of Fig. 1. In this regard, we point out that:

  1. a last set of carried out numerical tests stresses that the energy savings of the MPTCP connections with respect to the corresponding SPTCP ones may be significant in supporting the proposed SCBM. Specifically, the tested savings may reach 30% and 70% with respect to SPTCP setups that utilize only WiFi and 4G connections, when the tested migration scenario meets at least one of the following operating conditions: (i) the size of the migrating VM is quite large (say, of the order of some tens of Megabits); (ii) the tolerated downtimes are low (e.g., less than 200 ms); (iii) the dirty rate (that is, the rate of memory-write operations) of the migrated application is not negligible (e.g., the dirty rate-to-available migration bandwidth ratio is larger than 10%).

The rest of this paper is organized as follows. After a review of the main related work in Section 2, in Section 3 we shortly summarize the basic features of pre-copy live migration, in order to point out its inherent failure-recovery capability. By leveraging the unified characterization of the power-vs.-rate profile of the currently considered MPTCP CC algorithms developed in Section 4, in Section 5 we develop a delay-vs.-energy analysis that aims to formally characterize the operating conditions under which VM migration should be performed by the wireless device. Sections 6 and 7 are devoted to the formal presentation of the afforded constrained minimum-energy SCBM optimization problem and its feasibility conditions, and to the pursued solving approach, respectively. Afterward, in Section 8, we develop the adaptive implementation of the proposed SCBM and discuss a number of related implementation aspects. Section 9 is devoted to the presentation of the carried out numerical tests and performance comparisons under synthetic/real-world applications and static/dynamic MPTCP wireless connections. The conclusive Section 10 reviews the main presented results and points out some hints for future research. Auxiliary analytical results are provided in the final Appendices.

Regarding the main adopted notation, a dedicated superscript denotes vector-valued parameters, a dedicated symbol indicates a definition, and the natural logarithm and the Kronecker delta are used with their standard meanings (the latter equals one when its two arguments coincide and zero otherwise). Table 2 lists the main acronyms used in the paper.

Acronym Description
CC Congestion Control
FC Fog Computing
MPTCP MultiPath TCP
PeCM Pre-copy Migration
PoCM Post-copy Migration
RAN Radio Access Network
RAT Radio Access Technology
SaCM Stop-and-Copy Migration
SCBM Settable Complexity Bandwidth Manager
SPTCP SinglePath TCP
VM Virtual Machine
WNIC Wireless Network Interface Card
Table 2: List of the main acronyms used in the paper.

2 Related work

During the last years, the utilization of live VM migration as a network primitive for attaining resource multiplexing, failure recovery and (possibly) energy reduction in centralized/distributed wired/wireless environments has received increasing attention from both industry and academia. An updated review of the main proposed solutions, open challenges and research directions is provided, for example, in 5. However, to the best of the authors' knowledge, the wireless bandwidth management problem afforded by this paper under the 5G FOGRAN scenario of Fig. 1 deals with a still largely unexplored topic, as also confirmed by an examination of 5 and the references therein. In fact, roughly speaking, the published work most related to the afforded topic embraces four main research lines, namely: (i) the management of the network resources for live VM migration; (ii) the utilization of MPTCP for live VM migration; (iii) the analysis and test of the MPTCP throughput in wireless/mobile scenarios; and, (iv) the development of middleware platforms for the support of live VM migration.

Regarding the first research line, the current solution for the management of the migration bandwidth implemented by state-of-the-art hypervisors (like, for example, Xen, VMware and KVM) is the heuristic one firstly proposed in 9. According to this heuristic solution, the migration bandwidth is linearly increased over consecutive migration rounds up to its maximum allowed value. Since the (single) goal of this heuristic is to reduce the final downtime, it completely neglects any energy-related performance index and does not enforce any constraints on the total migration time and/or downtime. Both these aspects are, indeed, accounted for by the authors of the (recent) contribution in 15. However, this last contribution focuses only on intra-data center VM migrations over cabled Ethernet-type connections under the legacy single-path NewReno TCP. As a consequence, the solution in 15: (i) does not allow trading off the implementation complexity against the resulting energy performance; and, (ii) implements, by design, an adaptive mechanism that mainly aims to attain stable behavior in the steady state, instead of quick reaction to the abrupt mobility-induced changes of the state of the underlying wireless connection. As a consequence of the different design criteria pursued in 15, we anticipate that, in the considered wireless application scenario of Fig. 1, the energy reductions attained by the here proposed adaptive SCBM over the solution in 15 may be significant (e.g., up to 50% under some harsh wireless scenarios).

Recent contributions along the (previously mentioned) second research line are reported in 16, 17, 18. Specifically, the authors of 16 propose a nearby VM-based cloudlet for boosting the performance of real-time resource-stressing applications. In order to reduce the delay induced by the service initiation time, the authors of 16 develop an inter-cloudlet MPTCP-supported migration mechanism that exploits the predicted mobility patterns of the served mobile devices in a proactive way. Interestingly, almost zero downtimes are experienced at the destination cloudlets after the VM migrations are completed. The contribution in 17 proposes an adaptive queue-management scheduler, in order to exploit at the maximum the bandwidth aggregation capability offered by MPTCP when intra-data center live migrations of VMs must be carried out under delay constraints. The topic of 18 is the exploitation of the aggregated bandwidth offered by MPTCP connections for the inter-fog wireless migration of Linux containers. For this purpose, a suitable version of the so-called Checkpoint/Restore In User-space (CRIU) migration technique is developed and its delay performance is tested through a number of field trials. Overall, like our paper, the focus of all these contributions is on the effective exploitation of the MPTCP bandwidth aggregation capability, in order to reduce the VM migration times. However, unlike our paper, these contributions: (i) do not address the energy-related aspects; and, (ii) do not consider the optimal management of the migration bandwidth under hard QoS-induced delay constraints.

Testing the MPTCP throughput under wireless/mobile scenarios is the main topic of the contributions in 13, 14, 19, 20, 21. Specifically, the authors of 13 investigate the interplay between traffic partitioning and bandwidth aggregation under various MPTCP-supported wireless scenarios and, then, propose a per-component energy model, in order to quantify the energy consumed by the device's CPU during the migration process. The focus of 19 is on the 3G/WiFi handover under MPTCP and the related energy measurements, while 14 and 20 report the results of a number of tests on the delay-vs.-energy MPTCP performance under different flow sizes and mobility scenarios. Lastly, 21 proposes a power model for MPTCP and tests its energy consumption under different sizes of the files to be transferred. Overall, like our contribution, the common topic of all these papers is the measurement of the power/energy performance of MPTCP connections under wireless/mobile outdoor scenarios. However, unlike our work, the focus of these papers is not on the live migration of VMs and/or on the effect of the various MPTCP CC algorithms on the energy performance of the underlying MPTCP connections.

Finally, 3 reports a quite comprehensive overview of a number of recent contributions whose common topic is the development of software middleware platforms for the real-time support of traffic offloading. Among these contributions, we (shortly) review those reported in 22, 23, 24, 25, 26, 27, 28, 29, which best match the considered virtualized scenario of Fig. 1. A synoptic examination of these contributions allows us to identify the basic building blocks and related data paths that are shared by all of them, as sketched by the reference virtualized architecture of Fig. 2.

Figure 2: The reference virtualized architecture considered for the middleware support of traffic offloading.

Hence, by referring to Fig. 2, the MAUI 22 and CloneCloud 23 frameworks develop and prototype in software two middleware platforms, that allow mobile devices to offload their virtualized tasks directly to public cloud data centers through cellular/WiFi connections. The common feature of the middleware virtualized platforms developed in 24, 25, 26, 27 is to be cloudlet-oriented. This means that the middleware layers of all these platforms rely on single-hop WiFi-based links, in order to perform fine/coarse-grained traffic offloading to nearby small-size data centers, generally referred to as cloudlets. The Mobile Network Operator Cloud (MNOC) is, indeed, the common focus of the contributions in 28, 29. These contributions consider a framework in which mobile devices exploit cellular 3G/4G connections, in order to offload their virtualized tasks to data centers that are directly managed by mobile network operators. Overall, like our work, all these contributions consider virtualized middleware-layer platforms for the resource augmentation of resource-limited wireless devices. However, unlike our work, the main focus of these contributions is on the design and implementation of middleware-layer software for the support of traffic migration, and they do not consider the problem of the delay-constrained and energy-efficient optimized management of the migration bandwidth.

3 Basic live migration techniques – A short overview

Live VM migration allows a running VM to be transferred between different physical machines without halting the migrated application. In principle, there are three main techniques for live VM migration, namely, Stop-and-Copy Migration (SaCM), Pre-Copy Migration (PeCM), and Post-Copy Migration (PoCM). They trade off the volume of migrated data against the resulting downtime. These techniques rely on the implementation of at least one of the following three phases 5:

  • Push phase: the source device transfers to the destination device the memory image (i.e., the RAM content) of the migrating VM over consecutive rounds. To ensure consistency, the memory pages modified (i.e., dirtied) during this phase are re-sent over multiple rounds;

  • Stop-and-copy phase: the VM running at the source device is halted and the lastly modified memory pages are transferred to the destination;

  • Pull phase: the migrated VM begins to run on the destination device. From time to time, the access to the memory pages still stored by the source device is accomplished by issuing page-fault interrupts.

By design, the SaCM technique utilizes only the stop-and-copy phase. This guarantees that the volume of the migrated data equals the memory size of the migrated VM, but it generally induces long downtimes 5. The PeCM technique implements both the push and stop-and-copy phases of the migration process. It guarantees finite migration times, tolerable downtimes and robustness against the (possible) failures of the communication link. However, it induces overhead in the total volume of the migrated data, which may be substantial under write-intensive applications 5. The PoCM technique is composed of the stop-and-copy and pull phases of the migration process. Since only the I/O and CPU states of the source device are transferred to the destination device during the initial stop-and-copy phase, the experienced downtime is limited and no data overhead is induced. However, the resulting total migration time is, in principle, unbounded and, due to the page-fault interrupts issued during the pull phase, the slowdown experienced by the migrated application may be substantial. We anticipate that the optimal bandwidth manager developed in this paper may be applied under all the mentioned migration techniques. However, in order to streamline its presentation, in the sequel, we focus on PeCM as the reference technique. The main reasons behind this choice are that: (i) PeCM is the migration technique currently implemented by a number of commercial hypervisors, such as Xen, VMware and KVM; and, (ii) the bandwidth management framework of the PeCM technique is general enough to embrace those featured by the SaCM and PoCM techniques.

Figure 3: The six stages of the Pre-Copy Migration (PeCM) technique.

From a formal point of view, the PeCM technique is the cascade of the six stages reported in Fig. 3, namely:

  1. Pre-migration: the VM to be migrated is built up on the source device and the destination machine is selected on the destination server.

  2. Reservation: the computing/communication/storage/memory physical resources are reserved at the destination server by instantiating a large enough VM container.

  3. Iterative pre-copy: this stage is composed of multiple memory-copy rounds. During the initial round (i.e., round #0), the full memory content of the migrating VM is sent to the destination server. During the subsequent rounds (i.e., from round #1 onward), the memory pages modified (i.e., dirtied) during the previous round are re-transferred to the destination server (see Fig. 4).

  4. Stop-and-copy: the migrating VM is halted and a final memory-copy round is performed (see Fig. 4).

  5. Commitment: the destination server notifies the source that it has received a correct copy of the migrated VM.

  6. Re-activation: the I/O resources and the IP address are re-attached to the migrated VM on the destination server.

Figure 4: An illustrative time-chart of the iterative pre-copy scheme of the PeCM technique.

3.1 Formal definition of the migration delays

From a formal point of view, the total migration time (s) is the overall duration:

(1)

of the (aforementioned) six stages of Fig. 3, while the downtime:

(2)

is the time required for the execution of the corresponding last three stages. From an application point of view, the total migration time in Eq. (1) is the time over which the source and destination servers must be synchronized, while the downtime in Eq. (2) is the period over which the migrating VM is halted and the user experiences a service outage. Let the migration bandwidth (Mb/s) be the transmission rate (measured at the Transport layer) during the third and fourth stages of the migration process. Since, by definition, only the stop-and-copy time and the memory migration time depend on this bandwidth, while all the remaining migration times in Eqs. (1) and (2) play the role of constant parameters, in the sequel, we focus on the evaluation of the (already defined) stop-and-copy time and the resulting memory migration time, formally defined as:

(3)

Hence, the memory migration time is the time needed for completing the memory transfer of the migrating VM, i.e., the overall duration of the performed memory-copy rounds of Fig. 4.

Since the PeCM technique performs the iterative pre-copy of dirtied memory pages over consecutive rounds (see Fig. 4), let the per-round data volume (Mb) and round duration (s) denote the volume of the migrated data and the time duration of each migration round of Fig. 4, respectively. By definition, the volume and duration of the initial round are the memory size (Mb) of the migrating VM and the time needed for migrating it during that round, respectively (see the leftmost part of Fig. 4). Hence, after indicating by (Mb/s) the (average) memory dirty rate of the migrating application (i.e., the per-second average number of memory bits that are modified by the migrating application), from the reported definitions we have that:

(4)

with , and also:

(5)

with . As a consequence, we have that (see Eq. (3)):

(6)

and then (see Eq. (5)):

(7)
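As a concrete illustration of the round-by-round relations in Eqs. (4)-(7), the following minimal Python sketch computes the per-round migrated volumes and durations of the pre-copy stage from the VM memory size, the average dirty rate and a given sequence of per-round migration bandwidths. The function and variable names are illustrative placeholders, not the paper's notation.

```python
def precopy_rounds(mem_size_mb, dirty_rate_mbps, rates_mbps):
    """Per-round volumes V_i (Mb) and durations T_i (s) of the pre-copy stage.

    rates_mbps[i] is the aggregated migration bandwidth used during round i;
    round 0 copies the full memory image and the last entry of rates_mbps is
    used for the final stop-and-copy round (illustrative convention)."""
    volumes, durations = [], []
    volume = mem_size_mb                      # round 0 migrates the whole VM memory image
    for rate in rates_mbps:
        duration = volume / rate              # T_i = V_i / R_i, cf. Eq. (5)
        volumes.append(volume)
        durations.append(duration)
        volume = dirty_rate_mbps * duration   # data dirtied while round i was running, cf. Eq. (4)
    memory_migration_time = sum(durations)    # cf. Eq. (6)
    stop_and_copy_time = durations[-1]        # duration of the final stop-and-copy round
    return volumes, durations, memory_migration_time, stop_and_copy_time

# Example: 1024 Mb VM, 50 Mb/s dirty rate, five pre-copy rounds at 200 Mb/s
# followed by a 400 Mb/s stop-and-copy round.
print(precopy_rounds(1024.0, 50.0, [200.0] * 5 + [400.0]))
```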

In order to speed up the paper readability, the main used symbols and their meanings are listed in Table 3.

Symbol Meaning
Number of WNICs equipping the wireless device
Maximum number of migration pre–copy rounds
Round index, with
Integer-valued path-index of the MPTCP connection, with
Integer-valued iteration index
 (Mb/s) Average memory dirty-rate of the migrated application
 (Mb/s) Migration bandwidth utilized by the MPTCP
 (W) Dynamic network power consumed by the MPTCP
 (Mb) Memory size of the migrated VM
 (s) Maximum tolerated migration time
 (s) Maximum tolerated downtime
 (Mb/s) Maximum migration bandwidth of the MPTCP
 (J) Total energy consumed by the mobile device for the parallel setup of its WNICs
 (J) Total setup-plus-dynamic network energy consumed by the mobile device for the VM migration
Table 3: Main symbols used in the paper and their meanings.

4 Power and energy analysis of MPTCP wireless connections

With reference to the 5G FOGRAN environment of Fig. 1, the main goal of this section is to develop a formal analysis of the dynamic power-vs.-transport rate profile of the MPTCP connections that accounts (in a unified way) for the effects of the various CC algorithms currently considered in the literature 10. We anticipate that the final formulae of the carried out analysis will be directly employed in Section 6, in order to formally define the objective function of the considered SCBM problem. In order to put the performed analysis into the right reference framework and speed up its development, in the next two sub-sections, we shortly review some basic architectural features of the MPTCP protocol stack and the related CC algorithms.

4.1 A short review of some key features of the MPTCP protocol stack

Fig. 5 sketches the main components of the MPTCP-compliant protocol stack implemented by a wireless device that is equipped with heterogeneous WNICs 2. Each WNIC possesses its own IP address at the Network layer (both the IPv4 and IPv6 Internet protocols may run under the MPTCP layer of Fig. 5 2). The MPTCP layer is a backward-compatible modification of the standard SPTCP that allows a single Transport-layer data flow (i.e., a single Transport-layer connection) to be split across the paths (also referred to as subflows) made available by the underlying Network layer. For this purpose, the Application and MPTCP layers communicate through a (single) socket that is labeled by a single port number 2. After performing the connection setup through the exchange of SYN/ACK segments that carry the MP_CAPABLE option, the running application opens the socket, which, in turn, starts the Slow-Start phase of a first TCP subflow. If required by the application socket, more subflows can be dynamically added (resp., removed) later by issuing the MP_JOIN/ADD_ADDR commands of the MPTCP protocol. Each opened subflow is pinned to its own WNIC and labeled by the corresponding IP address (see Fig. 5).

Figure 5: A sketch of the MPTCP protocol stack.

The MPTCP protocol uses two levels of sequence numbers for guaranteeing the in-order delivery of the transported segments, namely, a single connection-level sequence number and a set of subflow-level sequence numbers. The connection-level sequence number is the data sequence number utilized by the running application. When the MPTCP sender starts to transmit in parallel over multiple subflows, each connection-level sequence number is suitably mapped onto a corresponding subflow sequence number. In doing so, each subflow can send its own data and process the corresponding ACK messages as a regular stand-alone SPTCP connection. The MPTCP receiver uses the connection-level sequence numbers for orderly reassembling the received subflows. An arrived data segment is marked as “in-sequence” if both its subflow and connection sequence numbers are in order.

Out-of-sequence segments are temporarily stored in a connection-level buffer that equips the receive side of the MPTCP connection. This buffering introduces, in turn, harmful queue-induced delivery delays that, in some limit cases, may fully offset the corresponding reduction in the transport delays gained by using MPTCP. Hence, out-of-sequence phenomena should be avoided as much as possible.
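The following toy Python sketch (with invented names, not taken from any real MPTCP implementation) illustrates the connection-level reassembly just described: a segment may be in order on its own subflow and still be buffered at the connection level until the data sequence number (DSN) gap is filled.

```python
class ConnectionLevelReassembler:
    """Toy model of the receive-side, connection-level reordering of MPTCP."""

    def __init__(self):
        self.next_dsn = 0        # next connection-level (data) sequence number expected
        self.buffer = {}         # out-of-sequence segments, keyed by their DSN

    def on_segment(self, dsn, payload):
        """Store the segment and release the longest in-order prefix to the app."""
        self.buffer[dsn] = payload
        delivered = []
        while self.next_dsn in self.buffer:
            data = self.buffer.pop(self.next_dsn)
            delivered.append(data)
            self.next_dsn += len(data)          # advance by the delivered bytes
        return b"".join(delivered)

# Segments arriving out of connection-level order (e.g., from a slower subflow)
# stay buffered until the missing DSN range arrives.
rx = ConnectionLevelReassembler()
assert rx.on_segment(4, b"WORLD") == b""        # buffered: DSN 0..3 still missing
assert rx.on_segment(0, b"HELL") == b"HELLWORLD"
```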

4.2 A synoptic comparison of the MPTCP Congestion Control algorithms

After exiting the setup phase, the MPTCP connection enters the Slow Start (SS) phase. During this phase, each active subflow works independently of the others and applies the same SS algorithm as a regular SPTCP flow 2.

At the end of the SS phase, the MPTCP connection enters the Congestion Avoidance (CA) state, i.e., the steady state. During this state, the behavior of the MPTCP connection is managed by the corresponding CC algorithm, whose specific features make MPTCP very different from the regular SPTCP. To begin with, the MPTCP sender implements and updates in parallel multiple mutually interacting Congestion WiNDows (CWNDs) for controlling the local traffic over each path, whilst the MPTCP receiver uses a single receive window 2. The (possibly coupled) updating of the CWNDs at the sender side is dictated by the actually adopted CC algorithm. A family of different MPTCP CC algorithms has been proposed over the last years, each one featured by a specific tradeoff among the contrasting targets of fair behavior, quick responsiveness in the transient state and stable behavior in the steady state (see 10 for a formal presentation of this specific topic).

However, according to the analysis presented in 10, the basic control mechanism implemented by the currently proposed CC algorithms may be shortly described as follows. Let , , be the size (measured in Mb) of the CWND of the -th subflow and let:

(8)

be the resulting size of the total CWND. On each subflow , the MPTCP source increases (resp., decreases) the corresponding -th CWND by a number of Megabits equal to: (resp., ) at the return of each ACK message (resp., at the detection of each segment loss). These increments and decrements are dictated by the following general expressions 10:

(9)

for each ACK on the -th subflow, and

(10)

for each segment loss on the -th subflow.

After denoting by the round-trip-time (measured in seconds) of the -th subflow, let: (Mb/s) be the corresponding net transmission rate (e.g., the throughput), so that:

(11)

is the resulting total transmission rate (i.e., the total throughput) of the overall MPTCP connection. Hence, as detailed in Table 4 10, the specific feature of each CC algorithm depends on the way in which the increment/decrement terms in (9) and (10) are actually computed. For comparison purposes, the last row of Table 4 reports the basic CC features of the commodity NewReno SPTCP.

Acronym of the CC algorithm Expression of Expression of Auxiliary notations
EWTCP 30
Semicoupled MPTCP 31 ,
Max MPTCP 32
Balia MPTCP 10 ;  
NewReno SPTCP
Table 4: Expressions of the increments/decrements of the per-flow congestion windows of some MPTCP CC algorithms 10. SPTCP maintains, by design, a single subflow.

Detailed analyses and comparisons of the fairness/responsiveness/stability properties of the MPTCP CC algorithms of Table 4 are reported in 10 and, then, they will not be replicated here. We limit ourselves to remarking that: (i) all the considered CC algorithms react to per-subflow segment loss events by halving the current size of the corresponding per-subflow congestion window; (ii) the EWTCP algorithm applies the NewReno SPTCP increment policy on each subflow independently. For this reason, as also pointed out in 10 and 32, it is the most reactive in coping with the abrupt changes typically affecting the state of wireless/mobile MPTCP connections; (iii) the Semicoupled, Max and Balia algorithms introduce various forms of coupling in updating the sizes of the congestion windows, in order to be fairer (like the Balia CC algorithm) and/or more stable in the steady state (like the Semicoupled CC algorithm). Hence, these last CC algorithms appear to be more suitable for managing wired (possibly, multi-hop) MPTCP connections (see 2 and Section V of 10).
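For illustration, the per-ACK/per-loss control skeleton shared by the algorithms of Table 4 can be sketched as follows (Python, with window sizes expressed in segments for simplicity). Only the uncoupled, NewReno-like increment used by EWTCP on each subflow is shown, since the coupled increments of the Semicoupled, Max and Balia algorithms additionally depend on the other subflows' windows and round-trip times; this is a sketch, not the paper's own pseudo-code.

```python
def on_ack(cwnd, r):
    """Congestion-avoidance increment on subflow r at each returning ACK.

    EWTCP-style (uncoupled) rule: each subflow applies a NewReno-like
    additive increase of roughly one segment per round-trip time."""
    cwnd[r] += 1.0 / cwnd[r]
    return cwnd

def on_loss(cwnd, r):
    """All the CC algorithms of Table 4 halve the window of the subflow
    that detected the segment loss (multiplicative decrease)."""
    cwnd[r] = max(cwnd[r] / 2.0, 1.0)
    return cwnd

# Two subflows, e.g., WiFi (index 0) and 4G (index 1).
cwnd = [10.0, 10.0]
cwnd = on_ack(cwnd, 0)    # the WiFi subflow grows independently of the 4G one
cwnd = on_loss(cwnd, 1)   # a loss on the 4G subflow halves only its own window
```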

4.3 Unified analysis of the power-vs.-rate performance of the MPTCP under 5G scenarios

The total network energy (J) consumed by the multiple WNICs equipping the wireless device of Fig. 1 during each VM migration process is the summation of a static (i.e., setup) part and a dynamic part. By design, the setup part does not depend on the set of the per-subflow transmission rates and equals the summation:

(12)

of the setup energies needed by the involved WNICs to establish and maintain the MPTCP connection 14. The dynamic portion of the consumed energy depends on the per-subflow transmission rates in (11) through the corresponding total dynamic power (W), that, in turn, is given by the following summation:

(13)

where , , is the dynamic power consumed by the -th WNIC for sustaining its transport rate (see Eq. (11)). Now, the key point to be remarked is that the (aforementioned) MPTCP CC algorithms may introduce coupling effects, so that each may be a function of the overall spectrum: of transport rates, with the analytical form of the function that may depend, in turn, on both the subflow index and the actually considered CC algorithm. Interestingly, a suitable exploitation of the expressions of Table 4 allows us to arrive at the following unified formula for the profile of the total dynamic power in (13) as a function of transport rates (see the Appendix A for the derivation):

(14)

In Eq. (14), the expressions of the involved coefficients depend on the considered CC algorithms. They are detailed in Table 5, where: (i) the maximum size of an MPTCP segment is measured in Mb; (ii) the power exponent is dimensionless; (iii) the involved constant is given by the fourth column of Table 4 above; and, (iv) the noise power-to-coding gain ratio of the transmission path sustained by each WNIC is, according to Eq. (63) of the Appendix A, formally defined as follows:

(15)

where the involved (dimensionless) quantity is the segment loss probability experienced by the corresponding subflow.

Acronym of the CC algorithm Expression of Expression of Expression of
EWTCP 30
Semicoupled MPTCP 31
Max MPTCP 32
Balia MPTCP 10
NewReno SPTCP
Table 5: Expressions of the parameters involved by the unified power-vs.-rate formula in (14) for the MPTCP.

Regarding the derived dynamic power-vs.-transport rate formula in (14), two main remarks are in order. First, it is composed of a linear superposition of power-law terms, each one involving the transport rate of an MPTCP subflow. This power-like form is compliant, indeed, with the results reported by a number of measurement/test-based studies 13, 14, 20, 21, 33, 34, 35. Second, the key features of our formula in (14) are that: (i) it holds in general, i.e., regardless of the specifically considered CC algorithm; and, (ii) its parametric form allows us to formally account in a direct way for the effect induced on the consumed power by the actually considered CC algorithm (see Table 5).

4.4 The case of load-balanced MPTCP connections

A direct inspection of Eqs. (14) and (15) leads to the conclusion that, when the subflows are load balanced, i.e.:

(16)

then, the power-rate formula in (14) reduces to the following monomial one:

(17)

In Eq. (17), the involved power coefficient takes the formal expressions reported in Table 6 for the (above considered) CC algorithms.

Acronym of the CC algorithm Expression of (W/(Mb/s))
EWTCP 30
Semicoupled MPTCP 31
Max MPTCP 32
Balia MPTCP 10
NewReno SPTCP
Table 6: Expressions of the power coefficient in Eq. (17) for the case of load-balanced MPTCP.

The expression in (17) is central for the future developments of our paper, and merits, indeed, three main remarks.

First, from a formal point of view, an examination of the general expressions reported in the third column of Table 5 points out that the equal-balanced condition in (16) is met when the values assumed by the corresponding per-subflow products do not depend on the subflow index under the EWTCP and Balia MPTCP (resp., under the Semicoupled and Max MPTCP). The measurement-based studies reported, for example, in 14, 19, 21, 32, 33, 34 support the conclusion that, in practice, these formal conditions are quite well met under 3G/4G cellular and WiFi connections, mainly because the average segment loss probability (resp., the average round-trip-time) of WiFi connections is typically one order of magnitude larger (resp., lower) than the corresponding one of 3G/4G cellular connections.

Second, an inspection of the expressions of Table 6 leads to the conclusion that, for the considered values of the power exponent, the power coefficient in (17) scales down for an increasing number of available load-balanced subflows as:

(18)

In the sequel, we refer to the scaling-down behavior in (18) as the “multipath gain” of the equal-balanced MPTCP. Intuitively, it arises from the fact that, for power exponents larger than one, the dependence of the dynamic power in (14) on each subflow transport rate is of convex type, so that equal-balanced transport rates reduce the resulting total power consumption.
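As a sanity check, the multipath gain in (18) can be re-derived from first principles under an assumed monomial per-subflow power profile; the symbols n, k, gamma and R_tot below are illustrative placeholders for the paper's (omitted) notation:

```latex
% n load-balanced subflows, each carrying R_tot/n and consuming k * rate^gamma (gamma > 1):
P_{\mathrm{dyn}}
  = \sum_{r=1}^{n} k \left(\frac{R_{\mathrm{tot}}}{n}\right)^{\gamma}
  = \frac{k}{n^{\gamma-1}}\, R_{\mathrm{tot}}^{\gamma}
  \;\triangleq\; \Omega(n)\, R_{\mathrm{tot}}^{\gamma},
\qquad
\Omega(n) = \frac{\Omega(1)}{n^{\gamma-1}} .
```

Hence, at a fixed total rate, the power coefficient shrinks as the number of balanced subflows grows: there is no gain for a linear profile (exponent equal to one), and the gain grows with the convexity of the profile.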

Third, unbalanced transport rates increase the chance of out-of-sequence segment arrivals at the receive end of the MPTCP connection, which, in turn, give rise to harmful queue-induced delivery delays (see the last remark of the above Section 4.1).

Overall, motivated by the above remarks, in the sequel, we directly focus on the equal-balanced case and, then, exploit the expression in (17) for the formal characterization of the resulting total migration energy. In this regard, we anticipate that the topic of the on-line (i.e., real-time) profiling of the parameters present in (17) will be addressed in Section 8.1, whilst some additional remarks on the case of unbalanced MPTCP connections will be reported in Section 10.

5 To migrate or not to migrate – A delay-vs.-energy analysis

The two-fold task of the migration module at the mobile device of Fig. 2 is to plan: (i) “when” (i.e., under which operating conditions) to perform the migration; and, (ii) “how” to manage the planned migration. Although the focus of this paper is on the second question, in this section we shortly address the first one. In this regard, we observe that the final goal of the mobile-to-fog migration process would be to reduce as much as possible both the execution time of the migrating application and the corresponding energy consumed by the mobile device.

In the sequel, we develop a time-energy analysis of the mobile-to-fog migration costs under the considered 5G FOGRAN scenario of Fig. 1, in order to formally characterize the traded-off operating conditions under which VM migration is worthwhile.

In order to characterize and compare the local-vs.-remote execution times of the migrating VM, let the workload ratio be the ratio between the workload (bit) to be processed by the migrating VM and its (previously defined) size (bit). Hence, after indicating the processing (i.e., computing) speed (bit/s) of the mobile device, the resulting time (s) for the local execution of the VM at the mobile device equals:

(19)

However, after indicating by (bit/s) the processing speed of the device clone at the Fog node of Fig. 1, when the migration is performed, the resulting execution time (s) at the Fog node is the summation:

(20)

of the total migration time in Eq. (1) and the execution time: , of the migrated VM at the Fog node. As a consequence, the VM migration would reduce the execution time when , that is, when the corresponding migration time in Eq. (3) meets the following upper bound:

(21)

where the operator in (21) accounts for the fact that, by definition, is non-negative.

Passing to consider the energy consumptions induced by the local and remote workload processing, let (Watt) be the power consumed by the mobile device when it runs at the (aforementioned) processing speed . Hence, the energy (J) wasted by the mobile device for the local execution of the workload equates:

(22)

The corresponding energy (J) consumed by the mobile device when the VM is migrated and remotely processed at the Fog node is the summation of three contributions. The first one is the already considered energy consumed by the device-to-fog migration. The second one accounts for the idle energy: (J) consumed by the mobile device in the idle state during the remote execution of the migrated workload at the Fog node. It equates: , where (W) is the power consumed by the mobile device in the idle state. The third component accounts for the energy: (J) consumed by the mobile device, when it receives the processed workload sent back by the Fog node. It equates: , where: (i) (W) is the network power consumed by all WNICs of the mobile device under the receive operating mode; (ii) the positive and dimensionless coefficient is the relative size of the processed data, so that the product: (bit) is the size of the processed data sent back by the Fog node; and, (iii) (bit/s) is the aggregated download bandwidth used by the Fog node for the fog-to-device data transfer over the 5G FOGRAN of Fig. 1. Overall, we have that:

(23)

Therefore, the VM migration is energy saving when: , that is, when the migration energy meets the following inequality (see Eqs. (22) and (23)):

(24)

where the operator in (24) accounts for the non-negativity of . As a consequence, VM migration is time (resp., energy) efficient when the memory migration time (resp., the migration energy ) meets the upper bound in Eq. (21) (resp., in Eq. (24)).
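As a numerical illustration of the two bounds, the following Python sketch evaluates the local and remote execution times and energies of Eqs. (19), (20), (22) and (23) and returns the two resulting migrate-or-not tests. All parameter names are assumptions standing in for the paper's (omitted) symbols, and the size of the returned processed data is assumed to be measured relative to the offloaded workload.

```python
def migration_tests(workload_bits, f_dev, f_fog,
                    p_comp, p_idle, p_rx,
                    theta, r_down,
                    t_mig_total, e_mig):
    """Check whether offloading is time- and/or energy-convenient.

    workload_bits : workload to be processed by the migrating VM (bit)
    f_dev, f_fog  : processing speeds at the device and at the Fog clone (bit/s)
    p_comp        : device power while computing locally (W)
    p_idle, p_rx  : device idle and receive powers (W)
    theta         : relative size of the processed data returned by the Fog node
    r_down        : fog-to-device aggregated download bandwidth (bit/s)
    t_mig_total   : total migration time of Eq. (1) (s)
    e_mig         : network energy spent by the device for the migration (J)"""
    t_local = workload_bits / f_dev                            # Eq. (19)
    t_remote = t_mig_total + workload_bits / f_fog             # Eq. (20)
    e_local = p_comp * t_local                                 # Eq. (22)
    e_remote = (e_mig                                          # migration energy
                + p_idle * (workload_bits / f_fog)             # idle while the Fog computes
                + p_rx * (theta * workload_bits) / r_down)     # receiving the results back, Eq. (23)
    return t_remote <= t_local, e_remote <= e_local

# Example: 8 Gbit workload, device vs. a ten-times-faster Fog clone.
print(migration_tests(8e9, 1e9, 1e10, 2.0, 0.3, 1.0, 0.05, 50e6, 4.0, 6.0))
```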

An examination of the bounds in (21) and (24) allows us to address three questions about “which” VM to migrate, “when” to perform the migration and “how” to manage the (planned) migration. In this regard, three main remarks are in order.

First, we observe that the (previously defined) workload, size and feedback parameters depend only on the migrating VM and the corresponding migrated application. Hence, about the question concerning “which” VM to migrate, the lesson supported by the bounds in (21) and (24) is that the migrated VM should offer a large workload, but a quite limited size of the processed data sent back from the Fog node (i.e., a low value of the relative-size coefficient in Eq. (24)).

Second, about the question concerning “when” to perform the migration, the reported bounds point out that the 5G FOGRAN infrastructures which are more effective in supporting VM migration should be equipped with (very) fast computing servers and broad Fog-to-mobile download bandwidths (see Eq. (24)). Furthermore, the mobile devices that receive more benefit from the migration process are those equipped with power-consuming slow CPUs (i.e., low processing speeds and high computing powers; see Eqs. (21) and (24)) and power-efficient WNICs (i.e., low idle and receive powers; see Eq. (24)). Typical values of these parameters for a spectrum of mobile devices and wireless access technologies are reported, for example, in 33, 34, 35.

Finally, about the question concerning “how” to manage the planned VM migration, we observe that, in principle, under a given operating scenario (i.e., at fixed values of the bounds in Eqs. (21) and (24)), the task of the migration module of Fig. 2 is to manage the device-to-Fog migration bandwidth of Fig. 1, in order to reduce as much as possible both the memory migration time and the migration energy in Eqs. (21) and (24). However, an examination of the (previously reported) Eq. (6) points out that lower values of the memory migration time require higher migration bandwidths which, in turn, give rise to higher migration energies. This forbids the simultaneous minimization of both quantities. Hence, after observing that stretching as much as possible the battery life of the wireless devices remains a main target of the 5G paradigm, in the sequel, we approach the question concerning “how” to manage the planned VM migration by formulating and solving a settable-complexity constrained optimization problem that dynamically minimizes at run-time the migration energy, while simultaneously upper bounding (in a hard way) the resulting memory migration time in (6). In principle, the resulting SCBM provides the (formally optimal) response to the question about “how” to manage the planned migration in an energy and time efficient way.

6 The afforded settable-complexity minimum-energy bandwidth optimization problem

We now pass to formalize the afforded SCBM constrained optimization problem. The idea behind the proposed SCBM is quite simple. By referring to Fig. 4, in addition to the rates of the initial and final rounds, we update a settable number of the rates of the pre-copy rounds, which are evenly spaced over the round index set. For this purpose, we partition the round index set into non-overlapping contiguous clusters of equal size, as shown in Fig. 6.

Figure 6: A sketch of the main idea behind the proposed SCBM.

The first rate , , of each cluster is updated, while the remaining rates of the cluster are held at that value, that is, , for . Fig. 6 illustrates the resulting pattern of updated/held migration rates for the example case of and . In this example, and are the migration rates to be updated, while and are the migration rates that are held, i.e., , and .
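For the sake of concreteness, the following Python sketch spells out the update/hold rule just described; the names I_MAX, N_b and Q, as well as the sample rate values, are purely illustrative and do not reproduce the paper's notation.

```python
# Minimal sketch of the SCBM rate-clustering rule: the I_MAX pre-copy round
# indices are partitioned into N_b contiguous clusters of size Q = I_MAX/N_b;
# only the first rate of each cluster is optimized, and the remaining rates
# of the cluster are held at that same value.

def expand_cluster_rates(updated_rates, I_MAX):
    """Map the N_b updated per-cluster rates onto all I_MAX pre-copy rounds."""
    N_b = len(updated_rates)
    assert N_b >= 1 and I_MAX % N_b == 0, "I_MAX/N_b must be integer-valued"
    Q = I_MAX // N_b                      # cluster size
    per_round = []
    for r in updated_rates:               # one cluster per updated rate
        per_round.extend([r] * Q)         # held rates copy the cluster head
    return per_round

# Example: I_MAX = 6 rounds, N_b = 3 updated rates -> clusters of size Q = 2.
print(expand_cluster_rates([10.0, 8.0, 5.0], I_MAX=6))
# [10.0, 10.0, 8.0, 8.0, 5.0, 5.0]
```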

In order to formally introduce the SCBM, let be the maximum number of performed pre-copy rounds, so that the overall set of migration rates is (see Fig. 6): . Let be the integer-valued number of pre-copy migration rates we select to update and let be the resulting integer-valued size of the rate clusters (see Fig. 6). Formally speaking, and are to be selected according to the following two rules:

  • if , must be integer-valued and fall into the interval: . Furthermore, must be selected so that the resulting ratio: is also integer-valued;

  • if , the set of pre-copy rates is empty, so that we must set: and .

Under these settings, the proposed SCBM is formally defined as follows:

  • the SCBM updates the following set of migration rates: ;

  • the SCBM sets the remaining (held) migration rates as follows: , , and .

Doing so, the expression in Eq. (2) for the downtime becomes:

(25)

Furthermore, the resulting total migration time of Eq. (6) assumes the following closed-form expression under the SCBM:

(26)

As a consequence, the expression in Eq. (17) for the total energy consumed by the SCBM reads as:

(27)

Interestingly, the reported expressions in (25), (26) and (27) for the SCBM depend only on and the migration rates to be updated (see the above defined set ).

6.1 The addressed QoS-constrained minimum-energy bandwidth optimization problem

In order to properly account for the metrics commonly considered for measuring the QoS of live VM migrations, we introduce four sets of hard constraints in the formulation of the minimum-energy SCBM optimization problem.

The first two constraints enforce QoS-dictated upper bounds: (s) and (s) on the tolerated migration time and downtime , respectively. Hence, they read as:

(28)

and

(29)

where and depend on the per-round aggregated migration rates of the underlying (equal-balanced) MPTCP connection (see Eqs. (25) and (26)).

The third constraint arises from the consideration that, in principle, the pre-copy stage in Fig. 3 of the PeCM technique could run indefinitely if suitable stop conditions are not imposed. Commonly considered stop conditions account for 9: (i) the maximum number of allowed migration rounds of Fig. 4; and, (ii) the maximum tolerated value: of the ratio of the data migrated over two consecutive rounds. Hence, since the constraints in Eqs. (28) and (29) already account for through the corresponding expressions of and in Eqs. (25) and (26), we leverage Eq. (4) in order to formulate the constraint on this ratio in the following form:

(30)

where denotes the aggregated bandwidth made available by the considered MPTCP connection during the -th migration round of Fig. 3.
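As an illustration only, the following Python sketch mimics the two stop conditions just discussed, i.e., a cap on the number of pre-copy rounds and a threshold on the ratio of the data migrated over two consecutive rounds; the function and variable names (I_MAX, gamma_max, and so on) are hypothetical and not part of the paper's notation.

```python
# Hedged sketch of the commonly adopted pre-copy stop conditions:
# (i) the round budget I_MAX is exhausted, or
# (ii) the ratio of the data migrated over two consecutive rounds reaches
#      the maximum tolerated value gamma_max (copy is not converging).

def precopy_should_stop(round_idx, data_this_round, data_prev_round,
                        I_MAX, gamma_max):
    if round_idx >= I_MAX:                 # condition (i): round budget exhausted
        return True
    if data_prev_round > 0.0:
        ratio = data_this_round / data_prev_round
        if ratio >= gamma_max:             # condition (ii): ratio threshold hit
            return True
    return False
```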

The last constraint accounts for the maximum uplink bandwidth that the 5G Network Processor of Fig. 1 allocates to the requesting device for the VM migration. In principle, depending on the bandwidth allocation policy actually implemented by the 5G Network Processor, the bandwidth assigned to the device may be exclusively allotted to the migration process (i.e., out-band migration), or it may be shared with the migrating application through statistical multiplexing (in-band migration; see 15). In the first case, the bandwidth constraint reads as: , whilst, in the second case, we have: , where: (i) (Mb/s) is the uplink bandwidth that the 5G Network Processor of Fig. 1 reserves for the exclusive support of the migration process; (ii) (Mb/s) is the total aggregate bandwidth allocated by the 5G Network Processor for both the migration of the VM and the support of the migrating application; and, (iii) is the (dimensionless) fraction of the aggregate bandwidth that is reserved by the wireless device for the migration process. Hence, after introducing the auxiliary definition:

(31)

the following set of bandwidth constraints:

(32)

applies to both cases of in-band (i.e., ) and out-band (i.e., ) migration over the 5G FOGRAN of Fig. 1.
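The in-band/out-band distinction above may be rendered by the following illustrative Python sketch, in which reserved_bw, total_bw and beta are hypothetical stand-ins for the exclusively reserved uplink bandwidth, the total aggregate bandwidth and the migration-reserved fraction, respectively.

```python
# Minimal sketch of the per-round migration bandwidth cap entering the
# constraint set: out-band migration uses an exclusively reserved bandwidth,
# while in-band migration shares the aggregate bandwidth, of which only the
# fraction beta is reserved for the migration process.

def migration_bandwidth_cap(in_band, reserved_bw=None, total_bw=None, beta=None):
    """Return the maximum aggregate migration rate (Mb/s) allowed per round."""
    if in_band:
        assert total_bw is not None and beta is not None and 0.0 < beta <= 1.0
        return beta * total_bw            # shared via statistical multiplexing
    assert reserved_bw is not None
    return reserved_bw                    # exclusively allotted to the migration

# Example: in-band migration over a 40 Mb/s aggregate with 30% reserved.
print(migration_bandwidth_cap(in_band=True, total_bw=40.0, beta=0.3))  # 12.0
```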

Overall, in light of these considerations, the addressed SCBM constrained optimization problem is formally stated as follows:

(33)

with given by Eq. (27) under the considered MPTCP-supported SCBM.

In the last part of this section, we discuss some possible generalizations and refinements of the stated SCBM optimization problem.

Accounting for wireless connection failures

According to the time chart of Fig. 3 and the related text, PeCM guarantees, by design, that the migrating VM continues to run on the wireless device until the end of the Commitment phase, so that possible failures of the device-to-Fog MPTCP connection do not cause service interruption 9. This inherent robustness of the PeCM technique against connection failure events makes it the preferred candidate for performing VM migration in failure-prone wireless environments 5. However, nothing comes for free: connection failure events still waste energy resources of the wireless device. In order to formally quantify the (average) energy loss caused by failures of the used MPTCP connection, we need to characterize the connection failure probability: , formally defined as the probability that a connection failure event occurs during the migration interval, i.e., over the time: . In general, in the considered mobile Fog scenario of Fig. 1, this probability may depend on a (large) number of (possibly unpredictable) factors, such as the mobility speed and trajectory of the device, the radio coverage radius, the utilized inter-cell handover mechanism, and the statistics of the wireless fading, to cite just a few. However, the formal analysis carried out in 36 supports the conclusion that the failure probability of a mobile connection is typically well described by the following Pareto-like expression:

(34)

In the above relationship: (i) is a dimensionless non-negative shaping factor, which accounts for the handover- and/or mobility-induced heavy-tailed behavior of the connection failure probability; and, (ii) (s) is the maximum expected duration of an on-going MPTCP connection. Although it may be hard to develop general formulae for computing these parameters, we note that, in our framework, they may be profiled on-line by the 5G Network Processor of Fig. 1, which typically records the statistics of the sustained connections 8. Therefore, after profiling the connection failure probability and under the worst-case assumption that all the already migrated data are lost when a connection failure event happens, the resulting failure-induced energy loss suffered by the wireless device reads as the following product:

(35)

where is the (possibly, profiled) average number of VM migrations attempted by the wireless device of Fig. 1 during the time interval , and is the average per-migration energy consumed by the wireless device.
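The following Python sketch gives a hedged numerical rendering of the failure-induced energy loss of Eq. (35). Since Eq. (34) is only characterized here as Pareto-like, the specific form used in the code is our own illustrative assumption and should be replaced by the paper's exact expression; all names are hypothetical.

```python
# Hedged sketch of the failure-induced energy loss: the loss is the product
# of the connection failure probability, the (profiled) number of attempted
# migrations, and the average per-migration energy, under the worst-case
# assumption that all already-migrated data are lost on a failure event.

def failure_energy_loss(T_tot, dT_max, alpha, n_migrations, E_per_migration):
    """Average energy (J) lost to connection failures over the profiling window."""
    # ASSUMPTION: an illustrative Pareto-like form of the failure probability,
    # parameterized by the shaping factor alpha and the maximum expected
    # connection duration dT_max; substitute the paper's Eq. (34) here.
    p_fail = min(1.0, (T_tot / dT_max) ** alpha)
    return p_fail * n_migrations * E_per_migration

# Example with profiled values: 12 s migrations, 300 s expected connection
# lifetime, shaping factor 1.2, 5 attempted migrations of 3 J each.
print(failure_energy_loss(12.0, 300.0, 1.2, 5, 3.0))
```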

Accounting for time-varying dirty rates of the migrating application

Let us consider the case in which the migrating application performs so many memory-write operations that the resulting dirty rate changes during the migration time. Hence, after denoting by (Mb/s) and the dirty rate over the -th migration round and the resulting maximum dirty rate, respectively, let us introduce the following auxiliary definitions: , for , and . Then, after replacing , and by , and , respectively, the resulting formulation of the considered bandwidth optimization problem directly applies to the case of time-varying dirty rates.
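Computationally, the substitution just described amounts to replacing the single dirty-rate value by a per-round profile and its maximum, as in the following illustrative Python sketch (names are hypothetical).

```python
# Minimal sketch of the time-varying dirty-rate substitution: the per-round
# dirty rates replace the constant dirty rate, and their maximum replaces
# the worst-case value used in the original formulation.

def dirty_rate_profile(per_round_dirty_rates):
    """Return (per-round dirty rates, maximum dirty rate) in Mb/s."""
    w_max = max(per_round_dirty_rates)        # worst-case (maximum) dirty rate
    return list(per_round_dirty_rates), w_max

# Example: dirty rate measured over 4 pre-copy rounds.
rates, w_max = dirty_rate_profile([6.0, 9.5, 7.2, 4.1])
print(w_max)   # 9.5 Mb/s replaces the constant dirty rate in the formulation
```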

Accounting for the stretching of the execution times and memory compression of the migrating VM

During the execution of in-band VM migrations, the (previously introduced) aggregated bandwidth available at the wireless device of Fig. 1 is partially utilized for migration purposes, so that the execution speed of the corresponding migrating application may decrease due to bandwidth contention phenomena 37. In order to quantify the resulting stretching of the execution time of the migrating application, let and be the (average) execution times of the considered application in the presence and absence of VM migration, respectively. The queueing analysis reported in 37 leads to the conclusion that the execution-time stretching ratio: scales as

(36)

where is the (previously defined) fraction of the overall aggregated bandwidth that is reserved by the device for the VM migration. The above stretching ratio may be lowered by reducing the size of the migrated VM through compression coding (e.g., delta coding, run-length coding and the like). Hence, in order to simultaneously account for the contrasting effects on of both compression coding and ARQ/FEC-based error-protection coding, let (Mb) be the size of the uncompressed and uncoded VM, and let and be the compression ratio and the overall coding rate of the ARQ/FEC-based error-protection mechanisms implemented at the MAC and Physical layers of the wireless device, respectively. The actual size of the VM to be migrated is then computed as follows:

(37)
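As a hedged illustration of Eq. (37), the following Python sketch assumes that compression scales the VM size down by the compression ratio, while ARQ/FEC protection scales it up by the inverse of the overall coding rate; the exact expression of Eq. (37) should be used in place of this assumption, and the names c_r and r_cod are our own.

```python
# Illustrative computation of the effective size of the migrated VM image,
# under the ASSUMED form: size = (compression ratio * raw size) / coding rate.

def effective_vm_size(M_uncompressed_Mb, c_r, r_cod):
    """Actual size (Mb) of the VM image to be migrated."""
    assert 0.0 < c_r <= 1.0 and 0.0 < r_cod <= 1.0
    return (c_r * M_uncompressed_Mb) / r_cod   # compression shrinks, coding overhead inflates

# Example: a 512 Mb image, 0.6 compression ratio, 0.8 overall coding rate.
print(effective_vm_size(512.0, 0.6, 0.8))   # 384.0 Mb
```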

7 Feasibility conditions and optimized tuning of the maximum number of migration rounds

Regarding the feasibility conditions of the stated bandwidth optimization problem, we observe that the involved constraint functions and in Eqs. (28) and (29) strictly decrease for increasing values of the aggregate transport rate . Therefore, as a direct consequence, the following formal result holds: the SCBM optimization problem in Eq. (33) is feasible if and only if the following three conditions are simultaneously met:

(38)
(39)

and,

(40)

Interestingly, the above feasibility conditions provide some first insights into the effects of the parameters , and on the (expected) behavior of the resulting SCBM. In this regard, three main remarks are in order. First, at fixed , the expressions on the l.h.s. of Eqs. (38), (39) and (40) strictly increase (resp., decrease) for increasing values of the dirty rate (resp., the maximum transport rate ). Second, for increasing values of , the function on the l.h.s. of Eq. (39): (i) remains unchanged; (ii) decreases; and, (iii) increases, at , , and , respectively. Third, for increasing values of , the l.h.s. of Eq. (38) increases, regardless of the value assumed by the ratio . The consequences are that: (i) increasing values of are welcome, because they always reduce both the resulting total migration time and downtime; but, (ii) increasing values of lower the resulting downtimes at , whilst increasing them at .
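Since the constraint functions strictly decrease in the aggregate rate, feasibility may be checked numerically by evaluating all constraints at the bandwidth cap, as in the following illustrative Python sketch; total_time, downtime and ratio_ok are hypothetical callables standing in for Eqs. (26), (25) and (30), respectively.

```python
# Sketch of a numerical feasibility check exploiting the monotonicity noted
# above: the problem is feasible iff the migration-time, downtime and
# round-ratio constraints all hold when every per-round rate is pushed to
# the bandwidth cap R_cap.

def is_feasible(total_time, downtime, ratio_ok, R_cap, T_max, T_DT_max):
    return (total_time(R_cap) <= T_max and      # migration-time bound, Eq. (28)
            downtime(R_cap) <= T_DT_max and     # downtime bound, Eq. (29)
            ratio_ok(R_cap))                    # round-ratio condition, Eq. (30)
```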

These considerations open the door to the question of the optimized setting of . Unfortunately, this is still an open question even for state-of-the-art hypervisors, whose typically adopted application-oblivious default setting 9, 12: neither guarantees the convergence of the iterative pre-copy migration process nor assures the minimization of the overall consumed energy (see, for example, 5, 38 and references therein). However, we anticipate that the carried-out tests support the conclusion that, under the following setting of :

(41)

the resulting total migration energy consumed by the proposed SCBM typically attains its minimum. Intuitively, this is because the expression in (41) is obtained by calculating the value of that meets the feasibility constraint in (39) with strict equality. Hence, under the setting in (41), the maximum tolerated downtime is fully exploited by the SCBM. This leads to a reduction of the average transport rate utilized during the overall migration process, which lowers, in turn, the resulting total migration energy. On the basis of this consideration, in the sequel, we refer to in (41) as the optimized setting of under the proposed SCBM.
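In practice, the setting in (41) may also be recovered numerically by searching for the round budget at which the downtime-related feasibility function of Eq. (39) comes closest to the tolerated downtime without exceeding it, as in the following illustrative Python sketch; downtime_bound is a hypothetical callable standing in for the l.h.s. of Eq. (39).

```python
# Illustrative numerical search for the optimized round budget of Eq. (41):
# among the feasible integer budgets, pick the one with the smallest slack
# with respect to the tolerated downtime T_DT_max, i.e., the one at which
# Eq. (39) is (approximately) met with equality, so that the tolerated
# downtime is fully exploited.

def optimized_round_budget(downtime_bound, T_DT_max, I_upper=64):
    candidates = [(T_DT_max - downtime_bound(I), I)
                  for I in range(0, I_upper + 1)
                  if downtime_bound(I) <= T_DT_max]
    if not candidates:
        return None                    # no feasible round budget up to I_upper
    slack, I_opt = min(candidates)     # smallest slack ~ equality in Eq. (39)
    return I_opt
```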

8 Proposed Settable-Complexity Bandwidth Manager and related implementation aspects

From a formal point of view, the energy function in (27) is a superposition of power-fractional terms of the type , which involve the rate variables. Hence, is not a convex function of the transmission rates to be optimized, so that the resulting optimization problem in (33) is not convex. However, since all the rate variables to be optimized are, by design, non-negative, we may introduce the following -transformations:

(42)

so that:

(43)
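One reason such a change of variables is useful is that a generic power-type term c·R^a with c > 0 becomes c·exp(a·x) under x = ln(R), which is convex in x for any exponent a. The following Python sketch, with illustrative names only, spells out this transformation.

```python
import math

def to_log_rate(R):
    """Eq. (42)-style change of variables: x = ln(R), valid since R > 0."""
    return math.log(R)

def from_log_rate(x):
    """Inverse map back to the rate domain: R = exp(x)."""
    return math.exp(x)

def power_term_in_x(c, a, x):
    """A generic power-type energy term c * R**a rewritten in the log-rate
    variable x; c * exp(a * x) is convex in x for any exponent a when c > 0."""
    return c * math.exp(a * x)

# Round-trip example for a 10 Mb/s per-round rate.
x = to_log_rate(10.0)
assert abs(from_log_rate(x) - 10.0) < 1e-9
```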

Furthermore, after collecting the log-rates in (42) in the following -dimensional (column) vector: