Probabilistic QoS-aware Placement of VNF chains at the Edge

06/01/2019
by Antonio Brogi, et al.

Network Function Virtualisation and Software-Defined Networking are innovating the way network services are provisioned, by introducing a level of flexibility that is key for coping with the requirements of complex traffic patterns in modern networking systems, such as the Internet of Things (IoT). Network services may be provisioned to cope with IoT traffic requirements by deploying chains of Virtual Network Functions (VNFs) over virtualised infrastructures. The availability of Edge resources may be exploited to cope with IoT traffic latency and bandwidth requirements but, on the other hand, it may introduce variations in the available infrastructure. In this paper, we present a declarative solution, EdgeUsher, to the problem of how to best place VNF chains onto Cloud-Edge infrastructures. The EdgeUsher prototype can determine all eligible placements for a set of VNF chains onto an Edge infrastructure so as to satisfy all the hardware, IoT, security, bandwidth, and latency requirements associated with the input VNF chains. It exploits probability distributions to model the dynamic behaviour of Edge infrastructures, and it correspondingly ranks the set of valid placements according to such probabilities. EdgeUsher also returns, for each placement, the set of function-to-function paths that can be exploited to route traffic flows through the deployed chain.


1 Introduction

New Edge computing [abbas2018mobile] infrastructures aim at offering suitable support for Internet-of-Things (IoT) applications, especially when applications must meet stringent QoS communication requirements (e.g., latency, bandwidth, security) or handle large amounts of data. To achieve this goal, such new distributed infrastructures rely on computing capabilities which are closer to the edge of the Internet and to where data is produced and consumed, e.g. personal devices, access-points, smart network gateways, base-stations, switches, and micro-datacentres.

At the same time, network technologies are evolving to support such a paradigm shift and to enable the delivery of customised end-to-end services [ye2019end]. Particularly, to make a more effective usage of network resources and to contain deployment costs, technologies such as Software-Defined Networking (SDN) and Network Function Virtualisation (NFV) are used more and more often in production environments [nguyen2017sdn]. On one hand, SDN permits decoupling control functions from actual data forwarding, by allowing programming traffic flows from a (logically) centralised point. On the other hand, NFV permits realising custom Virtual Network Function (VNF) chains, having each function implemented as a software application that can be installed on commodity hardware, Virtual Machines or containers. Both NFV and SDN are considered by many as a promising approach for supporting IoT applications in Edge and Fog environments (e.g., [Massonet2017, Farris2019]) and flexibly matching diverse IoT traffic requirements [Wang:IoT:2018], ranging from low-latency and deployment cost minimisation [Wang:IoT:2018, Leivadeas2019] to security mechanisms coping with threats in extremely dynamic IoT environments [Farris2019]. The concept of Service Function Chaining is also gaining interest as a paradigm allowing the flexible and dynamic provisioning of applications and network services in the IoT [Morabito:2017] and, more generally, in hybrid Cloud-Edge environments [mouradian2018application]. For instance, such technologies are expected to be key enablers for offering added-value services, like virtual reality or tactile Internet applications, over next-generation telecommunication networks [Cziva2018] and in Multi-access Edge Computing (MEC) scenarios [taleb2017multi, etsiMEC003], which include various latency-sensitive applications (e.g., video stream analysis [etsiMECscenarios]).

In this context, as highlighted in [alhussein2018joint], to fully unlock the potential of combining Edge and Cloud computing paradigms with new-generation networks, novel models and methodologies should be devised to support decision making when deploying VNF chains that enable IoT applications. The problem of placing VNF chains typically consists of jointly determining:

  • suitable VNF placements over the available infrastructure, so as to guarantee the fulfilment of all IoT (i.e., sensors and actuators), hardware and security requirements of each virtual service function, and

  • suitable routing of traffic flows from one function to the other, so as to satisfy end-to-end bandwidth and latency constraints.

Both decisions concern what is known as VNF placement [addis2015virtual], or VNF chaining and embedding [alhussein2018joint], and the corresponding problems are NP-hard [sun2016forecast]. In order to solve these problems in Edge computing scenarios, new peculiarities of Edge infrastructures should be considered that were not taken into account by previous work targeting Cloud scenarios.

First and foremost, the edge of the Internet is characterised by the presence of resource-constrained (sometimes battery-powered) and heterogeneously capable devices, which communicate via wired and wireless communication technologies. This leads to a potentially high uncertainty and to variations in the available infrastructure for what concerns both the availability of resources (e.g., due to node workload variations) and the QoS of the communication links (e.g., due to traffic variations). To the best of our knowledge, most of the existing work on VNF placement in Cloud scenarios, as well as preliminary work targeting Edge computing, solved the problem only considering static infrastructure conditions [Cziva2018].

Second, as an extension to the Cloud, Edge computing will inherit from it many security threats, while adding new, peculiar ones [Farris2019]. On one hand, the number of security enforcement points will increase by allowing local processing of private data closer to the IoT sources. On the other hand, new infrastructures will have to face brand new threats for what concerns the physical vulnerability of devices. Indeed, VNF deployments to Edge infrastructures will include accessible (Edge or IoT) devices that may be easily hacked, stolen or broken by malicious users [ni2017securing]. For what concerns security aspects, to the best of our knowledge, no previous approach has been proposed that considers them when deciding on VNF placement in Cloud-Edge scenarios [Farris2019].

In this paper, we present EdgeUsher, a simple, yet general, probabilistic declarative methodology and a (heuristic) backtracking strategy to model and solve the VNF placement problem in Cloud-Edge computing scenarios. EdgeUsher has been prototyped by means of the probabilistic logic programming language ProbLog [problog15], and we will show the prototype at work on a lifelike motivating example. Our methodology and prototype take as input:

  • a description of one (or more) VNF chain(s) along with its (their) hardware, IoT, minimum bandwidth, maximum end-to-end latency and security requirements, and

  • a (probabilistic) description of the corresponding (hardware, IoT, bandwidth, latency and security) capabilities offered by the available infrastructure.

Based on these inputs, EdgeUsher outputs a ranking of all eligible placements for the VNF chains and routing paths for the related traffic flows over the available Edge-Cloud infrastructure. The ranking considers how well a certain placement can satisfy the chain requirements as the infrastructure state (probabilistically) varies. EdgeUsher also allows easily specifying and accounting for placement constraints, such as affinity and anti-affinity requirements. Affinity consists in placing two or more functions in the same physical node, thus reducing communication costs between VNFs, while anti-affinity prevents two or more VNFs from sharing the same resources [oechsner2015flexible].

The rest of this paper is organised as follows. Section 2 describes a motivating example of deployment of a VNF chain onto an edge infrastructure for video surveillance in a university campus. In Section 3 we describe the basic model and backtracking algorithm for solving the VNF placement problem. Section 4 shows how EdgeUsher can be used to solve the placement problem in the motivating example. Related work is discussed in Section 5, while Section 6 concludes the paper with a discussion of results and insights into future work.

2 Motivating example

In this section, we describe a lifelike example to better introduce the VNF placement problem and to highlight some of the related challenges. The example will be used throughout the paper to illustrate the main steps of the proposed approach. We consider a portion of the topology of the Edge computing infrastructure deployed at UC Davis, inspired from [ning2019green] and sketched in Fig. 1. Such infrastructure is a Wireless-Optical Broadband Access Network (WOBAN), consisting of heterogeneously capable edge nodes connected via either wireless or wired communication links.

Fig. 1: Example Edge infrastructure at UC Davis [ning2019green].

We assume that available edge nodes feature either 2, 4, 8, or 16 hardware units (for the sake of readability, we only consider generic hardware units; extensions to account for other resource types are straightforward) and that they are subject to workload variations as per the distributions reported in Table I. For instance, nodes with 2 hardware units are totally free in 20% of the cases, whilst they only have 1 free hardware unit in the remaining 80%. We also assume that different node types feature different security capabilities, as reported in Table I, expressed in terms of a common vocabulary of edge computing security capabilities, as per the taxonomy of Fig. 2. Last, but not least, the most capable edge nodes (viz., the Fire and Police and the Student Centre devices) connect the edge network to a Cloud data centre through the same ISP node (not shown in the figure).

Analogously, we assume that network links have the bandwidth and latency profiles listed in Table II. For instance, on-campus wireless connections may be unavailable in 2% of the cases, and feature 70 Mbps bandwidth and 15 ms latency in the remaining 98%.

Node Type | Hardware Profile (free units) | Security Capabilities
Edge node | 20% – 2, 80% – 1 | authentication, anti-tampering, wireless security, obfuscated storage
Edge node | 20% – 4, 80% – 2 | anti-tampering, authentication, IoT data encryption, firewall, public key cryptography, wireless security, encrypted storage
Edge node | 20% – 8, 80% – 4 | access control, access logs, authentication, IoT data encryption, firewall, host IDS, public key cryptography, wireless security, encrypted storage
Edge node | 20% – 16, 80% – 8 | access control, access logs, authentication, backup, resource monitoring, IoT data encryption, firewall, host IDS, public key cryptography, wireless security, encrypted storage
ISP node | 20% – 64, 80% – 32 | all
Cloud | 99.9% – ∞ | all
TABLE I: Example node types.
Link Type | QoS Profile
wireless (edge-edge) | 98% – 70 Mbps, 15 ms; 2% – unavailable (0 Mbps)
wired (edge-edge) | 95% – 250 Mbps, 5 ms; 5% – 150 Mbps, 10 ms
wired (edge-ISP) | 80% – 1 Gbps, 10 ms; 20% – 1 Gbps, 20 ms
Internet (ISP-cloud) | 90% – 10 Gbps, 50 ms; 10% – 10 Gbps, 80 ms
TABLE II: Example link QoS profiles.

We suppose that a new smart CCTV system has been installed at the Transportation and Parking Services building, and that it continuously captures video footage and streams it to a CCTV System Driver deployed to the edge node which is in physical proximity.

A VNF chain (Fig. 3) must be deployed to support the video surveillance IoT system with a running application. Such chain, when suitably deployed, permits detecting events of interest (e.g., unauthorized access, fire, anomalous behaviour) by analysing video streams and by promptly notifying an alarm system installed at the Fire and Police station on campus. We model such a chain as a graph connecting a set of service functions, which includes:

  • a Feature Extraction service function that applies image processing techniques to isolate potentially interesting video portions, and

  • a Lightweight Analytics service function that further processes such video portions by performing object recognition, by detecting anomalies or potentially dangerous situations, and by sending appropriate notifications to the Alarm Driver deployed at the Police Station.

Fig. 2: Security capabilities for Edge computing as in [fortisecure]. The taxonomy groups capabilities into: Virtualisation (access logs, authentication, host IDS, process isolation, permission model, resource usage monitoring, restore points, user data isolation), Communications (certificates, firewall, IoT data encryption, node isolation mechanisms, network IDS, public key cryptography, wireless security), Data (backup, encrypted storage, obfuscated storage), Physical (access control, anti-tampering capabilities) and Other (audit).
Fig. 3: Example VNF chain.

To work as expected, the end-to-end latency from the CCTV System Driver to the Alarm Driver must not exceed 150 ms, as shown in Fig. 3. Additionally, for each link between two VNFs a minimum bandwidth requirement is specified, as shown in the figure. The traffic originated by the CCTV system is also collected by a WAN Optimiser service function that improves video data delivery efficiency (e.g., via compression) and forwards video data to a Storage service. Complex video analytics are then performed, with more relaxed latency constraints, by a Video Analytics service function which updates, when needed, the model used by the system to recognise potentially dangerous events. Table III lists the requirements for the deployment of each VNF in terms of hardware units, connection to IoT devices (sensors or actuators) and security policies, along with the expected processing time of each chain function. As a further (soft) requirement, VNF chain deployers at UC Davis would prefer the Video Analytics and Storage service functions to be placed on the same node (affinity) to reduce communication costs.

Service Function | Hardware Requirements | Security Requirements | Processing Time (ms)
CCTV System Driver | 1 | anti-tampering ∨ access control | 2
Feature Extraction | 3 | access control ∧ (obfuscated storage ∨ encrypted storage) | 5
Lightweight Analytics | 5 | access control ∧ host IDS ∧ (obfuscated storage ∨ encrypted storage) | 10
Alarm Driver | 0.5 | access control ∧ host IDS | 2
WAN Optimiser | 10 | public key cryptography ∧ firewall ∧ host IDS | 5
Storage | 50 | backup ∧ public key cryptography | 10
Video Analytics | 16 | resource monitoring ∧ (obfuscated storage ∨ encrypted storage) | 40
TABLE III: Example VNF requirements and processing times.

Overall, the surveillance VNF chain is conceived as a graph of connected VNFs with specific hardware, IoT, security and network QoS requirements, as well as deployers’ desiderata. Deploying the described chain to the infrastructure available at UC Davis implies solving the VNF placement problem, i.e. deciding how to map a VNF graph on top of an infrastructure substrate made of heterogeneous Edge and Cloud nodes and communication links, so that hardware, IoT, security and end-to-end network QoS requirements are all satisfied.

Furthermore, the infrastructure is a dynamic environment and we assume it is subject to node workload variations and changing network conditions as per the probability distributions (possibly obtained from historical monitoring data [forti2018mimicking]) described in this section. Such changes can indeed affect deployment performance and turn momentarily optimal solutions into bad or unfeasible ones, potentially leading to unsatisfactory application QoS, application downtime or unavailability.

As we will show in the next sections, our EdgeUsher methodology permits determining VNF placements (i.e., function mappings and flow routes) and evaluating their performance against probabilistic infrastructure variations. The motivating example described so far will be revisited in Sect. 4 to show our prototype at work. After discussing solutions to this first VNF placement, we will illustrate how the deployers at UC Davis can exploit our methodology to determine a further placement for the dashed part of the chain in Fig. 3, handling the video stream of a second CCTV system deployed at the Mann Lab and joining the first chain at the WAN Optimiser service function.

3 EdgeUsher Methodology

In this section, after a brief overview of the ProbLog language (Sect. 3.1), we describe the entire declarative implementation of EdgeUsher (the prototype and example code are also available at: https://github.com/di-unipi-socc/EdgeUsher). Particularly, we go through the model (Sect. 3.2) and the (heuristic) backtracking algorithm (Sect. 3.3) that we use for determining solutions to the VNF placement problem (viz., VNF placement(s) and routing(s) of traffic flows) in a QoS-aware manner, evaluating placements against possible device churn, node workload and traffic changes. Then, we describe the main queries (Sect. 3.4) that permit using EdgeUsher to determine eligible VNF placements.

3.1 Background: The ProbLog Language

As our model and methodology follow a declarative approach based on probabilistic reasoning about Edge infrastructure capabilities and VNF chain requirements, it was natural to prototype them by relying on probabilistic logic programming.

Probabilistic logic programming extends logic programming by enabling the representation of uncertain information. More specifically, logic allows representing relations among entities, while probability theory can model uncertainty over attributes and relations [riguzzi2018foundations]. To implement both the model and the matching strategy we used the ProbLog language [problog15], a probabilistic extension of Prolog available as a Python package.

Prolog programs are finite sets of rules of the form

a :- b1, ..., bn.

stating that a holds whenever b1, ..., bn all hold, where n ≥ 0 and a, b1, ..., bn are atomic literals. Rules with an empty condition (n = 0) are also called facts.

ProbLog programs are logic programs in which some of the facts are annotated with probabilities. A ProbLog fact, such as

p::a.

states that a holds with probability p. Non-annotated facts are assumed to hold with probability 1.

ProbLog also allows using semicolons to express OR conditions in rules. For instance

a :- b1; ...; bn.

states that a holds when at least one of b1, ..., bn holds.

Finally, annotated disjunctions, like

p1::a1; p2::a2; ...; pK::aK.

state that at most one of the facts a1, ..., aK holds, each with its associated probability (if p1 + ... + pK < 1, ProbLog assumes an implicit null choice stating that, with probability 1 - (p1 + ... + pK), none of the K options holds).

Each ProbLog program defines a probability distribution over logic programs, where a fact p::a. is considered true with probability p and false with probability 1 - p. The ProbLog engine [problog15] determines the success probability of a query q as the probability that q has a proof, given the distribution over logic programs.
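As a minimal illustration of these constructs (a toy program with made-up facts and probabilities, not part of the EdgeUsher model), consider two independently available nodes and a link that works only if both are up; the success probability of the query is the product of the two availabilities:

% Two nodes, each available with the stated probability.
0.8::available(n1).
0.9::available(n2).

% The link works only if both endpoints are available.
works(link12) :- available(n1), available(n2).

% Asking ProbLog for the success probability of works(link12) yields 0.8 * 0.9 = 0.72.
query(works(link12)).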

3.2 Model

3.2.1 VNF Chains

% chain(ChainID, ServiceFunctionIDs).
chain(ucdavis_cctv,
      [cctv_driver,feature_extr,lw_analytics,alarm_driver,wan_optimiser,storage,video_analytics]).

% service(F, TProc, HWReqs, IoTReqs, SecReqs).
service(cctv_driver, 2, 1,[video1],or(anti_tampering,access_control)).
service(feature_extr, 5, 3,[],and(access_control,or(obfuscated_storage,encrypted_storage))).
service(lw_analytics, 10, 5,[],and(access_control,and(host_IDS,or(obfuscated_storage,encrypted_storage)))).
service(alarm_driver, 2, 0.5,[alarm1],[access_control, host_IDS]).
service(wan_optimiser, 5, 10,[],[pki, firewall, host_IDS]).
service(storage, 10, 50,[],[backup, pki]).
service(video_analytics, 40, 16,[],and(resource_monitoring,or(obfuscated_storage,encrypted_storage))).

% flow(F1, F2, Bandwidth).
flow(cctv_driver, feature_extr, 15).
flow(feature_extr, lw_analytics, 8).
flow(lw_analytics, alarm_driver, 1).
flow(feature_extr, wan_optimiser, 15).
flow(wan_optimiser, storage, 10).
flow(storage, video_analytics, 10).

% maxLatency([F1, F2, ..., FN], MaxLatency).
maxLatency([cctv_driver, feature_extr, lw_analytics, alarm_driver], 150).
Fig. 4: Example chain declaration.

EdgeUsher permits easily specifying chains of virtual network service functions along with their hardware, IoT, network QoS and security requirements. Indeed, a VNF chain – identified by a ChainID and composed of a list of ServiceFunctionIDs – can be declared as

chain(ChainID, ServiceFunctionIDs).

The requirements of each service function F composing the chain can be declared as in

service(F, TProc, HWReqs, IoTReqs, SecReqs).

where TProc is the average processing time (expressed in ms) of F, HWReqs is the hardware capacity required to deploy F, IoTReqs is the list of the IoT devices required by F, and SecReqs is the security policy of F. A security policy is either a list of security properties (all of which must hold) or an AND/OR composition of security properties.

Requirements on network bandwidth can be specified by declaring (directed) flows among pairs of service functions (F1, F2) and the Bandwidth (in Mbps) they need to communicate properly:

flow(F1, F2, Bandwidth).

Finally, constraints on the maximum tolerated latency for (directed) service paths crossing the functions F1 → F2 → ... → FN can be specified as

maxLatency([F1, F2, ..., FN], MaxLatency).

As an example, an instance of the chain in Fig. 3, which relies on a single CCTV system (video1) and on a single alarm system (alarm1), can be declared as shown in Fig. 4.

3.2.2 Infrastructures

EdgeUsher permits easily specifying an infrastructure and its probabilistic dynamics. An infrastructure is modelled as a graph composed of nodes and links. A (Cloud or Edge) node identified by a certain NodeId can be declared as

node(NodeId, HWCaps, IoTCaps, SecCaps).

where HWCaps is the available hardware capacity of that node, IoTCaps is the list of IoT devices that the node directly reaches, and SecCaps is the list of the security capabilities it features.

On the other hand, a (point-to-point or end-to-end) link connecting NodeA to NodeB, available in the considered infrastructure, can be declared as follows. (EdgeUsher also permits specifying asymmetric links, for which upload and download QoS differ, e.g. xDSL or 3/4G connections.)

link(NodeA, NodeB, Latency, Bandwidth).

where Latency is the latency experienced over the link (in ms) and Bandwidth is the transmission capacity it offers (in Mbps).

EdgeUsher also permits naturally specifying probabilistic profiles of both nodes and links by exploiting ProbLog annotated disjunctions. Fig. 5 shows an example of declaring a node of the infrastructure in Fig. 1, as per the hardware profiles reported in Table I; a possible link declaration is sketched right after the figure.

% node(NodeId, HWCaps, IoTCaps, SecCaps).
0.2::node(parkingServices, 2, [video1],
     [authentication, anti_tampering, wireless_security, obfuscated_storage]);
0.8::node(parkingServices, 1, [video1],
     [authentication, anti_tampering, wireless_security, obfuscated_storage]).
Fig. 5: Example node and link declaration.
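Analogously, links can be annotated with the QoS profiles of Table II. The following sketch (with illustrative node identifiers, not taken from the released models) declares an on-campus wireless link that is simply absent in the 2% of cases in which it is unavailable, and a wired edge-edge link via an annotated disjunction; asymmetric links can be modelled by declaring two directed link facts with different QoS values:

% link(NodeA, NodeB, Latency, Bandwidth).
% Wireless edge-edge link: 70 Mbps / 15 ms in 98% of the cases, unavailable otherwise.
0.98::link(parkingServices, fireAndPolice, 15, 70).
% Wired edge-edge link: 250 Mbps / 5 ms in 95% of the cases, 150 Mbps / 10 ms otherwise.
0.95::link(fireAndPolice, studentCentre, 5, 250);
0.05::link(fireAndPolice, studentCentre, 10, 150).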

3.3 Algorithms

3.3.1 Overview

EdgeUsher relies on ProbLog's built-in backtracking mechanism to determine VNF placement(s) and traffic routing(s) for a VNF chain to be suitably deployed to an Edge infrastructure. Not only does the algorithm check that all the (hardware, IoT, security) requirements of the VNF chain will be satisfied, but it also exploits the probabilistic description of the infrastructure to predict how well the available nodes and links will support those requirements.

Fig. 6 offers an overview of the generate & test algorithm exploited by EdgeUsher (following Prolog conventions, variable terms start with an upper-case letter and constant terms with a lower-case one). The prototype first generates an eligible servicePlacement of all the Services in the VNF Chain (line 3), meeting all the hardware, IoT, and security requirements. Then, given the service Placement thus determined, it looks for an eligible flowPlacement by determining the ServiceRoutes to route all chain flows (line 4), meeting all the bandwidth and latency requirements.

1  placement(Chain, Placement, ServiceRoutes) :-
2      chain(Chain, Services),
3      servicePlacement(Services, Placement),
4      flowPlacement(Placement, ServiceRoutes).
Fig. 6: Overview of EdgeUsher backtracking.

3.3.2 Function Placement

The service function placement step non-deterministically generates all eligible service function placements. Indeed, servicePlacement(Services, Placement) takes as input the list of Services in the chain and returns an eligible Placement of them onto the available Edge infrastructure. As shown in Fig. 7, an eligible placement of a service function S to a node N in the Edge infrastructure is such that:

  • the overall hardware capabilities at N can support at least the considered service S (line 8),

  • the IoT devices directly connected to N include all those required by S (lines 9, 14),

  • the security properties featured by N satisfy the security policies declared for S (lines 10, 16–19), and

  • all other services currently placed onto N leave enough free hardware resources for S to be deployed to N as well (lines 11, 21–25).

1   servicePlacement(Services, Placement) :-
2       servicePlacement(Services, Placement, []).
3
4   servicePlacement([], [], _).
5   servicePlacement([S|Ss], [on(S,N)|P], AllocatedHW) :-
6       service(S, _, HW_Reqs, Thing_Reqs, Sec_Reqs),
7       node(N, HW_Caps, Thing_Caps, Sec_Caps),
8       HW_Reqs =< HW_Caps,
9       thingReqsOK(Thing_Reqs, Thing_Caps),
10      secReqsOK(Sec_Reqs, Sec_Caps),
11      hwReqsOK(HW_Reqs, HW_Caps, N, AllocatedHW, NewAllocatedHW),
12      servicePlacement(Ss, P, NewAllocatedHW).
13
14  thingReqsOK(T_Reqs, T_Caps) :- subset(T_Reqs, T_Caps).
15
16  secReqsOK([SR|SRs], Sec_Caps) :- subset([SR|SRs], Sec_Caps).
17  secReqsOK(and(P1,P2), Sec_Caps) :- secReqsOK(P1, Sec_Caps), secReqsOK(P2, Sec_Caps).
18  secReqsOK(or(P1,P2), Sec_Caps) :- secReqsOK(P1, Sec_Caps); secReqsOK(P2, Sec_Caps).
19  secReqsOK(P, Sec_Caps) :- atom(P), member(P, Sec_Caps).
20
21  hwReqsOK(HW_Reqs, _, N, [], [(N,HW_Reqs)]).
22  hwReqsOK(HW_Reqs, HW_Caps, N, [(N,A)|As], [(N,NewA)|As]) :-
23      HW_Reqs + A =< HW_Caps, NewA is A + HW_Reqs.
24  hwReqsOK(HW_Reqs, HW_Caps, N, [(N1,A1)|As], [(N1,A1)|NewAs]) :-
25      N \== N1, hwReqsOK(HW_Reqs, HW_Caps, N, As, NewAs).
Fig. 7: Service function placement in EdgeUsher.

The check on security policies (lines 16–19) relies on ProbLog pattern-matching. If the policy is a non-empty list of mandatory requirements, then EdgeUsher checks if they are all available at the considered node (line 16), otherwise it checks the AND-OR formula describing the policy (lines 17–19).
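For instance, the following Prolog-style goals (shown only for illustration, with made-up capability lists) check the Feature Extraction policy of Fig. 4 against the predicates of Fig. 7:

% The AND/OR policy of feature_extr succeeds on a node offering access control
% and encrypted storage (the OR branch is satisfied by encrypted_storage).
?- secReqsOK(and(access_control, or(obfuscated_storage, encrypted_storage)),
             [access_control, encrypted_storage, firewall]).
% A list policy fails as soon as one mandatory property is missing (here, host_IDS).
?- secReqsOK([access_control, host_IDS], [access_control, firewall]).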

The check (and update) on cumulative hardware usage performed by hwReqsOK(HW_Reqs, HW_Caps, N, AllocatedHW, NewAllocatedHW) (lines 21–25) accumulates, in a list of elements of the form (N,A), the amount of hardware A consumed by all services placed onto the same node N so far. Such a list is scanned (lines 24–25) whenever an update is being considered. If node N is being used for the first time in the current placement, it is simply added to the list as (N, HW_Reqs) (line 21). Otherwise, if it is already in the list, its entry is updated by summing the newly required amount of hardware resources HW_Reqs to the amount A already allocated to other services (lines 22–23). In both cases, hwReqsOK checks that the non-allocated hardware can support the considered requirement.
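To illustrate, consider a node n1 offering 8 hardware units (an illustrative value); the following Prolog-style goals show how the allocation list evolves:

% First service placed onto n1: the allocation list is initialised (line 21).
?- hwReqsOK(3, 8, n1, [], A).          % A = [(n1,3)]
% Second service onto n1: 4 + 3 =< 8, so the entry is updated (lines 22-23).
?- hwReqsOK(4, 8, n1, [(n1,3)], A).    % A = [(n1,7)]
% Third service onto n1: 3 + 7 > 8, so the goal fails and backtracking occurs.
?- hwReqsOK(3, 8, n1, [(n1,7)], A).    % fails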

The time complexity of servicePlacement is clearly O(N^S), where N is the number of nodes in the available infrastructure and S is the number of services in the input service chain. Indeed, in the worst-case scenario, each service can be placed onto any node of the infrastructure.

3.3.3 Flow Placement

Given a Placement output by the service placement step, flowPlacement(Placement, ServiceRoutes) determines eligible routes for the chain traffic flows among service functions, as shown in Fig. 8. It first checks bandwidth requirements (line 3) and, afterwards, latency requirements (lines 4–5).

First, a routing satisfying bandwidth constraints is determined by predicate flowPlacement (line 3) which holds if:

  • the services S1 and S2 in between which a flow is established have been placed onto the same node N (lines 8–10), or

  • the services S1 and S2 in between which the flow is established have been placed onto different nodes N1 and N2 and there exists a path in between those nodes that supports the bandwidth requirement of the flow (lines 11–15).

1   flowPlacement(Placement, ServiceRoutes) :-
2       findall(flow(S1, S2, Br), flow(S1, S2, Br), ServiceFlows),
3       flowPlacement(ServiceFlows, Placement, [], ServiceRoutes, [], S2S_latencies),
4       maxLatency(LChain, RequiredLatency),
5       latencyOK(LChain, RequiredLatency, S2S_latencies).
6
7   flowPlacement([], _, SRs, SRs, Lats, Lats).
8   flowPlacement([flow(S1, S2, _)|SFs], P, SRs, NewSRs, Lats, NewLats) :-
9       subset([on(S1,N), on(S2,N)], P),
10      flowPlacement(SFs, P, SRs, NewSRs, [(S1,S2,0)|Lats], NewLats).
11  flowPlacement([flow(S1, S2, Br)|SFs], P, SRs, NewSRs, Lats, NewLats) :-
12      subset([on(S1,N1), on(S2,N2)], P), N1 \== N2,
13      path(N1, N2, 2, [], Path, 0, Lat),
14      update(Path, Br, S1, S2, SRs, SR2s),
15      flowPlacement(SFs, P, SR2s, NewSRs, [(S1,S2,Lat)|Lats], NewLats).
16
17  path(N1, N2, Radius, Path, [(N1, N2, Bf)|Path], Lat, NewLat) :-
18      Radius > 0, link(N1, N2, Lf, Bf), NewLat is Lat + Lf.
19  path(N1, N2, Radius, Path, NewPath, Lat, NewLat) :-
20      Radius > 0, link(N1, N12, Lf, Bf), N12 \== N2, \+ member((N12,_,_,_), Path),
21      NewRadius is Radius-1, Lat2 is Lat + Lf,
22      path(N12, N2, NewRadius, [(N1, N12, Bf)|Path], NewPath, Lat2, NewLat).
23
24  update([],_,_,_,SRs,SRs).
25  update([(N1, N2, Bf)|Path], Br, S1, S2, SRs, NewSRs) :-
26      updateOne((N1, N2, Bf), Br, S1, S2, SRs, SR2s),
27      update(Path, Br, S1, S2, SR2s, NewSRs).
28
29  updateOne((N1, N2, Bf), Br, S1, S2, [], [(N1, N2, Br, [(S1,S2)])]) :-
30      Br =< Bf.
31  updateOne((N1, N2, Bf), Br, S1, S2, [(N1, N2, Bass, S2Ss)|SR], [(N1, N2, NewBa, [(S1,S2)|S2Ss])|SR]) :-
32      Br =< Bf-Bass, NewBa is Br+Bass.
33  updateOne((N1, N2, Bf), Br, S1, S2, [(X, Y, Bass, S2Ss)|SR], [(X, Y, Bass, S2Ss)|NewSR]) :-
34      (N1 \== X; N2 \== Y),
35      updateOne((N1, N2, Bf), Br, S1, S2, SR, NewSR).
36
37  latencyOK(LChain, RequiredLatency, S2S_latencies) :-
38      chainLatency(LChain, S2S_latencies, 0, ChainLatency),
39      ChainLatency =< RequiredLatency.
40
41  chainLatency([S], _, Latency, NewLatency) :-
42      service(S, S_Service_Time, _, _, _),
43      NewLatency is Latency + S_Service_Time.
44  chainLatency([S1,S2|LChain], S2S_latencies, Latency, NewLatency) :-
45      member((S1,S2,Lf), S2S_latencies),
46      service(S1, S1_Service_Time, _, _, _),
47      Latency2 is Latency+S1_Service_Time+Lf,
48      chainLatency([S2|LChain], S2S_latencies, Latency2, NewLatency).
Fig. 8: Flow routing in EdgeUsher.

The path(N1, N2, Radius, PathSoFar, Path, Lat, NewLat) predicate determines an acyclic path of length at most Radius (i.e., a maximum number of hops) between N1 and N2, featuring overall latency NewLat; it is called at line 13. A path is either a direct infrastructure link between N1 and N2 (lines 17–18), or a route of links that connect them (lines 19–22).

After a path is found, update checks whether the bandwidth requirement of the considered flow can be supported by such a path. Similarly to hardware allocation, a list of elements of the form (N1, N2, B, [(S1,S2), ...]) is maintained to keep track of the bandwidth B allocated on the link from N1 to N2 and of the service-to-service flows mapped onto it, so as to check that the flows mapped onto the same link do not exceed its capacity. Particularly, updateOne scans the list of links along a path and checks such requirements by accumulating the bandwidth consumed by all flows routed onto the same link.
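To illustrate, the following Prolog-style goals (with made-up node, function and bandwidth identifiers) show how updateOne behaves on a link from n1 to n2 offering 70 Mbps:

% No flow routed onto (n1,n2) yet: 15 =< 70, so the flow is recorded (lines 29-30).
?- updateOne((n1,n2,70), 15, f1, f2, [], SRs).
%    SRs = [(n1, n2, 15, [(f1,f2)])]
% A second 10 Mbps flow on the same link: 10 =< 70 - 15, allocation becomes 25 (lines 31-32).
?- updateOne((n1,n2,70), 10, f2, f3, [(n1,n2,15,[(f1,f2)])], SRs).
%    SRs = [(n1, n2, 25, [(f2,f3),(f1,f2)])]
% A third 60 Mbps flow exceeds the residual 45 Mbps: the goal fails, triggering backtracking.
?- updateOne((n1,n2,70), 60, f3, f4, [(n1,n2,25,[(f2,f3),(f1,f2)])], SRs).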

Finally, latencyOK holds if the chain latency – computed by chainLatency (lines 41–48), which sums the service times of the traversed functions with the latencies of the chosen paths – is less than or equal to the one required by the specified maxLatency constraint.
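For instance, for the first two functions of the example chain of Fig. 4 and an (illustrative) 15 ms path latency between their hosting nodes, the accumulated chain latency amounts to the processing times of the two functions plus the path latency:

?- chainLatency([cctv_driver, feature_extr], [(cctv_driver, feature_extr, 15)], 0, L).
%    L = 22, i.e. 2 ms (cctv_driver) + 15 ms (path) + 5 ms (feature_extr)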

The worst-case time complexity of this step is O(d^2) per flow, where d is the average out-degree of nodes in the infrastructure and 2 is the maximum path length (Radius), and in any case it is bounded by O(N^2) (i.e., by the upper bound on the number of direct links in the infrastructure).

3.3.4 Heuristics

As per the considerations made above, the algorithmic time complexity of the approach described so far is exponential. Indeed, the combination of the service function placement step with the flow placement step results in a worst-case time complexity of O(N^(S+2F)), with F the number of flows in the chain, assuming d in O(N). Naturally, such complexity becomes unbearable for very large infrastructures and for long service chains. Hence, we extended the prototype with a heuristic based on the probabilistic modelling we gave for the infrastructure.

EdgeUsher allows users to specify two threshold values that are used to prune the search space whenever the probability of satisfying the chain hardware or QoS requirements, respectively, falls below them. Such pruning is implemented via the ProbLog subquerying system, which is used to evaluate the probabilities of the servicePlacement (Fig. 6, line 3) and path (Fig. 8, line 13) goals during the search for eligible placements, and to check them against the user-specified thresholds.
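A minimal sketch of how such pruning can be encoded is given below; it assumes ProbLog's subquery(Goal, Prob) primitive for evaluating the probability of a (ground) goal during the search, and it prunes at the level of whole placements and routings rather than of individual path goals, so the released implementation may differ in detail:

% placement/5: as placement/3 (Fig. 6), but discarding low-probability partial solutions.
placement(Chain, Placement, ServiceRoutes, ThrHW, ThrQoS) :-
    chain(Chain, Services),
    servicePlacement(Services, Placement),
    subquery(servicePlacement(Services, Placement), PHw),    % probability the placement fits the (varying) hardware
    PHw > ThrHW,
    flowPlacement(Placement, ServiceRoutes),
    subquery(flowPlacement(Placement, ServiceRoutes), PQoS), % probability the routing meets the (varying) QoS
    PQoS > ThrQoS.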

3.4 Queries

After specifying an input chain and an Edge infrastructure, EdgeUsher can be used to determine all eligible service function placements and flow routings by simply issuing the query:

query(placement(Chain, Placement, Routes)).

Output results will be of the form:

placement(chainId, [on(f1,n1), on(f2,n2), ..., on(fk,nk)], [(n1, n2, usedBw, [(f1, f2), ...]), ...]).

where the on(F,N) constructor associates a service function F to its deployment node N, whilst each (n1, n2, usedBw, [(f1, f2), ...]) keeps track of the bandwidth consumption and of all the service-to-service flows mapped onto the link between nodes n1 and n2.

It is worth noting that EdgeUsher allows users to easily specify function affinity or anti-affinity requirements. In the first case, the user can force the mapping of two (or more) functions to the same node, as for instance in the query

query(placement(Chain, [on(F1,N1), on(F2,N2), on(F3,N2)], Routes)).

stating that F2 and F3 must be mapped onto the same node N2. Analogously, anti-affinity constraints can be specified by queries of the form:

query(placement(Chain, [on(F1,N1), on(F2,N2), on(F3,N3)], Routes), N2 \== N3).

stating that F2 and F3 must be mapped onto two different nodes N2 and N3.

Additionally, users can specify partial deployments and/or routes, and use EdgeUsher to complete them. This is useful to quickly determine on-demand re-configurations of a chain in case of infrastructure failures or malfunctioning (e.g., the crash of a node currently supporting a service function). Users can also run EdgeUsher over complete deployments and/or routes – e.g. among those already enacted or obtained via other tools – so as to instantaneously assess them against varying infrastructure conditions.
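For instance, a partial placement of the example chain of Fig. 4 can be completed by leaving part of the placement list unbound; the following hypothetical query pins only the CCTV System Driver to the Parking Services node of Fig. 5:

% Complete a partial placement: the first function is already bound to its node,
% the rest of the placement and the routing are left to EdgeUsher.
query(placement(ucdavis_cctv, [on(cctv_driver, parkingServices)|RestOfPlacement], Routes)).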

All the described functionalities also work with the heuristic version of the prototype, which can be queried as placement(Chain, Placement, Routes, ThrHW, ThrQoS). by specifying two threshold values, ThrHW and ThrQoS, that are used to cut the search space whenever the probability of satisfying the chain hardware or QoS requirements, respectively, falls below them.

4 EdgeUsher at Work

Fig. 9: Best placement for the first chain.
Fig. 12: Alternative placements (a) and (b) for the second chain.

In this section, we illustrate our prototype at work over the motivating example of Sect. 2, whilst discussing the prototype performance and the effectiveness of the proposed heuristics. The experiments were run on a commodity laptop provided with an Intel Core i5-6200U CPU (2.30GHz) and 8GB of RAM, running Ubuntu 18.04.2 LTS, ProbLog 2.1.0.36 and Python 3.6.

We started by looking for eligible placements of the VNF chain supporting the CCTV system installed at the Parking Services building. For the purpose of the experiments, we first ran the non-heuristic EdgeUsher over three different descriptions of the infrastructure:

  • a static description that only considers the most probable value of each probability distribution (for both node hardware and link QoS profiles), taken as certain,

  • a partial probabilistic description of the infrastructure that retains only the highest-probability alternative of each distribution (together with its probability), and

  • a complete probabilistic description of the infrastructure that includes all probability distributions (the three variants are sketched, for one node, right after this list).
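As an illustration, the three variants could encode the Parking Services node of Fig. 5 as follows (our reconstruction of the three inputs, shown only to clarify the distinction; the released models may differ):

% Complete probabilistic description: the full annotated disjunction of Fig. 5.
0.2::node(parkingServices, 2, [video1], [authentication, anti_tampering, wireless_security, obfuscated_storage]);
0.8::node(parkingServices, 1, [video1], [authentication, anti_tampering, wireless_security, obfuscated_storage]).

% Partial description: only the highest-probability alternative is kept.
0.8::node(parkingServices, 1, [video1], [authentication, anti_tampering, wireless_security, obfuscated_storage]).

% Static description: the most probable value, taken as a certain fact.
node(parkingServices, 1, [video1], [authentication, anti_tampering, wireless_security, obfuscated_storage]).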

Table IV shows the obtained results in terms of the number of generated eligible placements and of the computation time needed to obtain them (timings were obtained by averaging results over repeated executions for each case). It is worth noting that the prototype runs fairly fast on the static and on the partial infrastructure descriptions, which do not suffer from the additional combinatorial complexity that ProbLog incurs when evaluating (the probability distributions expressed as) annotated disjunctions in the complete infrastructure description. Indeed, the non-heuristic prototype can also be run in traditional Prolog environments to determine eligible placements under static infrastructure conditions; in this case, a first placement solution is returned instantaneously. We will use the results obtained by the non-heuristic prototype over the complete probabilistic description of the Edge infrastructure at UC Davis as a baseline to evaluate the performance of the heuristic prototype.

#Placements Time
Static 102 8 s
Partial 102 8 s
Complete 4296 2:48 h
TABLE IV: Results with EdgeUsher without heuristics.
#Placements Time
T=0.8 6 47 s
T=0.7 56 56 s
T=0.6 102 1:07 min
T=0.5 102 1:07 min
T=0.4 102 1:13 min
T=0.3 102 1:22 min
T=0.2 102 1:37 min
T=0.1 1056 11:17 min
TABLE V: Results with EdgeUsher with heuristics.

We then ran the heuristic EdgeUsher over the complete description of the Edge infrastructure at UC Davis. For the sake of simplicity, we set both thresholds (i.e., the one on hardware requirements ThrHW and the one on network QoS ThrQoS) to the same value T, which was varied during the experiments in the range 0.1–0.8 with a step of 0.1. Table V shows the obtained results in terms of the number of generated eligible placements and of the execution times needed to obtain them.

The results show that the employed heuristics considerably reduce the search space and, thus, the execution time needed to determine eligible VNF placements and routings. Particularly, the full probabilistic description of the UC Davis infrastructure can be handled with a speed-up between 15 and 214 times with respect to the exhaustive prototype, while still returning a subset of the optimal results. For instance, with thresholds set to 0.8, EdgeUsher determines 6 eligible placements for the VNF chain supporting the CCTV system, which cluster into two groups (of four and two solutions, respectively) according to their probability of meeting all the set requirements. All output solutions are among the best solutions also generated by the non-heuristic prototype when run over the complete infrastructure description.

We then included an affinity constraint between the Storage and the Video Analytics service functions, and ran the prototype again with both thresholds set to 0.8. As a result, the execution time halved with respect to the execution without such constraint, reaching around 24 seconds. Fig. 9 shows one of the best placements obtained when enforcing the affinity constraint, i.e. one featuring the highest probability of complying with all hardware, IoT, security, bandwidth and end-to-end latency requirements of the input chain. In this case, both the Storage and the Video Analytics functions are deployed to the available Cloud node. The highlighted links – along with their labels – show the routing path associated with the placement. Such information could actually be used to instruct the network (e.g., via SDN controllers) so as to allocate a suitable amount of bandwidth to each flow.

Afterwards, assuming the first chain was deployed as in Fig. 9, EdgeUsher was exploited to check whether it was possible to extend the deployment by placing anew the dashed part of the chain for a second CCTV system installed at the Mann Lab. By querying the heuristic prototype again, seven new possible VNF placements were obtained within a few seconds. All output solutions featured the same probability of meeting all chain constraints. Four of such solutions placed services as sketched in Fig. 12 (a), the remaining three as in Fig. 12 (b); other routings – which are not shown – were also possible. It is worth noting that the deployers might consider using one of the output solutions and keep some of the others as possible backups, to guarantee chain functioning in case of device failures, overloading, or network congestion.

5 Related Work

Several works are investigating the potential benefits of adopting SDN and NFV technologies in the IoT. An SDN and NFV architecture for IoT network and application management is proposed in [ojo2016sdn]. Morabito and Beijar [Morabito:2017] propose an architecture and a prototype implementation of an NFV/SDN framework enabling automated and dynamic network service chaining across Edge (i.e., IoT gateway) and Cloud (i.e., central data centre) domains. SDN and NFV are jointly used in [fi9010008] to assure service continuity of a video monitoring application deployed over a flying ad-hoc network (FANET) built on a fleet of drones flying over rural areas. Drones act as NFVI Points of Presence that can host Virtual Network or Application Functions.

The problem of placing VNFs on a physical substrate to realise service chains in a hybrid Edge/Cloud infrastructure supporting IoT applications has only recently emerged. Previous work focused on network service placement in VNF infrastructures, considering intra- and/or inter-DC networks (e.g., [pham2017traffic] and [LUIZELLI2017]). A survey on resource allocation strategies for the deployment of network services in VNF infrastructures is provided in [VNF_Herrera].

Although converged approaches are emerging for managing NFV, Edge and Fog computing services [commag_conv2017], traditional VNF placement approaches do not tackle the challenges brought by Fog and Edge computing for IoT applications. These challenges include the heterogeneity of computing nodes, dynamic changes of network and node conditions that may turn optimal or quasi-optimal solutions into unfeasible ones, and security threats, just to mention the main ones. Recent work in the area of application placement in the Fog has partially begun to tackle these aspects, but open research problems still exist, such as placement approaches accounting for security aspects and dynamic infrastructure variations, as discussed in the review by Brogi et al. [DBLP:journals/corr/abs-1901-05717].

Only few works have addressed the problem of placing VNFs in a hybrid environment made of Edge and Cloud computing nodes. Leivadeas et al. [Leivadeas2019] model the problem of SFC placement in hybrid MEC and Cloud environments, considering location requirements posed by VNFs and targeting the minimisation of deployment costs and delays. They propose a Mixed Integer Programming (MIP) formulation of the problem and a sub-optimal approach based on the Tabu Search meta-heuristic.

SFC placement in IoT scenarios, which demand low-latency response, high-throughput processing and cost-effective resource usage, is tackled in [Wang:IoT:2018]. The work proposes a linear programming model and an approximation optimisation algorithm to achieve deadline and packet rate guarantees while avoiding resource idleness. However, SFC orchestration is done within the Cloud domain and the availability of computing resources at the edge is not considered.

Yala et al. [Yala2018] propose a VNF placement algorithm that optimises access latency and service availability in a mixed Edge and Cloud environment for ultra-Reliable Low-Latency Communications (uRLLC) services. The multi-objective optimisation problem is solved by exploiting a Genetic Algorithm metaheuristic, whose performance is compared against an exact algorithm implemented in CPLEX. Although a network service is defined as a set of VNFs, chaining constraints are not considered.

Mouradian et al. [mouradian2018application] tackle application component placement in NFV-based hybrid Cloud-Edge systems and propose an ILP formulation that represents applications as non-deterministic VNF Forwarding Graphs. Graphs can be built using sequence, parallel, selection and loop substructures and probabilities are used to model selection and loop iterations.

Although all the above-mentioned works [Leivadeas2019, Wang:IoT:2018, Yala2018, mouradian2018application] consider latency requirements (either as a minimisation objective or as a constraint), none of them accounts for dynamic variations of the network status, which can instead influence the extent to which QoS requirements are satisfied in the long run. Nor are security aspects taken into account.

Typically, VNF placement approaches that consider dynamic network conditions either recompute the placement and enforce scaling and/or migration actions [Cziva2018], or try to find a solution that is robust against network status variations [Cheng_JSAC2018]. Cziva et al. [Cziva2018] formulate the problem of edge VNF placement as an ILP to derive latency-optimal deployments of VNFs, without considering chaining constraints. They also define a dynamic scheduler that recomputes the placement to account for latency variations on links. This scheduling problem, which consists in selecting the time for placement recalculation so that unnecessary VNF migrations are prevented and latency violations are bounded, is solved using optimal stopping theory. In [Cheng_JSAC2018], network dynamics are taken into account to find temporally robust placement solutions. The SFC placement is formulated as a Stochastic Resource Allocation problem that exploits both currently observed network information and future variations. However, the work does not tackle latency-aware placement and, differently from ours, its network model does not represent variations of either network latency or node resources. Zhu and Huang [EdgePlace2018] formulate a stochastic programming problem that minimises the placement cost and aims at achieving highly available application deployments. The problem formulation thus accounts for probabilities of VM, host and link failures, but does not consider latency constraints.

As analysed in [Farris2019], IoT environments introduce challenging security threats, ranging from attacks on IoT devices and attacks in IoT-oriented clouds and networks, to threats at the application layer, such as software vulnerabilities, data leakage and phishing. Risks exist in executing VNFs over third-party infrastructures, and security and trust criteria have to inform placement decisions [Farris2019]. Indeed, the need to consider security issues in virtual network embedding (VNE) and VNF placement problems is gaining increasing interest. A classification of security requirements into node, link and topological requirements to be considered in virtual network embedding problems is provided in [Fischer:2017]. In [dwiardhika2019virtual], the virtual network embedding problem is formulated so as to account for the standard protection provided by substrate nodes and links (quantitatively referred to as "security level"). If the level of security is lower than the security demand, the VNE algorithm tries to place security VNFs (e.g., firewall, deep packet inspection, and intrusion detection) to improve the offered security level. The optimal placement of security SFCs is tackled in [sendi2018], where the placement problem is formulated including deployment constraints derived from network security patterns.

Table VI provides a comparative overview of the discussed related work and highlights how, to the best of our knowledge, ours is the first work addressing VNF chain placement in a hybrid Edge/Cloud network with latency constraints, while accounting for network status variations and security requirements. Moreover, while most related work relies on a linear programming formulation, we adopt a probabilistic declarative approach. Finally, our prototype is released as open-source software and the experiment data is also made publicly available.

TABLE VI: Related work overview, comparing [Leivadeas2019], [Wang:IoT:2018], [Yala2018], [mouradian2018application], [Cziva2018], [Cheng_JSAC2018], [EdgePlace2018], [dwiardhika2019virtual], [sendi2018] and our work along the following dimensions: dynamicity, security, end-to-end latency, declarativity, availability of a prototype, Edge-Cloud support, and support for IoT devices.

6 Conclusions and Future Work

In this paper we have proposed a probabilistic logic programming approach for solving the problem of placing VNF chains onto Edge-Cloud infrastructures. EdgeUsher returns the eligible deployments of VNF chains on a hybrid edge/cloud infrastructure that guarantee the fulfillment of a set of placement requirements, namely: hardware, IoT reachability, bandwidth, latency and security policies.

By leveraging ProbLog, EdgeUsher supports a declarative methodology for specifying and solving such a QoS-aware VNF chain placement problem, enhanced with the capability of accounting for infrastructure variations, provided that probability distributions describing the status of computing nodes and links are available. Thanks to the declarative approach, additional constraints, such as affinity, anti-affinity or placement onto a specific node, can be easily expressed. Moreover, the EdgeUsher implementation has been fully provided in this paper (around 80 lines of code, shown in Figs. 6, 7 and 8) together with some examples of usage.

The prototype returns all feasible deployments, ranked according to the computed probability that each solution satisfies the chain requirements under infrastructure variations. In addition to the exhaustive search approach, we proposed a heuristic, based on the probabilistic modelling, that prunes solutions not satisfying user-defined thresholds on hardware and QoS requirements. It is worth noticing that EdgeUsher could also be used to evaluate deployments computed by alternative placement algorithms with respect to infrastructure variability, by calculating their probability of satisfying hardware and QoS requirements.

In this work, we discussed the use of the tool by taking a lifelike scenario as an example. This allowed us to exemplify the usage of the prototype and to discuss experimental results with reference to such a scenario.

For larger scenarios, we envision a hierarchical architecture of clusters of edge nodes (partitioned, e.g., according to administrative, application and/or geographical criteria), each running orchestration features on a head node and connected to a few Cloud nodes. We intend to elaborate further on this vision by running EdgeUsher over a domain made of a few clusters of edge nodes and their associated Cloud nodes.

Future work will also be devoted to improving the evaluation of the proposed approach through extensive testing in a simulation environment. The set-up of a small-scale testbed is under way in our campus network, and it will be used to perform tests using probability distributions derived from real monitoring data.

EdgeUsher also allows specifying security requirements in terms of logical expressions over security properties. It is worth noticing that some security properties can be provided exclusively as hardware capabilities (e.g., anti-tampering), while others could also be implemented in software and deployed as VNFs (e.g., firewall). This opens up the possibility of adaptively inserting the required security VNFs when needed, which we plan to investigate in the near future.

References