One-Hop Out-of-Band Control Planes for Low-Power Multi-Hop Wireless Networks

12/17/2017 · by Chaojie Gu, et al.

Separation of control and data planes (SCDP) is a desirable paradigm for low-power multi-hop wireless networks requiring high network performance and manageability. Existing SCDP networks generally adopt an in-band control plane scheme, in which the control-plane messages are delivered by their data-plane networks. The physical coupling of the two planes may lead to undesirable consequences. To advance the network architecture design, we propose to leverage the long-range communication capability of the increasingly available low-power wide-area network (LPWAN) radios to form one-hop out-of-band control planes. We choose LoRaWAN, an open, inexpensive, ISM-band LPWAN radio, to prototype our out-of-band control plane called LoRaCP. Several characteristics of LoRaWAN, such as downlink-uplink asymmetry and primitive ALOHA media access control (MAC), present challenges to achieving reliability and efficiency. To address these challenges, we design a TDMA-based multi-channel MAC featuring an urgent channel and negative acknowledgment. On a testbed of 16 nodes, we demonstrate applying LoRaCP to physically separate the control-plane network of the Collection Tree Protocol (CTP) from its ZigBee-based data-plane network. Extensive experiments show that LoRaCP increases CTP's packet delivery ratio from 65% to 80% in the presence of external interference, while consuming a per-node average radio power of only 2.97mW.

I Introduction

Billions of smart objects will be deployed, forming networks of things interconnected by the Internet of Things (IoT). Many of these networks will follow the multi-hop wireless paradigm. For instance, wireless meshes are increasingly adopted to interconnect surveillance cameras [1] and vehicles [2]. Wireless sensors have been widely deployed for sensing and control of building environments and energy use. Bluetooth low energy (BLE) will soon support mesh networking [3]. Wireless connectivity is also critical to the vision of Industry 4.0: utility and manufacturing systems are increasingly adopting wireless metering and monitoring [4].

The main advantage of low-power multi-hop wireless networks (LMWNs) is that, during the deployment phase, a network can easily scale up to cover a large geographic area. A primary design principle for LMWNs is the use of distributed protocols (e.g., routing [5]), where each node independently performs various networking functions (e.g., data forwarding) based on local information. Thus, the control plane (i.e., determination of how to handle packets) and the data plane (i.e., carrying out control-plane decisions) of these distributed protocols are jointly implemented at each network node. However, as is well understood, a distributed scheme without a global view often yields suboptimal performance. Moreover, although the distributed scheme may work satisfactorily most of the time thanks to a decade of research, it is often complex, inelastic to change, and difficult to manage once the network is deployed.

To improve the network performance and manageability, some LMWNs, especially those deployed for mission-critical tasks, have adopted centralized network controls. For instance, WirelessHART, an LMWN standard that has been adopted in over 8,000 manufacturing systems [4], prescribes centralized routing control based on a global view of the network to better achieve certain performance objectives (e.g., firm/soft real-time packet delivery). Similarly, ISA100.11a, another industry-oriented LMWN standard, also adopts centralized routing control and network management. For the routing in these LMWNs, the control plane is separated from the data plane, in that the routing control is implemented at a centralized node whereas other network nodes follow the routing schedule to forward data packets. However, all these LMWNs adopt in-band control planes, i.e., the control-plane messages such as network status reports and routing schedules are delivered by the data-plane networks.

The physical coupling between the control and data planes in the in-band scheme may lead to undesirable consequences. The wireless data-plane network is susceptible to external interference. Deteriorated data-plane links may lead to delayed deliveries or even losses of the control-plane messages, making the network less responsive to data-plane link quality variations. Moreover, when the data plane loses key routing nodes (e.g., due to node hardware/software faults or battery depletion) or the control plane makes wrong control decisions (e.g., due to design defects or erroneous human operations), the data-plane network may fall apart into disconnected partitions. As a result, restorative network control commands in the control plane may not be able to reach the destination nodes. Recent research has studied protecting the control plane from data-plane faults [6]. However, the solution protects against only a single link failure [6].

In light of the in-band scheme’s pitfalls, we study an out-of-band scheme, where the control plane uses a dedicated network different from the data-plane network. The increasing availability of multiple network interfaces on IoT hardware platforms favors the implementation and adoption of the out-of-band scheme. The latest IoT platforms are generally equipped with multiple heterogeneous network interfaces: Raspberry Pi 3 supports Ethernet, Wi-Fi, and BLE; Firestorm [7] supports BLE and ZigBee; Arduino has various add-on boards to support different radios. For the out-of-band control-plane network of an LMWN, most high-speed radios (e.g., Wi-Fi and LTE) are unsuitable due to their high power consumption. ZigBee and the coming BLE mesh are also ill-suited, since the control-plane network would otherwise be yet another LMWN that suffers the same manageability and fragility issues as the data-plane network. Instead, we propose to use the emerging low-power wide-area network (LPWAN) technologies (e.g., LoRaWAN, SigFox, Weightless-P, and NB-IoT) for the out-of-band control plane. Owing to the kilometer-scale communication range of LPWAN links, the LPWAN-based control plane can be a one-hop star network, greatly simplifying its deployment and management.

As the first study, to our best knowledge, on the feasibility of LMWN out-of-band control planes, we choose LoRaWAN to prototype our system and gain insights. This choice is due to its use of the license-free ISM band, its open data link standard, its low cost (US$15 per unit [8]), and its good scalability to support many IoT objects. In contrast, other LPWAN technologies are proprietary (SigFox and NB-IoT) or not widely available (Weightless-P). While low-power long-range communication is the key advantage of LoRaWAN, we need to manage the following two limiting characteristics. First, a LoRaWAN downlink frame from the controller to a network node must be in response to a precedent uplink frame. Thus, the transmission of a network control command initiated by the controller may be postponed until the network node's next status report. Second, LoRaWAN supports uplink concurrency but no downlink concurrency. This downlink-uplink asymmetry impedes acknowledging each uplink frame, whereas the control plane generally desires reliable message delivery. In addition, a reliable media access control (MAC) approach is needed to replace LoRaWAN's ALOHA MAC, which may perform unsatisfactorily under traffic surges.

To address these issues, this paper presents the design and implementation of a prototype system called LoRaCP (long-range control plane). Based on our extensive measurements of LoRaWAN's energy and latency profiles, we design LoRaCP-MAC, a TDMA-based multi-channel MAC protocol featuring uplink heartbeats, negative acknowledgment (NAK), and an ALOHA-based urgent channel, to manage the transmissions of the control-plane messages. The uplink heartbeats open downlink windows for controller-initiated network commands and maintain the network nodes' clock synchronization for TDMA. With NAK, the controller need not acknowledge every uplink frame. The urgent channel complements the TDMA channels to mitigate the rigidness of TDMA. On a testbed of 16 nodes, we demonstrate applying LoRaCP to physically separate the control plane of the Collection Tree Protocol (CTP) [5] from its ZigBee-based data-plane network. Extensive experiments show that LoRaCP increases CTP's packet delivery ratio from 65% to 80% in the presence of external interference, while consuming a per-node average radio power of only 2.97mW, much lower than the active power of many recent LMWN platforms' microcontrollers (e.g., the microcontroller on Firestorm [7]).

The rest of the paper is organized as follows. §II reviews related work. §III presents examples to motivate the out-of-band scheme. §IV profiles LoRaWAN performance. §V and §VI design and evaluate LoRaCP, respectively. §VII concludes.

II Related Work

Existing studies that exploit multiple network interfaces can be broadly divided into two classes: bandwidth aggregation and separation of control and data planes (SCDP).

Bandwidth aggregation uses multiple network interfaces to transmit/receive data simultaneously to increase throughput. Habak et al. [9] surveyed the early bandwidth aggregation literature. Recent developments, briefly reviewed here, fall into two categories. The first category exploits homogeneous radios. FatVAP [10] enables an 802.11 wireless card to connect to multiple access points. FastForward [11] uses two 802.15.4 radios operating on different channels, with one receiving and the other forwarding data simultaneously. The second category exploits heterogeneous radios. MultiNets [12] deals with switching between multiple network interfaces on mobile devices. In [13], Mu et al. optimize the selection of radios and their transmission powers. Recent studies [14, 15] characterize the performance and energy consumption of Multipath TCP through multiple radios of a mobile device. Different from bandwidth aggregation, which combines multiple network interfaces in the data plane to increase throughput, SCDP aims to improve network optimality and manageability.

Software-defined networking (SDN), with SCDP as its core concept, is gaining momentum in data-intensive networks. SCDP can be naturally applied in WLANs and cellular networks, as their topologically centralized access points and base stations can run the control-plane logic for better resource allocation and mobile node handover [16]. However, there is limited research on SCDP in multi-hop wireless networks. An OpenFlow-enabled Wi-Fi mesh was built in [17], where each Wi-Fi card is split into two virtual interfaces with different SSIDs and the two planes are two multi-hop networks in their respective SSIDs. To the best of our knowledge, WASP [18] is the only system that implements an out-of-band control plane for multi-hop wireless networks. WASP uses the Wi-Fi Direct and LTE interfaces of smartphones to form the data and control planes, respectively. Different from WASP, we focus on low-power networks with a limited energy budget.

III Motivation

This section discusses the motivation of the out-of-band scheme. §III-A presents a simulation study to show the network performance gain by centralized network control. In §III-B, we discuss the challenges faced by the in-band scheme.

III-A Distributed vs. Centralized Network Control

In this section, we compare through simulations the network performance achieved by the Collection Tree Protocol (CTP) [5] and its centralized variant that we call CTP-SCDP. In §VI, we will use LoRaCP to implement CTP-SCDP and evaluate it on a testbed. In this paper, we use CTP as our case study network protocol, because it has an open implementation and is a standard component of the industry-class TinyOS Production operating system [19]. The results based on CTP will help understand the performance improvement by SCDP and showcase the use of LoRaCP to physically separate the control and data planes. We believe the understanding and LoRaCP are also applicable to many other LMWN protocols.

CTP aims to maintain a minimum-cost routing tree in the presence of dynamic link quality characterized by the expected transmission count (ETX). The cost of a route to the tree root is the sum of the ETXs of the links on the route. A node $n_i$ estimates the route cost using the residual ETX (RETX), which is given by $\mathrm{RETX}_i = \mathrm{ETX}_{i,p_i} + \mathrm{RETX}_{p_i}$, where $\mathrm{ETX}_{i,p_i}$ is the ETX of the link between node $n_i$ and its parent node $p_i$, and $\mathrm{RETX}_{p_i}$ is node $p_i$'s RETX. CTP works in a distributed manner, in that each node selects its parent from the set of its neighbor nodes based on local information only. Specifically, $p_i = \arg\min_{n_j \in \mathcal{N}_i} \left( \mathrm{ETX}_{i,j} + \mathrm{RETX}_j \right)$, where $\mathcal{N}_i$ is node $n_i$'s neighbor set, $\mathrm{ETX}_{i,j}$ is estimated based on the transmissions of beacons and data frames, and $\mathrm{RETX}_j$ is broadcast in node $n_j$'s beacons.

Fig. 1: Routing trees by CTP and CTP-SCDP. The solid thick gray links are shared by the CTP and CTP-SCDP trees; the dashed thick red links are on the CTP tree only; the dashed thin blue links are on the CTP-SCDP tree only.

In CTP, the information about the quality of a link propagates to the whole network during the beaconing process. However, this propagation takes time. Thus, when link quality changes over time, the RETX of any node cannot capture the latest ETXs of the links on its route to the root. In particular, the closer a link on the route is to the root, the more out-of-date is node $n_i$'s knowledge about that link (which is encompassed in $\mathrm{RETX}_i$). As a result, CTP may not construct the minimum-cost tree in the presence of time-varying link quality. Differently, in CTP-SCDP, the latest ETXs are updated to the network controller and the optimal routing is sent to the nodes, both directly through the control-plane network.

We compare CTP and CTP-SCDP using the TinyOS simulator TOSSIM. We place 60 nodes randomly in a region as illustrated in Fig. 1. Link gains are generated according to the Euclidean distances between nodes using a tool in TOSSIM. The radios' hardware noise floor is set to a mild noise level. To simulate CTP-SCDP, we add a node as the network controller, which has sufficiently large link gains with all other nodes, such that the control-plane network is a one-hop star network. In CTP-SCDP, each node $n_i$ sends its latest link ETXs to the network controller. Upon receiving an ETX update, the controller updates a directed graph with the ETXs as the edge costs and recomputes the minimum-cost routing tree using Dijkstra's algorithm. Then, the controller sends the new parent information to the nodes.
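To make the controller's recomputation step concrete, here is a minimal Python sketch of it. The edge-list representation and the function name recompute_tree are ours, not taken from the implementation, and the sketch treats each reported link as symmetric for simplicity.

    import heapq

    def recompute_tree(etx, root):
        """Dijkstra over the ETX-weighted graph.

        etx  : dict mapping (u, v) -> latest reported ETX of the link u-v
        root : ID of the data-plane sink (tree root)
        Returns (retx, parent): minimum route cost and parent of every reachable node.
        """
        adj = {}
        for (u, v), cost in etx.items():
            adj.setdefault(u, []).append((v, cost))
            adj.setdefault(v, []).append((u, cost))   # treat the link as symmetric
        retx = {root: 0.0}
        parent = {root: None}
        heap = [(0.0, root)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > retx.get(u, float("inf")):
                continue                              # stale heap entry
            for v, cost in adj.get(u, []):
                nd = d + cost                         # candidate RETX of v through u
                if nd < retx.get(v, float("inf")):
                    retx[v] = nd
                    parent[v] = u
                    heapq.heappush(heap, (nd, v))
        return retx, parent

    # Example: on an ETX update from node 55, refresh the edge cost and push a
    # new-parent command only to nodes whose parent changed.
    # etx[(55, 17)] = 1.8
    # retx, parent = recompute_tree(etx, root=0)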

Fig. 2: CTP cannot build a minimum-cost tree. (a) Node 55; (b) all nodes.
Fig. 3: CTP-SCDP can build a minimum-cost tree. (a) Node 55; (b) all nodes.

We conduct two sets of simulations to show the benefits of SCDP. The first set shows the suboptimal performance of CTP. Specifically, we concurrently run CTP and CTP-SCDP, but the controller in CTP-SCDP does not send routing control commands to the nodes. Thus, the routing is managed by CTP only. We consider the following evaluation metrics:

  1. The RETX of node $n_i$ estimated by CTP (denoted by $\mathrm{RETX}_i$) and the sum of all RETXs (denoted by $\sum_i \mathrm{RETX}_i$);

  2. The ground-truth RETX of the route determined by CTP for node $n_i$ (denoted by $\mathrm{RETX}_i^{\mathrm{gt}}$), which can be measured as the sum of the latest ETXs of the links on the route as obtained by the controller in CTP-SCDP, as well as the sum of all ground-truth RETXs (i.e., $\sum_i \mathrm{RETX}_i^{\mathrm{gt}}$);

  3. The minimum RETX of node $n_i$ computed by CTP-SCDP (denoted by $\mathrm{RETX}_i^{*}$) and the sum $\sum_i \mathrm{RETX}_i^{*}$.

We simulate a time duration of two hours, during which each node generates a data packet every eight seconds. Fig. 1 shows the routing trees computed by CTP and CTP-SCDP at the end of the simulation. They are different. Fig. 2 shows the evaluation metrics for Node 55 and all the nodes over the two hours. We can see that, compared with the ground truth (i.e., the solid black curves), CTP’s knowledge about the chosen routes (i.e., the dashed red curves) cannot capture many transient changes in the ground truth, because of the information propagation latency in the distributed network control. Compared with the global optimal (i.e., the blue dots), the routes chosen by CTP have higher costs.

In the second set of simulations, we run CTP-SCDP only. Fig. 3 shows the results. From the figure, we can see that the routes chosen by CTP-SCDP generally achieve the minimum costs. These two sets of simulations show that centralized network control improves network performance under dynamic network conditions. Thus, the centralized control enabled by SCDP is desirable for performance-critical networks such as those deployed for industrial applications [4].

III-B Challenges Faced by In-Band Control Planes

Our simulations in §III-A demonstrate the underperformance of distributed network control. As discussed in §I, some mission-critical LMWNs have adopted centralized network control to improve network performance and manageability. However, they follow the in-band control plane scheme, due to the lack of multiple radios in the past decade or concerns about the increased infrastructure cost of deploying additional radios. The physical coupling of the control and data planes creates various challenges. For instance, given the fragile nature of wireless links, protecting the in-band control plane against data-plane link failures is a challenging problem. Recent research has investigated this issue, but existing solutions provide limited protection capability. For instance, the solution proposed in [6], though sophisticated, can handle a single link failure only. In-band control plane protection under general settings is still an open issue.

Given the complications of the in-band scheme and the resulting unsolved challenges, in this paper we study the alternative out-of-band scheme, which is increasingly feasible in terms of hardware support due to the prevalence of multiple radios on IoT platforms. In particular, LPWAN radios are becoming readily available and cheap (US$15 per unit [8]). Thus, we investigate basic systems research questions, including the feasibility and design of an LPWAN-based control plane for LMWNs, as well as its performance under various settings. In the following sections, the design and evaluation of LoRaCP provide a baseline for answering these questions.

IV LoRaWAN Performance Profiling

This section profiles LoRaWAN’s energy and latency, which are important to the design of LoRaWAN-based control planes.

IV-A LoRaWAN and Its Characteristics

IV-A1 Introduction to LoRaWAN

LoRaWAN (long-range wide-area network) is an open data link layer specification based on LoRa, a PHY-layer technique that uses Chirp Spread Spectrum modulation and operates in sub-GHz ISM bands (e.g., in the EU and US). LoRa admits configuring the ratio between the symbol rate and the chip rate by specifying an integer spreading factor (SF) within $[6, 12]$; specifically, each symbol is modulated by $2^{\mathrm{SF}}$ chips. A higher SF increases the signal-to-noise ratio and the communication range, but decreases the symbol rate. In this paper, we use six SF settings, i.e., SF7 to SF12. (SF6 is a special setting that is often not used.) The communications using different SFs are orthogonal and thus can be concurrent. Hence, in this paper, the terms SF and channel are used interchangeably.
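The SF trade-off can be illustrated with the standard LoRa relations; in the sketch below, the 125 kHz bandwidth and the 4/5 coding rate are assumed example settings rather than values stated above.

    # Chips per symbol, symbol rate, and nominal bit rate for each SF, using the
    # standard LoRa relations: a symbol carries SF bits, spans 2**SF chips, and
    # the chip rate equals the bandwidth.
    BW = 125e3        # assumed bandwidth in Hz
    CR = 4 / 5        # assumed coding rate
    for sf in range(7, 13):
        chips_per_symbol = 2 ** sf
        symbol_rate = BW / chips_per_symbol          # symbols per second
        bit_rate = sf * symbol_rate * CR             # approximate PHY bit rate
        print(f"SF{sf}: {chips_per_symbol} chips/symbol, "
              f"{symbol_rate:.1f} sym/s, {bit_rate:.0f} bps")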

A LoRaWAN network is formed by one or more gateways and many end devices. The gateway, often Internet-connected, can simultaneously handle communications with multiple nodes in different channels. LoRaWAN defines three classes (A, B, and C) of end devices. A Class-A device's uplink transmission is followed by two downlink receive windows (RX1 and RX2); downlink communications to the node at any other time have to wait until the node's next uplink. As Class A is the most power-efficient class and is supported by every end device, we design LoRaCP based on Class A. Classes B and C have not been widely supported, and designing LoRaCP based on them would reduce LoRaCP's universality.

IV-A2 Characteristics of LoRaWAN

The low-power long-range communication capability is the main advantage of LoRaWAN that makes it promising for control planes of LMWNs. However, we need to keep in mind the following two limiting characteristics of LoRaWAN in the design of LoRaCP.

Downlink-uplink asymmetry: LoRaWAN is mainly designed and optimized for uplinks from end devices to the gateway. For instance, the LoRaWAN concentrator can receive frames from multiple channels simultaneously, whereas it can send only a single downlink frame at a time. Moreover, the Class-A specification requires that any downlink transmission be unicast and in response to a precedent uplink transmission.

Lossy links: From existing tests [20], with SF12, the frame reception rate is about 80% at long distances. To build a reliable control-plane network, frame losses need to be dealt with properly. Acknowledging each uplink frame is wasteful given the scarce downlink time discussed earlier.

By default, LoRaWAN uses ALOHA, which may perform unsatisfactorily under surges of control-plane messages. Thus, we need to design a new MAC to make LoRaCP efficient. As LoRa does not prescribe carrier-sense capability, CSMA is not viable. Time-division multiple access (TDMA) is often adopted for the reliability that control planes desire. However, as shown in this paper, implementing TDMA on LoRaWAN is non-trivial. Moreover, a strict TDMA may result in undesirable delays in transmitting urgent messages.

In the design of LoRaCP (cf. §V), the downlink-uplink asymmetry and lossy links are managed by an NAK mechanism. Moreover, we design a TDMA-based multi-channel MAC with an urgent channel to replace ALOHA. Although we face the above limiting characteristics of LoRaWAN, this work presents software solutions that can be readily implemented on various LMWN platforms that integrate LoRa radios. Our software-space design is much more cost-effective and practical than clean-slate LPWAN hardware designs for LMWN control planes.

IV-B LoRaWAN Performance Profiling

IV-B1 LoRaCP hardware prototypes

We conduct performance profiling based on the following prototype hardware platforms. Each end device integrates a Cooking Hacks LoRaWAN shield [21] and a Raspberry Pi (RPi) 3 Model B single-board computer. The shield has a Microchip RN2483 LoRaWAN chip, an antenna, and interfacing circuits. The shield can be controlled by the RPi using a C++ library from Cooking Hacks. The gateway integrates an RPi and an IMST iC880A LoRaWAN concentrator board [22]. The iC880A board can receive frames over all LoRa channels simultaneously.

A ZigBee-based Kmote is plugged into a USB port of each end device's RPi, forming a LoRaCP node. The nodes use their ZigBee radios to form the data-plane network. From now on, the gateway is referred to as the LoRaCP controller. The controller does not necessarily have a ZigBee radio, since it may not be part of the data-plane network. We use the RPi to quickly prototype the integration of LoRaWAN and ZigBee. The results of this paper suggest that integrating LoRaWAN into the design of LMWN platforms, especially those requiring high network performance and manageability, is valuable; in such designs, the RPi would not be needed. Fig. 4 shows our prototypes.

IV-B2 Energy profiling

RN2483’s datasheet says its run-time current supply is at . We use a Monsoon meter to measure the current supply of the whole LoRaWAN shield after properly jumping the power wires. Monsoon reads to under different SFs. This shows that the shield’s encapsulating and interfacing circuits consume little power. The current supplies in the receiving and sleep modes are and , respectively.

Fig. 4: LoRaCP hardware prototypes: (a) LoRaCP node; (b) LoRaCP controller. (The Raspberry Pi is for fast prototyping only; it will not be needed if LoRa is built into the LMWN platform.)

A possible concern about LoRaWAN is its low data rate to power consumption ratio (DPR) compared with other low-power radios; for instance, with SF7 in the EU band, LoRaWAN's DPR is considerably lower than ZigBee's. However, the severity of this concern should be weighed against the targeted communication range. We illustrate this with an example of moving $b$ bits of data over a distance of $d$ meters by multiple hops. The radio energy used to move the $b$ bits over one hop is $E_{\mathrm{hop}} = (P_{tx} + P_{rx}) \cdot b / r$, where $P_{tx}$ and $P_{rx}$ are the transmitting and receiving powers, respectively, and $r$ is the link data rate in bps. Thus, the total energy used by the network's radios to move the $b$ bits over $d$ meters is $E = \lceil d / R \rceil \cdot E_{\mathrm{hop}}$, where $R$ is the typical one-hop transmission range. Considering the two radios' communication ranges, we set the hop count $\lceil d / R \rceil$ to be 1 and 10 for LoRaWAN and ZigBee, respectively, and use their respective data rates. After applying the respective power consumption measurements, LoRaWAN's total radio energy consumption is 2.94 times ZigBee's. Although this simplistic estimation does not consider other factors such as the nodes' processor energy consumption and MAC overhead, the result supports our understanding: while LoRaWAN consumes more energy than ZigBee, it substantially simplifies the control-plane network design due to its one-hop nature. Moreover, the concern about LoRaWAN's higher energy consumption is mitigated by the fact that the control plane's traffic volume is much lower than the data plane's. For instance, as measured in §VI, the number of CTP-SCDP's control-plane frames is only about 5% of its data-plane packets. Thus, we believe that, for control-plane networks, the energy saving from using high-DPR but short-range radios is not worth sacrificing network simplicity.
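The back-of-envelope comparison above can be reproduced with the following sketch. The power and data-rate figures in it are placeholders inserted only for illustration (they are not our measured values), so the printed ratio is indicative rather than the 2.94 reported above.

    import math

    def multi_hop_energy(bits, dist_m, hop_range_m, p_tx_w, p_rx_w, rate_bps):
        """Total radio energy (J) to move `bits` over `dist_m` meters:
        E = ceil(d / R) * (P_tx + P_rx) * b / r, as in the text."""
        hops = math.ceil(dist_m / hop_range_m)
        per_hop = (p_tx_w + p_rx_w) * bits / rate_bps
        return hops * per_hop

    bits, dist = 1e6, 1000                       # 1 Mbit over 1 km
    # Placeholder radio parameters (assumed, for illustration only):
    e_lora = multi_hop_energy(bits, dist, hop_range_m=1000,
                              p_tx_w=0.12, p_rx_w=0.04, rate_bps=5469)
    e_zigbee = multi_hop_energy(bits, dist, hop_range_m=100,
                                p_tx_w=0.06, p_rx_w=0.06, rate_bps=250000)
    print(f"LoRaWAN / ZigBee energy ratio: {e_lora / e_zigbee:.2f}")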

IV-B3 Latency profiling

Fig. 5: Radio awaking latency.
Fig. 6: A communication session.

Under TDMA, the LoRaWAN radio can sleep to save energy while waiting for the next time slot. The time delays in awaking the radio and in transmitting a frame are critical to the radio's sleep scheduling and to the clock synchronization required by TDMA, respectively. We measure the latency of awaking the radio from the RPi using the shield's C++ API. Fig. 5 shows the distribution of the awaking latency over 500 tests. The standard deviation is small, suggesting that a LoRaCP node can awake the radio punctually for its next TDMA time slot.

Then, we measure the latency of transmitting an uplink frame. Fig. 6 illustrates the uplink transmission's timing. The node starts and completes the transmission when its clock values are $t_1$ and $t_2$, respectively; the controller starts and completes the reception when its clock values are $t_3$ and $t_4$, respectively. We can record $t_1$, $t_2$, and $t_4$ in the LoRaWAN shield's and concentrator's C++ user programs running on their RPis. To measure the uplink latency, we synchronize the clocks of the node's and controller's RPis using the Network Time Protocol (NTP) over an Ethernet connection that gives sub-ms synchronization accuracy. We define the uplink latency as $T = t_4 - t_1$.¹ The latency is thus determined by the data rate, which in turn depends on the SF, and by the frame size. Fig. 7 shows box plots of the measured uplink latency under different SFs and frame sizes. As the latency has little variation under each setting, the boxes and whiskers of the plots are barely visible. The latency increases with both the frame size and the SF, which is consistent with our understanding. Interestingly, for a given SF, the latency exhibits step changes as the frame size increases, because each LoRa frame is aligned to a certain number of bits for easy hardware handling. These measurement results lay the foundation for the LoRaWAN clock synchronization developed in §V-C1.

¹We do not use $t_2$ because it contains a non-negligible uncertain delay from the actual completion of the transmission to the LoRaWAN shield's C++ library's periodic pull of the event from the shield's hardware interface.

Fig. 7: Uplink latency under different SFs and frame sizes.
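The step changes can be traced to the ceiling term in the commonly cited LoRa time-on-air formula from Semtech's transceiver documentation. The sketch below uses that published formula with assumed bandwidth, preamble, and coding-rate settings, rather than our measured latency data.

    import math

    def lora_time_on_air(payload_bytes, sf, bw=125e3, cr=1, preamble=8,
                         crc=True, explicit_header=True, low_dr_opt=False):
        """Approximate LoRa frame airtime (s) using Semtech's published formula."""
        t_sym = (2 ** sf) / bw
        t_preamble = (preamble + 4.25) * t_sym
        de = 1 if low_dr_opt else 0
        ih = 0 if explicit_header else 1
        crc_bits = 16 if crc else 0
        num = 8 * payload_bytes - 4 * sf + 28 + crc_bits - 20 * ih
        n_payload = 8 + max(math.ceil(num / (4 * (sf - 2 * de))) * (cr + 4), 0)
        return t_preamble + n_payload * t_sym

    # The ceil() above makes the airtime, and hence the uplink latency, jump in
    # steps as the payload grows, matching the staircase pattern in Fig. 7.
    for size in range(10, 41, 5):
        print(size, round(lora_time_on_air(size, sf=9), 3))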

V Design and Implementation of LoRaCP

V-A System Overview and LoRaCP-MAC

The goal of LoRaCP is to use LoRaWAN’s uplinks and downlinks to transmit network reports from the nodes to the controller and network commands from the controller to the nodes, respectively. The network commands have two categories: a reactive network command to a node is in response to a precedent network report from the node, whereas an active network command is initiated by the controller. All the control-plane transmissions are managed by LoRaCP’s MAC protocol as illustrated in Fig. 8, which we call LoRaCP-MAC. As discussed in §IV-A, LoRaWAN has six concurrent uplink channels. Five of them use TDMA, while the remaining one (called urgent channel) uses ALOHA to transmit urgent frames. The five concurrent TDMA channels increase the throughput for the network reports. The urgent channel mitigates the rigidness of TDMA and allows the control-plane application developers to deal with urgent situations such as sudden strong interference or even malicious jamming to the data-plane network. As the TDMA channels have different data rates, their time slot lengths can be different to achieve the same maximum frame size. The time slots of a TDMA channel are allocated in a round-robin fashion to the LoRaCP nodes that use the channel. The LoRaCP nodes can be assigned to the TDMA channels to balance their time delays in waiting for the next time slot, while considering the channels’ communication ranges and the nodes’ distances to the controller.

Fig. 8: Illustration of LoRaCP-MAC. The number in a slot represents the ID of the node assigned to the slot. Shaded slots mean heartbeat slots.

We now present two features of LoRaCP-MAC that address LoRaWAN’s downlink-uplink asymmetry and lossy links.

Heartbeat time slots: When a node has no uplink data to transmit, it can skip its next time slot. However, because any downlink frame must be in response to a precedent uplink frame, LoRaCP-MAC designates periodic heartbeat time slots for each node. For instance, in Fig. 8, the shaded blocks represent heartbeat slots; in Channel 1, the heartbeat period for each node is three time slots. A node must transmit an uplink frame in a heartbeat slot. This opens a downlink window that is used to maintain the node's clock synchronization (cf. §V-C1) and to send active network commands. The heartbeat period can be set according to the nodes' clock drift rates and the clock accuracy required to keep the TDMA slots aligned. The heartbeats also help the LoRaCP controller know whether a node is still alive.
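To make the slot layout concrete, the following sketch generates a round-robin schedule with designated heartbeat slots in the spirit of Fig. 8; the function name and its parameters are illustrative and not taken from the LoRaCP implementation.

    def build_schedule(channel_nodes, heartbeat_period, rounds):
        """Return, per channel, a list of (slot_index, node_id, is_heartbeat).

        channel_nodes    : dict mapping channel name -> ordered list of node IDs
        heartbeat_period : every heartbeat_period-th slot of a node is a heartbeat
        rounds           : number of round-robin rounds to generate
        """
        schedule = {}
        for ch, nodes in channel_nodes.items():
            slots = []
            for r in range(rounds):
                for i, node in enumerate(nodes):
                    slot_index = r * len(nodes) + i
                    # A node must transmit in its heartbeat slot even without data,
                    # which opens a downlink window and refreshes clock sync.
                    is_heartbeat = (r % heartbeat_period == 0)
                    slots.append((slot_index, node, is_heartbeat))
            schedule[ch] = slots
        return schedule

    # Example: three nodes on the SF7 channel, heartbeat every third own slot.
    # build_schedule({"SF7": [1, 2, 3]}, heartbeat_period=3, rounds=6)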

Negative acknowledgment (NAK): To deal with frame losses, acknowledging all concurrent uplink transmissions would be wasteful because of the downlink-uplink asymmetry. Thus, LoRaCP uses an NAK scheme. In LoRaWAN, the uplink and downlink frames from/to a node carry monotonically increasing frame counters. Thus, both the controller and the nodes can detect lost frames by checking the continuity of the frame counters. If the controller detects lost frames, it sends an NAK in the subsequent downlink transmission to notify the node, which can then use the urgent channel or wait for its next TDMA slot to resend the lost data. A node can also send an NAK via the urgent channel or its next TDMA slot to request lost downlink frames. With the NAK scheme, the controller need not respond to a node's network report if there are no network commands for the node and no lost frames. This design mitigates contention for the downlink time.
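The counter-continuity check behind this scheme can be sketched as follows; the frame counters are part of LoRaWAN, while the helper and its bookkeeping are our illustration.

    def missing_counters(last_seen, received_counter):
        """Return the list of frame counters that were skipped.

        last_seen        : highest frame counter received so far from this peer
        received_counter : counter of the frame that just arrived
        A non-empty result triggers an NAK in the next transmission to the peer.
        """
        if received_counter <= last_seen:
            return []                       # duplicate or reordered frame: ignore
        return list(range(last_seen + 1, received_counter))

    # e.g., last_seen = 41 and a frame with counter 44 arrives:
    # missing_counters(41, 44) -> [42, 43], so the NAK asks for frames 42 and 43.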

V-B Software Architectures of LoRaCP Node and Controller

In §IV-B, we have introduced the hardware prototypes of the LoRaCP node and controller. This section presents their software architectures as illustrated in Fig. 9.

V-B1 LoRaCP node

A C++ forwarder program LoRaCPFwd runs on the RPi to buffer and forward the data between Kmote and the LoRaWAN shield, while following LoRaCP-MAC. The node parts of the clock synchronization and TDMA are also implemented in LoRaCPFwd. The Kmote runs TinyOS. We design a TinyOS module LoRaCPC that provides the AMSend and Receive interfaces to send and receive data to/from the RPi through serial communications. Thus, in our prototype design, the Kmote uses LoRaWAN as a service.

Fig. 9: Software architectures of LoRaCP controller and node. (The illustration includes a ZigBee radio for the controller to be a control-plane sink.)

V-B2 LoRaCP controller

The controller's RPi runs an open-source LoRaWAN server architecture [23] consisting of packet_forwarder, LoRa Gateway Bridge, LoRa Server, and LoRa App Server. By providing JSON-based APIs for subscribing to and sending messages from/to the LoRaWAN network, this architecture greatly simplifies the design of centralized network control applications. Its role is similar to that of an SDN controller platform (e.g., OpenDaylight) that facilitates the design of SDN control applications. In this paper, the centralized network controls and the controller parts of the clock synchronization and TDMA are implemented in a single Python program called LoRaCPApp. Note that the LoRaWAN server architecture [23] supports multiple LoRaWAN gateways. Although this paper focuses on a single LoRaCP controller, the multi-gateway support can be exploited in future work to develop redundant LoRaCP controllers, improving the system's resilience against a single point of failure.
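As an illustration of how a centralized control application can attach to this server architecture, the sketch below subscribes to uplink messages over MQTT using the paho-mqtt client. The broker address and topic pattern are assumptions modeled on the open-source LoRa Server documentation of that era (not taken from LoRaCPApp) and may differ across server versions.

    import json
    import paho.mqtt.client as mqtt

    # Topic pattern is an assumption; adjust it to the deployed server version.
    UPLINK_TOPIC = "application/+/node/+/rx"

    def on_message(client, userdata, msg):
        report = json.loads(msg.payload)
        # Hand the decoded network report (e.g., an ETX update) to the routing
        # logic; a downlink command can be published on the matching ".../tx" topic.
        print("uplink on", msg.topic, "->", report.get("data"))

    client = mqtt.Client()                  # paho-mqtt 1.x style constructor
    client.on_message = on_message
    client.connect("localhost", 1883)       # assumed: broker on the controller's RPi
    client.subscribe(UPLINK_TOPIC)
    client.loop_forever()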

V-C Implementation of LoRaCP Components

This section provides implementation details of LoRaCP’s clock synchronization and TDMA.

V-C1 Clock synchronization

Clock synchronization is a basis for implementing TDMA. Although various clock synchronization protocols exist for LMWNs (e.g., FTSP), synchronizing the LoRaCP nodes to the controller via the data-plane network would make the control plane's TDMA depend on the data-plane network, incurring the undesirable coupling. Thus, we synchronize the LoRaCP nodes to the controller using the control-plane network itself. However, there is still limited research on clock synchronization over LoRaWAN. In our prototype system, the RPi's clock is used as the node's or controller's clock. Although the LoRaWAN devices and the Kmote have their own timers, using the RPi's clock simplifies evaluating the accuracy of the LoRaWAN clock synchronization via the RPi's Ethernet interface.

To save downlink time, LoRaCP does not prescribe dedicated frames for clock synchronization. Instead, LoRaCP piggybacks several bytes onto each control-plane frame. Specifically, each uplink frame is appended with the node's clock value $t_1$, as illustrated in Fig. 6. The controller records its clock value $t_4$ on completion of the frame reception. The clock offset between the node and the controller, denoted by $\delta$, can then be estimated as $\delta = t_4 - t_1 - T$, where $T$ is the uplink latency presented in Fig. 7. The controller piggybacks $\delta$ onto the downlink frame, as illustrated in Fig. 6. Upon receiving $\delta$, the node resets its clock to $t' - \delta$, where $t'$ denotes the node's current clock value. Alternatively, the node's clock advance speed can be calibrated according to $\delta$ using a negative feedback loop.
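A compact rendering of this piggyback scheme follows; the variable names and the latency lookup table are ours, with the table standing in for the Fig. 7 measurements.

    # Uplink-latency table T(SF, frame size), built offline from the Fig. 7
    # measurements; the values themselves are omitted here.
    uplink_latency = {}   # e.g., uplink_latency[(9, 32)] = ... seconds

    def estimate_offset(t1_node, t4_controller, sf, frame_size):
        """Controller side: clock offset delta = t4 - t1 - T(SF, frame size)."""
        T = uplink_latency[(sf, frame_size)]
        return t4_controller - t1_node - T

    def correct_clock(node_clock_now, offset):
        """Node side: apply the offset piggybacked in the downlink frame."""
        return node_clock_now - offset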

We now discuss several implementation issues of the above clock synchronization approach. First, the LoRaWAN frame header added by the shield has a variable size, because the integers in the header are represented as variable-length hexadecimal ASCII strings. As shown in Fig. 7, the uplink latency has a complex relationship with the frame size in different channels. When the LoRaCP controller receives an uplink frame, it checks the actual frame size and the SF used by the node to look up the corresponding $T$ from the data in Fig. 7. Thus, the prior knowledge in Fig. 7 is critical for LoRaWAN clock synchronization. Note that most LMWN clock synchronization approaches are free from this frame-size dependence because they use dedicated synchronization frames with fixed sizes, or because the frame size has little impact on their transmission latency. Second, we modify packet_forwarder, i.e., the LoRaWAN concentrator's driver program, to record $t_4$, because the other components of the LoRaWAN server architecture may introduce software delays. As illustrated in Fig. 9, the timestamp $t_4$, together with the corresponding source ID and frame ID, is written into a Redis in-memory database and then retrieved by LoRaCPApp to compute $\delta$.

We measure the synchronization accuracy of the above approach using the ntpdate tool to check the clock offset between the node and the controller over a local Ethernet network connecting their RPis. The measured mean absolute synchronization error is at the level of a few milliseconds, with a small standard deviation. Given the second-level frame transmission times, such synchronization errors are satisfactory.

V-C2 TDMA

The prototype LoRaCP node controls the sleep of the LoRaWAN radio and the transmissions of frames based on its RPi's synchronized clock. Specifically, if LoRaCPFwd has received a network report from the Kmote, the RPi starts awaking the LoRaWAN radio shortly before its next TDMA time slot, transmits the report in the time slot, receives any subsequent network command, and re-transmits frames using the urgent channel if an NAK is received. Finally, LoRaCPFwd forwards all received network commands to the Kmote. In our current experimental implementation, we assign LoRaWAN channels and time slots to the nodes manually. Non-essential operations, such as automatic channel and time slot assignment and support for adding and dropping nodes at run time, are left to future work.
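The timing part of this per-slot behavior can be sketched as follows; wake_radio is a placeholder for the shield's wake-up call, and the surrounding transmit/receive steps are summarized in comments rather than reproduced.

    import time

    AWAKE_MARGIN_S = 0.2   # assumed: start waking the radio this long before the slot

    def wait_for_slot(next_slot_time, wake_radio, now=time.time, sleep=time.sleep):
        """Sleep until shortly before the node's next TDMA slot, then wake the radio.

        wake_radio is a placeholder for the shield's wake-up call; the function
        returns the remaining time until the slot actually starts.
        """
        sleep(max(0.0, next_slot_time - AWAKE_MARGIN_S - now()))
        wake_radio()
        return max(0.0, next_slot_time - now())

    # Within the slot, LoRaCPFwd then transmits the buffered network report (or a
    # heartbeat), listens in the RX windows for a command or an NAK, resends any
    # lost frames on the urgent channel, and puts the radio back to sleep.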

VI Performance Evaluation

VI-A Experiment Methodology and Settings

We use LoRaCP to implement the CTP-SCDP discussed in §III-A. Specifically, if the Kmote of a LoRaCP node detects a change of ETX with any of its neighbor nodes, it uses LoRaCPC to send the latest ETX in a network report frame to the LoRaCP controller. Upon receiving the ETX update, the controller's LoRaCPApp Python program computes the optimal routing and pushes network commands containing the new parent-node information to the downlink queue of the LoRaWAN server architecture. Upon receiving a network command, a LoRaCP node updates its parent node accordingly. In the data plane, each node generates a data packet every eight seconds.

We conduct experiments on a testbed consisting of a LoRaCP controller and 15 LoRaCP nodes. The nodes are placed at the grid points of a lab space and are evenly divided among three LoRaWAN channels (SF7, SF8, and SF9). The time slot lengths of these three channels are 3, 4, and 5 seconds, respectively. The controller uses the first downlink window RX1 to transmit network commands. Before the RX1 window, the controller has a wait time of one second to compute the network commands, which is generally sufficient. On our 16-node testbed, each LoRaCP node has a time slot every 25 seconds or less. For larger networks, to maintain this rotation period for each node, multiple geographically distributed nodes in the same channel can be assigned the same time slot, since they are unlikely to report ETX changes at the same time. We leave the evaluation of this extended LoRaCP-MAC to future work after we expand our testbed.

VI-B Experiment Results

We conduct three sets of experiments: §VI-B1 pressure-tests LoRaCP; §VI-B2 evaluates the control plane performance of CTP-SCDP; §VI-B3 compares CTP and CTP-SCDP.

VI-B1 Control plane pressure tests

Fig. 10: Control plane pressure test results: (a) downlink delay; (b) downlink FDR.

While the concurrent uplink channels increase the throughput for network reports, LoRaWAN's downlink-uplink asymmetry makes the downlink a potential bottleneck. We therefore pressure-test the downlink performance. Specifically, each LoRaCP node transmits a network report in every one of its time slots, so the controller receives frames from the three channels concurrently almost all the time. The controller replies to each network report with a certain probability. The frame size of the replies ranges from 29 to 33 bytes. NAK is turned off in these tests.

Fig. 10 shows the average control-plane downlink delays and frame delivery ratios (FDRs) of the different channels versus the probability that the controller replies. The downlink delay is measured as the duration between i) the moment the controller's LoRaCPApp pushes a network command to the LoRaWAN server architecture and ii) the moment the node's LoRaCPFwd receives the command; it includes the one-second wait time. From Fig. 10(a), the average downlink delay does not significantly increase with the controller's reply probability. It increases with the SF, because a larger SF has a lower data rate. Fig. 10(b) shows the control-plane downlink FDR versus the controller's reply probability. The FDR decreases with the reply probability, because the open-source LoRaWAN server architecture [23] drops frames when the frames to be transmitted exceed the downlink throughput. From the results in Fig. 10, the downlink bottleneck mainly affects the downlink FDR. Thus, in the remaining experiments, we use the downlink FDR to assess whether the control plane's performance is throttled by the downlink-uplink asymmetry.

VI-B2 Control plane performance in CTP-SCDP

Fig. 11: CTP-SCDP control plane performance under Wi-Fi interference against the data-plane network: (a) per-node energy consumption for the control plane in one hour; (b) downlink FDR. The error bars represent min and max values.

We evaluate the performance of CTP-SCDP's control-plane network. To create data-plane link quality variations, we use a laptop placed close to the testbed to generate Wi-Fi traffic that interferes with the ZigBee data-plane network. The ZigBee radios use Channel 18 and the Wi-Fi AP uses Channel 6, which overlap in frequency and thus interfere with each other. On the laptop, we use iperf3 to generate data traffic at a specified bit rate. This methodology captures well the increasingly crowded ISM band used by ZigBee-/BLE-based data-plane networks. In the presence of the Wi-Fi interference, CTP-SCDP generates more control-plane messages to report the volatile link ETXs of the data-plane network to the LoRaCP controller.

First, we estimate the energy consumption of each LoRaCP node's LoRaWAN shield by multiplying the transmitting/receiving currents by the measured total times spent in the respective modes. Fig. 11(a) shows the per-node energy consumption of the shield in one hour under different settings of the heartbeat period and the Wi-Fi interference intensity. The control-plane energy consumption increases with the interference intensity due to the increased number of control-plane messages. When we do not generate Wi-Fi interference, the energy consumption decreases as the heartbeat period increases, because, in the absence of interference, the link ETXs seldom change and most control-plane messages are heartbeats. In the presence of interference, the energy consumption has no monotonic relationship with the heartbeat period, because the nodes also utilize non-heartbeat time slots to report the volatile ETXs. From Fig. 11(a), both without interference and under intensive interference, the per-node control-plane power consumption averaged over time is comparable to or lower than the power consumption of low-power microcontrollers (MCUs), e.g., the active power of TelosB's MCU or that of the recent Firestorm's MCU in its common configuration [7].

Second, we measure the average control-plane downlink FDR over all channels. The results are shown in Fig. 11(b). Even when the data-plane network experiences intensive interference, the FDR is generally above 90%. Thus, CTP-SCDP's control plane is not throttled by the downlink bottleneck.

VI-B3 Comparison between CTP and CTP-SCDP

Fig. 12: Performance comparison between CTP and CTP-SCDP: (a) data-plane PDR and control-plane downlink FDR; (b) control-plane uplink frames and per-node energy in one hour.

We load CTP onto eight nodes and CTP-SCDP onto another eight nodes, and run them side by side on the testbed so that they experience almost the same Wi-Fi interference for a fair comparison. CTP-SCDP's LoRaCP heartbeat period is 10. Fig. 12(a) shows the data plane's packet delivery ratio (PDR), i.e., the ratio of the ZigBee packets received by the data-plane sink to all packets generated by the source nodes. When the Wi-Fi interference intensity is low, CTP and CTP-SCDP achieve similarly high PDRs. At a medium interference intensity, CTP-SCDP's PDR is 10% higher than CTP's. At the highest interference intensity, CTP's PDR drops to 65%, while CTP-SCDP's remains at 80%. Note that the actual data rate of the Wi-Fi interference traffic fluctuates over time, and the fluctuation level increases with the setpoint. Thus, the data-plane networks experience more dynamic interference under larger setpoints, which explains CTP-SCDP's increasing PDR gain over CTP as the interference intensity setting grows. This result is consistent with our observation from the simulation study in §III-A that CTP cannot handle dynamic network conditions well.

Fig. 12(a) also shows the control-plane downlink FDRs, which are above 97%. This suggests that the control plane is not throttled by the downlink bottleneck. Fig. 12(b) shows the total number of CTP-SCDP's control-plane uplink frames during one hour and the projected per-node energy consumption of the LoRaWAN shield. In the presence of stronger interference, more uplink frames are transmitted to report the volatile ETXs. Under the medium and highest interference intensities, the total numbers of data-plane transmissions (including beacons and forwarded packets) are 5,022 and 10,024, respectively; the corresponding numbers of control-plane uplink frames are just 5.2% and 6.7% of these data-plane transmissions. Under strong interference, the per-node control-plane power consumption averaged over time remains low, consistent with the results in Fig. 11 obtained with 15 nodes.

VII Conclusion and Future Work

This paper studied using LoRaWAN radios to form one-hop out-of-band control planes for LMWNs through an extensive measurement study, system design, and testbed evaluation. We demonstrated applying the designed system, LoRaCP, to physically separate the control plane of CTP from its ZigBee-based data-plane network. Experiments show that LoRaCP increases CTP's packet delivery ratio from 65% to 80% in the presence of external interference, while additionally consuming a per-node average radio power of only 2.97mW. In future work, we will systematically evaluate the network performance and manageability gains of LoRaCP, as well as its impact on node lifetime, in real-world environments such as factories.

References

  • [1] “Mesh networks form backbone of smart cities,” http://bit.ly/2qAHOkV.
  • [2] “Creating the world’s largest network of connected vehicles for smart cities,” http://bit.ly/2q5Jgs3.
  • [3] “Bluetooth LE: mesh,” http://bit.ly/2thA6ve.
  • [4] “WirelessHART installed networks exceed 8,000 at major manufacturing sites worldwide,” http://bit.ly/2qAXox0.
  • [5] O. Gnawali, R. Fonseca, K. Jamieson, D. Moss, and P. Levis, “Collection tree protocol,” in SenSys, 2009.
  • [6] H. Huang, S. Guo, W. Liang, K. Li, B. Ye, and W. Zhuang, “Near-optimal routing protection for in-band software-defined heterogeneous networks,” J-SAC, vol. 34, no. 11, 2016.
  • [7] M. P. Andersen, G. Fierro, and D. E. Culler, “System design for a synergistic, low power mote/BLE embedded platform,” in IPSN, 2016.
  • [8] “Wireless SX1276 LoRa module,” http://modtronix.com/inair9b.html.
  • [9] K. Habak, K. A. Harras, and M. Youssef, “Bandwidth aggregation techniques in heterogeneous multi-homed devices: A survey,” Computer Networks, vol. 92, pp. 168–188, 2015.
  • [10] S. Kandula, K. Lin, T. Badirkhanli, and D. Katabi, “FatVAP: Aggregating AP backhaul capacity to maximize throughput.” in NSDI, 2008.
  • [11] G. Ekbatanifard, P. Sommer, B. Kusy, V. Iyer, and K. Langendoen, “Fastforward: High-throughput dual-radio streaming,” in MASS, 2013.
  • [12] S. Nirjon, A. Nicoara, C.-H. Hsu, J. P. Singh, and J. A. Stankovic, “MultiNets: A system for real-time switching between multiple network interfaces on mobile devices,” ACM TECS, vol. 13, no. 4s, p. 121, 2014.
  • [13] D. Mu, Y. Ge, M. Sha, S. Paul, N. Ravichandra, and S. Chowdhury, “Adaptive radio and transmission power selection for Internet of Things,” in IWQoS, 2017.
  • [14] Y.-s. Lim, Y.-C. Chen, E. M. Nahum, D. Towsley, R. J. Gibbens, and E. Cecchet, “Design, implementation, and evaluation of energy-aware multi-path TCP,” in CoNEXT, 2015.
  • [15] A. Nikravesh, Y. Guo, F. Qian, Z. M. Mao, and S. Sen, “An in-depth understanding of multipath TCP on mobile devices: Measurement and system design,” in MobiCom, 2016.
  • [16] N. A. Jagadeesan and B. Krishnamachari, “Software-defined networking paradigms in wireless networks: a survey,” ACM Comput. Surv., vol. 47, no. 2, p. 27, 2015.
  • [17] P. Dely, A. Kassler, and N. Bayer, “OpenFlow for wireless mesh networks,” in ICCCN, 2011.
  • [18] M. Kaplan, C. Zheng, M. Monaco, E. Keller, and D. Sicker, “WASP: a software-defined communication layer for hybrid wireless networks,” in ANCS, 2014.
  • [19] “TinyOS Production,” https://github.com/tinyprod.
  • [20] P. Marcelis, V. S. Rao, and R. V. Prasad, “DaRe: Data recovery through application layer coding for LoRaWANs,” in IoTDI, 2017.
  • [21] “LoRaWAN shield for Raspberry Pi,” http://bit.ly/2rwyL67.
  • [22] “iC880A - LoRaWAN concentrator 868 MHz,” http://bit.ly/2rwmpuQ.
  • [23] “LoRa server system architecture,” http://bit.ly/2vcIaSw.