Challenges, Designs, and Performances of a Distributed Algorithm for Minimum-Latency of Data-Aggregation in Multi-Channel WSNs

In wireless sensor networks (WSNs), the data sensed by the sensors need to be gathered, which makes periodic data collection one of the most important applications. Much effort has gone into data collection scheduling algorithms that minimize latency. Most previous works on minimum-latency data collection make the ideal assumption that the network is a centralized system, in which the entire network is completely synchronized with full knowledge of its components. In addition, most existing works assume that either any number of data or no data at all may be aggregated into one packet, and the network is often modeled as a tree structure. In practice, however, WSNs are more likely to be distributed systems, since the knowledge held by each sensor is disjoint from that of the others, and only a fixed number of data are allowed to be aggregated into one packet. This motivates us to investigate the problem of minimizing the latency of collision-free data aggregation in distributed WSNs when sensors are assigned channels and data are compressed with a flexible aggregation ratio, termed the minimum-latency collision-avoidance multiple-data-aggregation scheduling with multi-channel (MLCAMDAS-MC) problem. A new distributed algorithm, termed the distributed collision-avoidance scheduling (DCAS) algorithm, is proposed to address the MLCAMDAS-MC problem. Finally, we provide theoretical analyses of DCAS and conduct extensive simulations to demonstrate its performance.


I Introduction

In wireless sensor networks (WSNs), one very important application is periodic data collection. Generally, data collection can be viewed as two stages: data generation, in which data are periodically generated by the sensors, and data aggregation, in which the generated data are compressed by aggregation functions (e.g., MAX, MIN, and SUM) and then reported to a specific node, called the sink. Because advances in the sensing capacity of sensors have largely solved data generation, the performance of a WSN is decided mainly by its data aggregation capacity, which reflects how fast data can be collected by the sink.

To enable efficient data collection in WSNs, much effort in recent years has been devoted to collision-free routing algorithms for the case where a fixed number of data are allowed to be aggregated into one packet [1]. It is worth mentioning the minimum-latency collision-avoidance multiple-data-aggregation scheduling (MLCAMDAS) problem [2], which was investigated to minimize the latency of data collection with a flexible aggregation ratio in WSNs, where the aggregation ratio is the maximum number of data allowed to be aggregated into one packet. In reality, however, to guarantee collision-free transmission, sensors not only need to be assigned to time slots reasonably but also need to be assigned communication channels. Channel assignment is essential to guaranteeing collision freedom, yet none of the above works fully considers communication channel scheduling for data collection in WSNs. In addition, most existing works study data collection in WSNs under the ideal assumption that the entire network is completely synchronized with full knowledge of its components. Such a network is commonly known as a centralized wireless sensor network or, more generally, a centralized system. Many centralized algorithms based on this assumption have been designed and shown to perform well for data collection in centralized wireless sensor networks. In practical applications, however, WSNs are more likely to be distributed systems, in which the knowledge held by each sensor is disjoint from that of the others, and even the sink does not have full information about the network. This motivates us to investigate the problem of minimizing the latency of collision-free data aggregation in distributed WSNs when sensors are assigned channels and data are compressed with a flexible aggregation ratio, termed the minimum-latency collision-avoidance multiple-data-aggregation scheduling with multi-channel (MLCAMDAS-MC) problem.

While studying the MLCAMDAS-MC problem, we identified several new challenges compared with those in previous works. We summarize the main challenges as follows.

  • C1: To guarantee collision freedom, every sensor node must not only be assigned to time slots but also be set to an appropriate channel. In a centralized system, the scheduler can collect overall information about the network and evaluate time slot and channel assignments based on the information of all sensor nodes, such as the time clock, the packet size, and the links between the sensors, to schedule collision-free data transmissions. In a distributed WSN, by contrast, collision freedom has to be guaranteed based only on the local information of each sensor node. This is clearly much harder and requires a more complex technique.

  • C2: Achieving the minimum latency for data collection requires an efficient distributed algorithm. In a centralized wireless sensor network, an optimized algorithm for data transmissions can be provided because the entire network is completely synchronized with full knowledge of all components. Those existing optimized algorithms are no longer suitable for a distributed WSN. Thus, how to design an optimized distributed algorithm for data collection in distributed WSNs is a challenge.

  • C3: The third challenge is how to theoretically analyze the framework in order to design a practical distributed algorithm. Since there is no way to obtain the exact parameters of multiple sensors at the same time, it is difficult to determine suitable routes for the data transmissions of the sensor nodes, and guaranteeing both minimum data collection latency and collision freedom becomes harder. Hence, the performance of the desired algorithm depends not only on a comprehensive evaluation but also on specific techniques and mechanisms for the data transmissions.

To address these challenges, we propose a new distributed algorithm, termed the distributed collision-avoidance scheduling (DCAS) algorithm, which lets every sensor iteratively schedule local data transmissions for each time slot using its 3-hop neighboring information, until the sensor has no more data to schedule. For comprehensiveness, we evaluate the proposed method through both simulations and analyses. The main contributions of this paper are summarized as follows.

  • We study the problem of finding a schedule of forwarding data to the sink without data collisions such that the number of required time slots is minimized, termed the MLCAMDAS-MC problem. In addition, the difficulty of the MLCAMDAS-MC problem is provided.

  • We introduce an extended relative collision graph to represent the collision relation between data transmissions for the MLCAMDAS-MC problem. Based on this graph, we propose a new distributed algorithm, termed the distributed collision-avoidance scheduling (DCAS) algorithm.

  • Theoretical analyses of the DCAS show its correctness: the data transmissions scheduled by the DCAS are collision-free in each time slot, and no sensors are in a circular wait when making schedules of data transmissions. We also conduct simulations to demonstrate the performance of the DCAS. The results show that the DCAS provides better performance than the existing solution used for the MLCAMDAS-MC problem under an adjustable data aggregation ratio.

Organization: The remaining sections of this paper are organized as follows. Section II summarizes the related works to give readers a whole picture of data collection and distributed WSNs. In Section III-A, the network model is introduced. The MLCAMDAS-MC problem and its difficulty are illustrated in Section III-B. We introduce an extended relative collision graph in Section IV. Based on this graph, the DCAS algorithm is presented in Section V. The theoretical analyses are provided in Section VI. In Section VII, we evaluate the performance of the DCAS. Finally, this paper is concluded in Section VIII.

II Related Work

Many research works have studied how to improve the efficiency of data collection for both centralized WSNs [3, 2] and distributed WSNs [4, 5]. In [6], the authors proposed a chain-based protocol, named PEGASIS, to reduce the energy consumption of sensors during the data collection process. The idea of PEGASIS is to collect data along a connected chain of sensors leading to the sink. The authors in [7] improved PEGASIS by grouping the sensors into clusters, so that data are forwarded from the sensors to the sink through the cluster heads. However, these studies do not consider eliminating the collisions of data transmissions. In [5], a distributed data collection algorithm is proposed to increase the data collection capacity, in which the network is organized as a connected dominating set and the data are collected through the dominators. Although the authors in [5] present their algorithm as distributed, the construction of the connected dominating set treats the network as a centralized system, so the algorithm cannot be applied to a completely distributed network.

In WSNs, a sensor's data transmission may collide with those of other sensors, resulting in data loss or failure. In recent years, there has been much effort to improve the latency of data collection as well as eliminate data collisions. Time-division multiple access (TDMA) is one of the most common channel access techniques used in medium access control (MAC) protocols; it lets multiple sensors share a channel without data collisions by transmitting in assigned time slots. In [8], the authors proposed a novel tree-based TDMA scheduling scheme, named traffic pattern oblivious (TPO), to achieve collision-free data collection. In this work, all sensor nodes in the network are assigned to time slots step by step to avoid data collisions. Since the network is structured as a tree, the time slot assignment proceeds from the leaf nodes onward. The authors in [9] proposed the node-based scheduling algorithm (NBSA) and the level-based scheduling algorithm (LBSA) to minimize the latency of data collection. The NBSA and the LBSA use a color graph to represent the data collisions between sensors' data transmissions. The above studies achieve collision-free data collection and improve data collection capacity. However, their data routes are limited to tree-based methods when minimizing the latency of data collection. In addition, the raw data are not aggregated before being transmitted through the network.

Since reducing the sensors' energy consumption is a major challenge for data collection in WSNs, data aggregation has emerged as an effective way to reduce the amount of data transmitted during data collection. Data aggregation uses functions such as MIN, MAX, SUM, and COUNT to aggregate multiple packets into one packet. In [4, 10], the authors investigated the construction of data aggregation with minimum energy cost in WSNs, in which the data aggregation technique is applied to reduce the size of data packets. The authors in [11, 12] proposed methods based on the connected-dominating-set (CDS) tree to minimize the latency and achieve collision freedom in WSNs. The main idea of these studies is to organize the network as a CDS tree, in which each sensor is treated as either a dominator or a dominatee. The dominators are responsible for aggregating all data from the dominatees, and data collisions are eliminated by using TDMA. These works study data collection under the ideal assumption that the network is structured as a tree whose root is the topology center. The communication capacity of sensors in WSNs has recently been enhanced by multi-channel technology, in which sensors can use the IEEE 802.15.4 protocol with 16 non-interfering channels [13, 14, 15]. In this way, sensors can use different channels to transmit data without collisions in the same time slot. Hence, the number of time slots used in data collection can be reduced, which improves the latency. In [3], the authors introduced the idea of using the maximum number of channels to schedule the data transmissions of sensors in one time slot; however, since the number of channels is limited, data collisions may still occur. Therefore, a careful schedule that combines time slot assignment and channel assignment for sensors is required to achieve the minimum latency of data collection in WSNs.

III Preliminaries

In this section, we first describe the network model for a WSN in Section III-A. Based on the network model, the Minimum-Latency Collision-Avoidance Multiple-Data-Aggregation Scheduling with Multi-Channel (MLCAMDAS-MC) problem and its difficulty are proposed and discussed in Section III-B.

III-A Network Model

The WSN is composed of sensors, where a sensor can communicate with another sensor if and only if they are within each other's transmission range. Hereafter, two sensors are said to be neighboring sensors if and only if they can communicate with each other. In this paper, the unit disk graph model is employed as the communication model [16], in which all sensors are assumed to have the same transmission range. Because sensors are responsible for periodically sensing environmental information, sensing data are periodically generated by the sensors and reported to a sink, where the sink is a special node in the network that is responsible for data collection, processing, and analysis. The wireless sensor network can then be represented as a connected weighted graph [2], in which the node set is the set of sensors in the network, an edge between two sensors represents that they can communicate with each other, and the weighting function of a sensor gives the number of units of raw data that the sensor generates per period of time to report to the sink. Fig. 1 shows a WSN represented by a connected weighted graph, which includes one sink and a set of sensors; the number close to each node in Fig. 1 is the number of units of raw data that the node generates per period of time.

Fig. 1: Example of a connected weighted graph , where node is a sink, and the number close to each node represents the number of units of raw data generated by the corresponding node.

In the WSN, because the transmission range is limited, it is often hard for sensors to communicate with the sink directly, and the data generated by sensors often have to be forwarded to the sink via multiple sensors. When raw data are allowed to be aggregated into packets, fewer packets may need to be forwarded, and thus reporting data to the sink becomes more efficient. Here, following the data aggregation model in [2], we assume that at most a fixed number of units of raw data, called the aggregation ratio in this paper, are allowed to be aggregated into one unit-size packet. Given the number of units of raw data that a sensor is required to forward, the number of unit-size packets that the sensor is required to forward is defined as follows:

(1)

Take Fig. 1, for example. It is clear that sensors and generate and units of raw data, respectively, in the beginning of a time period, and therefore, and initially. When the aggregation ratio is assumed to be , we have that and . If forwards a unit-size packet that aggregates three units of raw data to , will have units of raw data required to be forwarded, and we have that and . In addition, we also have that and .
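The body of Eq. (1) is not legible in this copy. A minimal sketch of the packet count, assuming Eq. (1) is the ceiling of the number of raw-data units divided by the aggregation ratio (the natural reading of "at most a fixed number of units per unit-size packet"), is given below. The function name `packets_required` and the sample values are hypothetical and are not taken from Fig. 1.

```python
def packets_required(raw_units: int, aggregation_ratio: int) -> int:
    """Unit-size packets needed to carry `raw_units` units of raw data.

    Assumption: Eq. (1) is a ceiling division, i.e. each packet holds at
    most `aggregation_ratio` units of raw data.
    """
    if raw_units < 0 or aggregation_ratio < 1:
        raise ValueError("raw_units must be >= 0 and aggregation_ratio >= 1")
    return -(-raw_units // aggregation_ratio)  # integer ceiling


# Hypothetical values, not the ones used in Fig. 1.
for units in (0, 3, 5, 7):
    print(units, "units ->", packets_required(units, aggregation_ratio=3), "packets")
```

Integer arithmetic is used instead of `math.ceil` on a float so that the result stays exact for large counts.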

Following time-division multiple access (TDMA), we assume that sensors are synchronized and that time is divided into time slots such that a unit-size packet can be transmitted successfully from one sensor to a neighboring sensor within one time slot if no data collision occurs [17, 9, 18]. As in these studies, each sensor is assumed to have a multi-channel half-duplex transceiver, so that each sensor can switch channels and use one channel to transmit data to another sensor that uses the same channel at a given time slot. In addition, a sensor cannot transmit and receive data simultaneously, and cannot receive data from multiple sensors in the same time slot. When multiple sensors use the same channel to transmit data at the same time slot, data collisions may occur, and the data then have to be resent using one or more additional time slots. Here, a data collision occurs at a sensor if, at the same time slot, the sensor attempts to transmit two or more packets to multiple sensors, to receive two or more packets from multiple sensors, or to transmit and receive packets, or if it receives one packet and hears another [19, 12]; this is formally defined in Definition 1.

Definition 1

A data collision is said to occur at a node using a given channel at a given time slot if one of the following conditions is satisfied at that time slot: (C1) the node transmits two or more packets to sensors; (C2) the node receives two or more packets from sensors; (C3) the node transmits and receives packets; or (C4) the node receives a packet from one sensor and hears another packet from a different sensor, where both sending sensors use the same channel.

Take Fig. 2, for example. Figs. 2(a) and 2(b) show two examples of data collisions. In Fig. 2(a), when a node receives data from one node and sends data to another node at the same time slot, a data collision occurs at that node by C3 in Definition 1. In Fig. 2(b), two senders use the same channel to send data to two different receivers at the same time slot. Because one receiver is within the transmission range of the other sender, that receiver hears the other sender's transmission, which incurs a data collision by C4 in Definition 1. Note that if the two senders use different channels, each receiver uses the same channel as its own sender, which differs from the channel of the other sender; thus, the receiver does not hear the other transmission and no collision occurs.

Fig. 2: Examples of data collisions that occur at node . (a) shows the data collision when transmits and receives messages at the same time. (b) shows the data collision when receives one message from node and hears another message from node at the same time.
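The four conditions of Definition 1 can be checked pairwise for transmissions scheduled in the same time slot. The sketch below is one possible reading, assuming that a transmission is a `(sender, receiver, channel)` tuple and that C1 to C3 apply regardless of channel because each sensor has a single half-duplex transceiver. The function name and the path topology are hypothetical and do not reproduce Fig. 2.

```python
def transmissions_collide(t1, t2, neighbors):
    """Return True if two transmissions in the same time slot collide.

    Each transmission is a (sender, receiver, channel) tuple; `neighbors`
    maps each node to the set of nodes within its transmission range.
    """
    (s1, r1, ch1), (s2, r2, ch2) = t1, t2
    if s1 == s2:                 # C1: one node transmits two packets
        return True
    if r1 == r2:                 # C2: one node receives two packets
        return True
    if s1 == r2 or r1 == s2:     # C3: one node transmits and receives
        return True
    # C4: a receiver hears the other sender on the channel it is tuned to
    if ch1 == ch2 and (s2 in neighbors[r1] or s1 in neighbors[r2]):
        return True
    return False


# Hypothetical path topology a - b - c - d.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
same_channel = transmissions_collide(("a", "b", 1), ("c", "d", 1), neighbors)
diff_channel = transmissions_collide(("a", "b", 1), ("c", "d", 2), neighbors)
print(same_channel, diff_channel)
```

In this topology, receiver `b` is within range of sender `c`, so the two transmissions collide by C4 on a shared channel and do not collide on different channels.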

III-B Minimum-Latency Collision-Avoidance Multiple-Data-Aggregation Scheduling with Multi-Channel Problem and Its Difficulty

In this paper, while given a WSN , in which sensors have channels to switch, and an aggregation ratio , our problem is to find a schedule of forwarding data to the sink without data collisions such that the number of required time slots is minimized, termed the Minimum-Latency Collision-Avoidance Multiple-Data-Aggregation Scheduling with Multi-Channel (MLCAMDAS-MC) problem. Here, the schedule of forwarding data is a sequence of collision-avoidance schedules , , , , where () is a set of 3-tuple elements (, , ) with ,, , and , and represents a schedule of unit-size packets at time slot without data collisions. In addition, each element (, , ) denotes that one unit-size packet that aggregates units of raw data is scheduled to be forwarded from to by using the -th channel. The MLCAMDAS-MC problem is formally illustrated as follows:

INSTANCE: Given a graph , an aggregation ratio , the total number of channels , and a constant .

QUESTION: Does there exist a schedule of forwarding data, that is, a sequence of collision-avoidance schedules , , , , for forwarding all generated data to the sink, such that the number of required time slots is not greater than ?

Take Fig. 1, for example. We assume that the aggregation ratio = and the total number of channels . Also assume that the sequence of collision-avoidance schedules are , , , , , , , and , where , , , , , , , , , , , , and . Note that all generated data can be sent to the sink within time slots without any data collisions.

The difficulty of the MLCAMDAS-MC problem is provided in Theorem 1.

Theorem 1

The MLCAMDAS-MC problem is NP-complete.

It is clear that the MLCAMDAS-MC problem belongs to the NP class. It suffices to show that the MLCAMDAS-MC problem is NP-hard. Here, the Minimum-Latency Collision-Avoidance Multiple-Data-Aggregation Scheduling (MLCAMDAS) problem [2] is used to show the difficulty of the MLCAMDAS-MC problem. The MLCAMDAS problem is formally illustrated as follows:

INSTANCE: Given a graph , an aggregation ratio , and a constant .

QUESTION: Does there exist a sequence of collision-avoidance schedules , , , for forwarding all generated data to the sink such that the number of required time slots is not greater than ?

It is clear that when the total number of channels is one, the MLCAMDAS-MC problem is equivalent to the MLCAMDAS problem, which implies that the MLCAMDAS problem is a special case of the MLCAMDAS-MC problem. Because the MLCAMDAS problem is NP-hard [2], the MLCAMDAS-MC problem is also NP-hard. This completes the proof.

IV Extended Relative Collision Graph

Because the MLCAMDAS-MC problem is to find a schedule of forwarding data to the sink without data collisions, determining collision relation between any data transmission is important to the MLCAMDAS-MC problem. To discover the collision relation, the concept of the relative collision graph is borrowed from the research in [2] and is extended here to represent the collision relation between any data transmission for the MLCAMDAS-MC problem. When each sensor in the WSN has exactly one channel, to minimize the number of required time slots for collecting the generated data, all possible data transmissions that forward data to nodes closer to the sink are considered to construct the relative collision graph [2]. For this purpose, a directed graph , termed the data-forwarding graph, is used to represent all such possible data transmissions in the WSN , where , an edge is included in if and , and (or ) denotes the minimum hop count from node (or ) to the sink in . Take Fig. 3, for example. Fig. 3 illustrates the data-forwarding graph of the WSN shown in Fig. 1. Because and edges , in Fig. 1, edges and are included in . In addition, although edge in Fig. 1, edge or is not included in because (or ) is not greater than (or ).

Fig. 3: Examples of the data-forwarding graph and the relative collision graph of the WSN shown in Fig. 1, where the aggregation ratio is assumed to be . (a) ’s corresponding data-forwarding graph . (b) ’s corresponding relative collision graph , where the left number and the right number in parentheses close to each node represents and , respectively.
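The data-forwarding graph keeps only the links that move data strictly closer to the sink. A minimal sketch of that construction follows, assuming the WSN is given as an adjacency dictionary and is connected; hop counts are obtained by breadth-first search from the sink. The function names and the small topology are hypothetical and do not reproduce Fig. 1.

```python
from collections import deque


def hop_counts(adj, sink):
    """Minimum hop count from every node to the sink (breadth-first search)."""
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def data_forwarding_edges(adj, sink):
    """Directed edges (u, v) such that u and v are linked and v is closer to the sink."""
    hops = hop_counts(adj, sink)
    return {(u, v) for u in adj for v in adj[u] if hops[u] > hops[v]}


# Hypothetical topology: a and b are one hop from sink s, c is two hops away.
adj = {"s": {"a", "b"}, "a": {"s", "b", "c"}, "b": {"s", "a"}, "c": {"a"}}
print(sorted(data_forwarding_edges(adj, "s")))
```

The link between `a` and `b` is dropped in both directions because neither endpoint is closer to the sink than the other.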

By the WSN and its corresponding data-forwarding graph , the corresponding relative collision graph can be constructed to illustrate the collision relation between any data transmission in [2]. In , each node represents a possible data transmission from node to node in , that is, an edge . Each edge represents that a data collision occurs if data are sent from node to node and from node to node at the same time slot and in the same channel. In addition, for each , calculated by is used to represent the weight of node ; and calculated by is used to represent how many extra units of raw data can be aggregated with the data at such that the total number of aggregated unit-size packets in is not increased. When and are given, the corresponding is constructed as follows: is the set of nodes for each ; is the set of edges if edges and a data collision occurs under the condition that data are transmitted from node to node and from node to node at the same time slot and in the same channel; and and for each . Take Fig. 3, for example. When given the WSN , as shown in Fig. 1, and the corresponding data-forwarding graph , as shown in Fig. 3, when the aggregation ratio , the corresponding relative collision graph is shown in Fig. 3. Note that nodes and are included in because edges , . In addition, edge because a data collision occurs by the C1 in Definition 1 when data are transmitted from node to node and from node to node in the same time slot. Because and , we have that , , and .

Because the relative collision graph only considers exactly one channel used by every sensor in the WSN, the extended relative collision graph is therefore presented to represent collision relation with multiple channels for the MLCAMDAS-MC problem. When sensors have channels to switch, each sensor in the WSN can use the -th ( ) channel for data transmission. When all sensors in the WSN use the same -th ( ) channel, the collision relation for all possible data transmission is the same as that in the relative collision graph. In addition, because sensors in the WSN can select one of channels for data transmission, the collision relation between any data transmission using different channels has to be considered in the extended relative collision graph. By the observations, the extended relative collision graph is constructed by including relative collision graphs each representing the collision relation for all possible data transmission with channel ( ). In addition, some edges are inserted between nodes in different relative collision graphs to represent the collision relation for the data transmission using different channels. When and are given, the corresponding extended relative collision graph is constructed as follows: is the set of nodes for each and ; is the set of edges if edges and a data collision occurs under the condition that data are transmitted from node to node by using channel and from node to node by using channel at the same time slot; and and for each . Note that and in have the same definition as that in .

Fig. 4: Example of the extended relative collision graph of the WSN shown in Fig. 1, where the aggregation ratio and the number of channels are assumed to be and , respectively.

Take Fig. 4, for example. When the WSN shown in Fig. 1, the corresponding data-forwarding graph shown in Fig. 3, the aggregation ratio , and the number of channels are given, the corresponding extended relative collision graph is shown in Fig. 4. Because the number of channels , it is clear that the in Fig. 4 consists of two relative collision graphs that each are the same with the as shown in Fig. 3. In addition, edge because a data collision occurs at node by the C1 in Definition 1 when transmits packets to node and at the same time slot. Edge because a data collision occurs at node by the C2 in Definition 1 when receives packets from and at the same time slot. Edge because data collisions occur at nodes and by the C1 and C2 in Definition 1. Edge because a data collision occurs at node by the C3 in Definition 1 when receives a packet from and transmit another one to at the same time slot. Moreover, note that although by the C4 in Definition 1, that is, a data collision occurs at node when receives a packet from node and hears another one from node by using the first channel at the same time slot, because and use different channels to receive and transmit data, respectively.
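The construction of the extended relative collision graph can be sketched as follows, assuming each node is a `(sender, receiver, channel)` tuple built from a data-forwarding edge and a channel index, and two nodes are adjacent when the corresponding transmissions would collide in the same time slot under Definition 1. The helper names and the chain topology are hypothetical, and node weights are omitted.

```python
from itertools import combinations


def in_conflict(t1, t2, adj):
    """Pairwise collision test following C1 to C4 of Definition 1."""
    (s1, r1, k1), (s2, r2, k2) = t1, t2
    if s1 == s2 or r1 == r2 or s1 == r2 or r1 == s2:   # C1, C2, C3
        return True
    return k1 == k2 and (s2 in adj[r1] or s1 in adj[r2])  # C4


def build_extended_collision_graph(forward_edges, adj, num_channels):
    """Return (nodes, edges); each edge is a frozenset of two conflicting nodes."""
    nodes = [(u, v, k)
             for (u, v) in sorted(forward_edges)
             for k in range(1, num_channels + 1)]
    edges = {frozenset((a, b))
             for a, b in combinations(nodes, 2)
             if in_conflict(a, b, adj)}
    return nodes, edges


# Hypothetical chain s - a - b - c with sink s and two channels.
adj = {"s": {"a"}, "a": {"s", "b"}, "b": {"a", "c"}, "c": {"b"}}
forward_edges = {("a", "s"), ("b", "a"), ("c", "b")}
nodes, edges = build_extended_collision_graph(forward_edges, adj, num_channels=2)
print(len(nodes), "nodes,", len(edges), "conflict edges")
```

In this chain, the transmissions from `a` to `s` and from `c` to `b` conflict only when they share a channel, which is the kind of cross-channel relief the extended graph is meant to expose.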

V Distributed Collision-Avoidance Scheduling (DCAS) Algorithm

By the extended relative collision graph , edge ( , ) represents that a data collision will not occur if data are transmitted from node to node with the -th channel and from node to node with the -th channel at the same time slot. This implies that if is an independent set in , that is, and edge for any , there is no data collision when data are transmitted from node to node with the -th channel for all at the same time slot. Therefore, to avoid data collisions, the idea is to select a suitable independent set from the extended relative collision graph for each time slot. Take Fig. 4, for example. Let , , . Clearly, is an independent set in shown in Fig. 4. It is also clear that no data collision will occur in Fig. 1 when data are transmitted from node to node with the first channel, from node to node with the first channel, and from node to node with the second channel at the same time slot.
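Checking that a candidate set of transmissions is an independent set amounts to verifying that no two selected nodes are joined by a conflict edge. A small sketch, assuming conflict edges are stored as `frozenset` pairs of `(sender, receiver, channel)` nodes; the example edges are hypothetical and are not those of Fig. 4.

```python
from itertools import combinations


def is_independent_set(selected, conflict_edges):
    """True if no two selected transmissions are joined by a conflict edge."""
    return all(frozenset((a, b)) not in conflict_edges
               for a, b in combinations(selected, 2))


# Hypothetical conflict edges between (sender, receiver, channel) nodes.
conflict_edges = {
    frozenset({("a", "s", 1), ("c", "b", 1)}),
    frozenset({("a", "s", 1), ("b", "a", 1)}),
}
print(is_independent_set([("a", "s", 1), ("c", "b", 2)], conflict_edges))
print(is_independent_set([("a", "s", 1), ("c", "b", 1)], conflict_edges))
```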

To find a suitable independent set from the extended relative collision graph for each time slot, the idea is to find an independent set in that is composed of the nodes with higher such that the nodes with higher number of unit-size packets required to be forwarded by , that is, , or higher minimum hop count from to the sink, that is, , can be selected to minimize the total required time slots, where . To select suitable nodes to form an independent set in , the precedence of nodes, which is used to decide which node has precedence to be selected into the independent set, has to be determined first. Here, for any two nodes and in , is said to have higher precedence than if ; otherwise, if is equal to , the node with higher value has higher precedence because at most (or ) units of raw data can be aggregated at (or ) without increasing (or ); otherwise, if is equal to , the node with smaller channel number has higher precedence; otherwise, if channel is equal to channel , the node with higher ID value ( or ) has higher precedence, where is a pair of and , denoted by (, ), for each ; is assumed to be an unique identification for each ; and (, ) is said to be higher than (, ) if , or ( and . The definition of the precedence of nodes is formally defined in Definition 2.

Definition 2

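Given an extended relative collision graph, a node is said to have higher precedence over another node if one of the following conditions is satisfied: (i) its weight is higher; (ii) the weights are equal and the number of extra units of raw data that can be aggregated at it is higher; (iii) the first two values are equal and its channel number is smaller; or (iv) the first two values and the channel numbers are equal and its ID is higher, where the ID of a node is the pair of the identifications of its two sensors.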
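The ordering described for Definition 2 behaves like a lexicographic comparison, which can be sketched with a tuple key. This is an illustration under assumptions: the exact formula for the node weight is not legible in this copy, so `weight` and `slack` (the extra units that can still be aggregated) are taken as given inputs, the ID pair is assumed to compare lexicographically, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A possible transmission, i.e. a node of the extended collision graph."""
    sender_id: int
    receiver_id: int
    channel: int
    weight: int   # assumed precomputed; its formula is not reproduced here
    slack: int    # extra raw-data units that fit without adding a packet


def precedence_key(c: Candidate):
    # Higher weight first, then higher slack, then smaller channel number,
    # then higher (sender, receiver) ID pair.
    return (c.weight, c.slack, -c.channel, (c.sender_id, c.receiver_id))


def has_higher_precedence(a: Candidate, b: Candidate) -> bool:
    return precedence_key(a) > precedence_key(b)


a = Candidate(sender_id=1, receiver_id=2, channel=1, weight=3, slack=1)
b = Candidate(sender_id=5, receiver_id=6, channel=2, weight=3, slack=1)
print(has_higher_precedence(a, b))  # tie on weight and slack, a has the smaller channel
```

Because sensor identifications are unique, two distinct candidates never share the same key, so the ordering is total.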

To design a distributed algorithm for the MLCAMDAS-MC problem, every sensor is assumed to have the information about the minimum hop count from to the sink, that is, , used to evaluate the value of . It can be easily achieved by flooding a message from the sink to all nodes in the networks based on the breadth-first-search mechanism [20]. In addition, every sensor in the networks is also assumed to have limited local information. By the observation in Fig. 2, if node has local information about nodes , , and , and knows that the data transmission from to with channel is a better choice than the data transmission from to with channel , that is, has a local subgraph of and knows that has higher precedence than , will let to schedule the data transmission from to with channel before ’s schedule. In Fig. 2, we have that requires at least -hop neighboring information to compare all possible data transmission that could have data collision with the data transmission from . Therefore, in this paper, we assume that every sensor in the networks maintains -hop neighboring information, that is, every sensor in the networks has a local WSN and a local data-forwarding graph . By and , a local extended relative collision graph can then be constructed. Take Fig. 5, for example. Fig. 5 shows the local information of node that is a node in shown in Fig. 1. Fig. 5 shows ’s local WSN and local data-forwarding graph . Note that node is in because is within -hop distance from . Also note that edge is not in both of and because is not within -hop distance from . Fig. 5 shows ’s local extended relative collision graph . Note that there are nodes in because the number of channels is and four edges exist in . Also note that is a subgraph of induced by the nodes in , where is shown in Fig. 4.

Fig. 5: Examples of node ’s local information, including local WSN , local data-forwarding graph , and local extended relative collision graph , where is a node in shown in Fig. 1. (a) The combination of and . (b) , where the aggregation ratio and the number of channels are assumed to be and , respectively; and the left number and the right number in parentheses close to each node represent and , respectively.

When every sensor maintains 3-hop neighboring information, the idea of the proposed distributed algorithm, termed the distributed collision-avoidance scheduling (DCAS) algorithm, is to let every sensor iteratively schedule local data transmissions for each time slot using its 3-hop neighboring information, until no more data are required to be scheduled by . Here, we use to denote the time slot waiting to be scheduled by sensor , and use to store the determined schedules. Initially, for each sensor in the network, is set to , and is set to . When some sensor makes a schedule for time slot , is incremented by , which indicates that is ready for the next time slot. In addition, when some sensor satisfies , it implies that has not yet made a decision for time slot , and thus, cannot make any schedule due to the lack of ’s information at time slot . Therefore, can make a schedule only if for all .
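The readiness rule above can be sketched as a simple predicate. This is an illustrative reading, assuming the condition is that a sensor's pending time-slot counter is less than or equal to the counter of every sensor in its local network; the names `slot`, `local_neighbors`, and `can_schedule` are hypothetical and do not appear in the paper.

```python
def can_schedule(u, slot, local_neighbors):
    """True if no sensor in u's local neighborhood lags behind u's pending slot.

    slot[x] is the time slot that sensor x is waiting to schedule (assumed
    meaning); local_neighbors[u] lists the sensors u keeps information about.
    """
    return all(slot[u] <= slot[v] for v in local_neighbors[u])


# Hypothetical counters: "x" is still deciding slot 1, so "w" must wait.
slot = {"u": 2, "v": 2, "w": 3, "x": 1}
local_neighbors = {"u": ["v", "w"], "w": ["u", "x"]}

ready_u = can_schedule("u", slot, local_neighbors)
ready_w = can_schedule("w", slot, local_neighbors)
```

Under this reading, a sensor never decides a slot for which a nearby sensor's decision is still missing, which is what keeps the local views consistent.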

When for all , sensor is checked by Procedure SCHEDULE to see whether it is allowed to make a schedule. In Procedure SCHEDULE, the idea is that sensor can make a schedule of transmitting data from to its neighboring sensor with channel if the data transmission from to with channel , represented by , has higher precedence than all other possible data transmissions in according to Definition 2. When can make a schedule of transmitting data from to with channel , the schedule is inserted into , and a MSGDECISION message with the scheduling information is locally broadcast to all sensors in . After this, is incremented by . When another sensor receives the MSGDECISION message, it updates the related local information. In addition, if receives the MSGDECISION message, the scheduling information is inserted into , and the MSGDECISION message is locally broadcast to all sensors in .

To avoid data collision, the data transmissions scheduled for the same time slot have to be collision-free, that is, the nodes selected for the same time slot have to form an independent set in . To this end, for each sensor in the network, all nodes in are marked as white when changes to a new time slot. In addition, when any node in is selected by node for scheduling time slot , and its neighboring nodes in are marked as black in the local information of other sensors with . Once and its neighboring nodes in are marked as black for time slot , any node selected from the remaining white nodes in will be independent of in for time slot . When a sensor knows that all nodes in are black, no data transmission can be scheduled by for the current time slot. Then, locally broadcasts a MSGSKIP message to all sensors in announcing that skips the current time slot, and is incremented by .
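The white/black marking can be illustrated with a small centralized sketch for a single time slot. It is a simplification, not the DCAS message exchange: the collision graph is given as a hypothetical adjacency dictionary `conflicts` over candidate transmissions, and precedence is stood in for by a sortable rank (smaller means higher precedence), since Definition 2 is not reproduced here.

```python
def schedule_one_slot(conflicts, precedence):
    """Pick transmissions for one slot by repeated highest-precedence selection.

    All candidates start white; each selected candidate and its neighbors in
    the collision graph turn black and can no longer be selected.
    """
    white = set(conflicts)
    chosen = []
    while white:
        best = min(white, key=lambda n: precedence[n])   # highest precedence
        chosen.append(best)
        white -= {best} | set(conflicts[best])           # mark black
    return chosen


def is_independent(conflicts, nodes):
    """True if no two nodes in `nodes` are adjacent in the collision graph."""
    picked = set(nodes)
    return all(picked.isdisjoint(conflicts[n]) for n in picked)


# Hypothetical path-shaped collision graph over four candidate transmissions.
conflicts = {
    "t1": ["t2"],
    "t2": ["t1", "t3"],
    "t3": ["t2", "t4"],
    "t4": ["t3"],
}
precedence = {"t1": 0, "t2": 1, "t3": 2, "t4": 3}

slot_schedule = schedule_one_slot(conflicts, precedence)
```

Because every selection removes the chosen node together with all of its neighbors from the white set, the transmissions picked for the slot always form an independent set, which is the collision-free property required above.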

After scheduling a number of time slots, when a sensor has and satisfies for each of its neighboring sensors , no data will be scheduled to and forwarded by , and thus, finishes its scheduling process. In addition, when a sensor has , and each of its neighboring sensors having has finished its scheduling process, also finishes its scheduling process. When a sensor finishes its scheduling process, a MSGFINISHED message with is locally broadcast to all sensors in . When the sink knows that all its neighboring sensors have finished their scheduling processes, the data scheduling for the network is completed. The details of the DCAS algorithm are shown in Algorithm 1.

1:procedure SCHEDULE()
2:     Let be the set of white nodes in
3:     Let be the node having highest precedence in
4:     if  then
5:          , where denotes a minimum function and produces the minimum value of and
6:          , where (, , )
7:          ;
8:         update the values of and for the nodes in
9:         locally broadcast a MSGDECISION message with to all sensors in
10:         all nodes in become white
11:          ;
12:     else if  then
13:         locally broadcast a MSGSKIP message with and to all sensors in
14:         all nodes in become white
15:         
16:     end if
17:end procedure
1:; ;
2:Construct , , and
3:all nodes in are initialized to be white
4: is set to for all
5:while  do
6:     if a MSGDECISION message with (, , ) is received by for the first time and  then
7:          ;
8:         update the values of and for the nodes in
9:         if   then
10:              all nodes in become white
11:              
12:              locally broadcast a MSGDECISION message with to all sensors in
13:         else if   then
14:               and its neighboring nodes in become black
15:         end if
16:          ;
17:     end if
18:     if a MSGSKIP message with and is received by for the first time and  then
19:         if   then
20:              nodes become black for all with or
21:         end if
22:         
23:     end if
24:     if a MSGFINISHED message with is received by for the first time then
25:          is updated as a subgraph of induced by
26:          is updated as a subgraph of induced by , where denotes the set of nodes with or
27:     end if
28:     if  and for all  then
29:         
30:         locally broadcast a MSGFINISHED message with to all sensors in
31:     else if  for all  then
32:         SCHEDULE(u)
33:     end if
34:end while
35:return
Algorithm 1 DCAS()

Take Fig. 1, for example. We assume that the aggregation ratio = and the total number of channels . When each sensor in the network executes the DCAS algorithm, and are set to and , respectively. In addition, all nodes in are marked as white. For sensor in the network, because that is less than or equal to for all , that is, , can execute Procedure SCHEDULE. When executes Procedure SCHEDULE, , , , , , , , because all nodes in of Fig. 5 are white. By Definition 2, we have that is the node having highest precedence in . We also have that will locally broadcast a MSGDECISION message with (, , ) to all sensors in of Fig. 5, where . In addition, , , , and are updated to , , , and , respectively. In the same way, sensor will locally broadcast a MSGDECISION message with (