Secure IoT Routing: Selective Forwarding Attacks and Trust-based Defenses in RPL Network

by   Jun Jiang, et al.

The IPv6 Routing Protocol for Low Power and Lossy Networks (RPL) is an essential routing protocol that enables communication in IoT networks with low-power devices. RPL uses an objective function and routing constraints to find an optimized routing path for each node in the network. However, recent research has shown that topological attacks, such as selective forwarding attacks, pose great challenges to the secure routing of IoT networks. Many conventional secure routing solutions, on the other hand, are too computationally heavy to be applied directly in resource-constrained IoT networks. There is an urgent need to develop lightweight secure routing solutions for IoT networks. In this paper, we first design and implement a series of advanced selective forwarding attacks from the attack perspective, which can flexibly select the type and percentage of forwarded packets in an energy-efficient way, and even bad-mouth innocent nodes in the network. Experiment results show that the proposed attacks can maximize the attack consequences (i.e. number of dropped packets) while remaining undetected. Moreover, we propose a lightweight trust-based defense solution to detect and eliminate malicious selective forwarding nodes from the network. The results show that the proposed defense solution can achieve high detection accuracy with very limited extra energy usage (i.e. 3.4


I Introduction

Internet of Things (IoT) devices are being adopted rapidly around the world [rose2015internet], and many of them are resource-constrained [pereira2020challenges]. Since the existing Internet Protocols (IP) are too complex to be implemented directly on resource-constrained IoT devices [marques2017energy], the Internet Engineering Task Force (IETF) has designed a lightweight IPv6 protocol stack with a series of core protocols to ensure efficient and secure communications, such as IPv6 over Low-power Wireless Personal Area Networks (6LoWPAN) [le20126lowpan] and the Routing Protocol for Low Power and Lossy Networks (RPL) [winter2012rpl]. In particular, RPL, as the core routing protocol for resource-constrained IoT networks, has been adopted by a variety of applications, such as healthcare [gara2015rpl], smart grid [nassar2018multiple], and smart city [junior2020dynasti].

Due to its popularity, RPL has become an attractive attack target [pongle2015survey]. One attack that can cause massive damage to an RPL network is the selective forwarding attack [wallgren2013routing], where attackers interrupt network data flows by selectively dropping network packets. Compared to blackhole attacks, which simply drop all packets, selective forwarding attacks are more deceptive and can remain undetected for a longer time, causing long-term damage to the network. Despite their impact, existing selective forwarding attacks still lack flexibility in dynamically identifying victim nodes and adjusting packet forward rates according to the state of the network.

On the defense side, there are three major types of defense mechanisms against selective forwarding attacks on RPL. The first type builds a multi-path routing network to ensure the integrity of information transmission [ma2017security]. These mechanisms often require excessive resources to maintain the backup paths for nodes. The second type is distributed defense mechanisms, which deploy a monitor module on individual nodes [airehrour2017trust]. However, these defense mechanisms often lead to significant extra energy consumption at the monitoring nodes. In addition, it is challenging to ensure that all the monitoring nodes remain reliable and honest over the long term in reporting their neighbors’ behaviors. The third type is centralized defense mechanisms [raza2013svelte], where a central node is employed to monitor and analyze malicious behaviors in the network. Nevertheless, the central node has to be deployed at a core location in the network to ensure coverage across the entire network, and may be easily misled by complex bad-mouthing attacks.

In this work, we aim to advance current studies from both the attack and defense perspectives. Specifically, from the attack perspective, we propose an advanced selective forwarding attack model, which can dynamically launch three types of malicious behaviors: (1) flexibly dropping packets of selected protocol types, (2) adjusting the packet forward rate based on the average network packet forward rate to stay stealthy, and (3) dynamically selecting specific children nodes for bad-mouthing attacks. Furthermore, these attack behaviors can be combined to significantly increase the damage to the network and reduce the risk of being detected by state-of-the-art defense mechanisms.

From the defense perspective, we propose a novel centralized trust-based defense mechanism to combat selective forwarding attacks in RPL networks. Compared to distributed defense mechanisms, the proposed scheme can significantly reduce the energy consumed by anomaly detection by deploying the defense computation only at the root node. Unlike other centralized defense schemes, the proposed scheme takes advantage of RPL’s tree-based network topology to evaluate the trustworthiness of nodes and avoids the introduction of a trusted third-party node. Furthermore, by carefully designing the detection/notification module, the proposed scheme can effectively balance the trade-off between detection delay and energy consumption.

The main contributions of this paper are as follows.

  • This work proposes an advanced selective forwarding attack with three types of behaviors against RPL networks. Malicious nodes in the proposed attack model can not only flexibly choose the type of packets to drop, but also control the packet forward rates dynamically. As a result, these attackers are able to launch stealthier attacks to avoid being detected. Furthermore, malicious nodes can also bad-mouth other normal nodes to cause false alarms in the system. The experiment results show that the proposed attack can effectively evade both the RPL self-defense mechanism and state-of-the-art defense mechanisms.

  • This work proposes a novel centralized trust-based defense mechanism. In particular, the proposed trust model integrates a self-trust value, which reflects a node’s trustworthiness in its packet forwarding behavior, and a tree-based descendant trust value, which takes advantage of the RPL network topology to prevent bad-mouthing attacks. Furthermore, the proposed defense mechanism is deployed on the root node, which can effectively reduce the total energy consumption caused by distributed anomaly monitoring schemes, and eliminate the security risks caused by the introduction of third-party devices. Experiment results show that the proposed scheme achieves high detection accuracy and low energy consumption.

  • This work proposes a novel anomaly report mechanism. Instead of UDP packets, an ICMPv6 control message is used to send information about malicious nodes. The reporting mechanism only starts when the root node detects malicious behaviors in the network. This reporting mechanism can not only ensure that nodes in the network are notified promptly, but also avoid causing broadcast storms.

The rest of this paper is organized as follows. Section II discusses existing selective forwarding attacks and defense mechanisms in RPL networks. Section III introduces preliminaries of the RPL protocol. Sections IV and V discuss the proposed selective forwarding attack and the defense mechanism in detail. The results of the experiment are given in Section VI, followed by a conclusion in Section VII.

II Related Work

II-A Selective Forwarding Attacks in RPL Networks

RPL networks face a variety of security threats, which are mainly divided into three categories [mayzaud2016taxonomy, almusaylim2020proposing]. The first type is resource attacks, such as flooding attacks [le2013impacts] and increased rank attacks [xie2010routing]. In these attacks, the attacker aims to exhaust the victim node’s energy and reduce its lifetime by misleading it into executing a large number of unnecessary instructions. The second type is traffic attacks, such as sniffing attacks [mayzaud2016taxonomy] and identity attacks [wallgren2013routing], where the attacker’s main goal is to eavesdrop on or manipulate the network’s traffic. The third type is network topology attacks, where the attacker undermines the security and stability of the network by changing its topology, such as sinkhole attacks [wallgren2013routing], blackhole attacks [raza2013svelte] and selective forwarding attacks [gara2017intrusion].

The selective forwarding attack is one of the most devastating types of attack [hu2014detection, mathur2016defence, ren2016adaptive] and can cause severe damage to the network. However, in many of these attacks, the attackers drop a fixed number of packets or blindly attack all data packets, which increases the risk of being detected by defense mechanisms. Some more advanced attacks [wallgren2013routing, airehrour2017trust] interrupt the victim node’s communications by forwarding only RPL control messages while dropping all [wallgren2013routing] or part of [airehrour2017trust] the data packets. Different from these existing attacks, the selective forwarding attacks proposed in this study can perform more flexible malicious behaviors and even bad-mouth innocent nodes to mislead state-of-the-art defense schemes.

II-B RPL Network Defenses

The original design of the RPL protocol has some basic security schemes, such as the local and global repair mechanisms [winter2012rpl], which can be triggered by changes of network topology, e.g. a link failure. However, these basic repair mechanisms are far from adequate to resist rapidly evolving security attacks [airehrour2016securing].

Beyond the basic repair mechanisms, there are mainly three categories of defenses. The first type establishes multiple routing paths for each node to avoid selective forwarding attacks. In [ma2017security], the authors propose a secure routing protocol, M-RPL, which establishes a hierarchical cluster network and backup paths for different clusters in the route discovery phase. In [lodhi2015multiple], the authors establish a temporary backup path for a node based on its packet delivery ratio. In [jenschke2018multi], the authors propose to use the principle of Packet Replication and Elimination (PRE), with IEEE 802.15.4 Time-Slotted Channel Hopping (TSCH) as media access, to create parallel paths from nodes to the root node. Although the multi-path mechanism can effectively resist attacks, the redundant network routes and the extra resources needed to maintain the backup paths often cause significant increases in nodes’ energy consumption.

The second type of defense is distributed defense mechanisms that build a monitor module on each network node. Because it is easy to implement, this type of defense is the most popular one. In [airehrour2017trust, pu2018mitigating, khan2017trust], the authors propose various distributed trust-based mechanisms, where each individual node monitors its neighbors’ incoming and forwarded traffic and calculates their trust values. In [medjek2017trust, djedjig2020trust], a distributed, collaborative and layered trust-based IDS (T-IDS) and a Metric-based RPL Trustworthiness Scheme (MRTS) are proposed respectively, where each node monitors and cooperates with its neighbors to detect and report intrusions. These distributed mechanisms, however, have to be deployed on each IoT node, leading to significant extra energy expenditure. They also raise a new challenge: how can resource-constrained nodes provide reliable and honest reports on their neighbors’ behaviors over the long term?

The third type is centralized defense mechanisms. In [raza2013svelte], the authors propose the SVELTE intrusion detection system (IDS), where a 6LoWPAN Mapper is placed on the IPv6 Border Router (6BR) to monitor and analyze the malicious behaviors in the network. In [wallgren2013routing], the authors propose a lightweight heartbeat protocol, in which the root node detects the malicious node by periodically exchanging an echo signal with all its children nodes. In [ul2021ctrust], the authors introduce a control layer to realize a hierarchical trust-based mechanism, “CTrust-RPL”, that monitors the nodes’ behaviors. These centralized detection mechanisms are often energy-efficient since only a limited number of nodes are involved in behavior monitoring. However, since the root node cannot directly monitor each node in the network, it can be easily misled by some malicious attacks, such as bad-mouthing attacks [gautam2020efficient], where malicious nodes blame their parent/children nodes for packet dropping. As a result, it is challenging for such defense schemes to be robust against complex bad-mouthing strategies.

In our prior conference paper [jiang2018root], we studied blackhole attacks and proposed a centralized defense mechanism at the root node. On this basis, we further study selective forwarding attacks and a corresponding defense scheme in this work. Unlike the blackhole attack model, which blindly drops all packets, the attack model in this study launches attacks with three malicious behaviors: (1) flexibly selecting the packet protocol, (2) dynamically adjusting the packet forward rate, and (3) bad-mouthing the victim children nodes. As a result, it can effectively hide the attack behaviors and cause long-term network damage. In addition, we propose a new lightweight centralized trust-based defense scheme to defend against selective forwarding attacks. Compared to our prior work, where the root node simply uses the average packet forward rate as the trust value for each node, the defense scheme proposed in this work is more comprehensive by (1) integrating a self-trust value and a tree-based descendant trust value, (2) introducing a beta-based trust framework with a discounting factor to gradually reduce the impact of previous behaviors, and (3) assigning asymmetric discounting factors to good and bad behaviors, so that a node’s bad behavior is remembered for a longer time. Furthermore, more comprehensive experiments are performed to evaluate the performance of the proposed attacks and defenses from different aspects.

III Preliminaries: RPL Protocol

As the attacks and defenses proposed in this work are based on the RPL protocol, in this section we briefly introduce some basics of RPL. The Routing Protocol for Low Power and Lossy Networks (RPL) is defined by the IETF’s Routing Over Low power and Lossy Networks (ROLL) Working Group. In particular, each RPL network may contain multiple RPL instances. Each RPL instance may contain multiple Destination Oriented Directed Acyclic Graphs (DODAGs). In a DODAG, the root node, which is usually the most powerful node, is responsible for storing and managing the routing paths. Non-root nodes can be added to one or more DODAGs.

RPL is a hierarchical routing protocol that relies on a DAG structure to exchange data among network nodes. Consequently, the parent-child relationship is essential for routing in an RPL network. Each DODAG has a specific objective function, which defines how each node selects its parent node. Based on the objective function, an optimal path from any leaf node to the root node can be constructed.

III-A Control Messages in RPL Network

There are four main types of control messages [winter2012rpl] used to establish the DODAG: DODAG Information Object (DIO), DODAG Information Solicitation (DIS), Destination Advertisement Object (DAO) and Destination Advertisement Object Acknowledgement (DAO-ACK). In particular, DIO is the most frequently used message. It originates at the root node and is propagated to all other nodes in the network to advertise network structures for DODAG discovery, assembly and maintenance. Therefore, to rapidly report anomaly detection results while avoiding the extra overhead of dedicated anomaly reporting messages, we propose to insert detection results into the DIO message and distribute them to all nodes in the network. The format of a DIO message is shown in Figure 1, which includes RPL InstanceID, Rank value, DODAG ID, Destination Advertisement Trigger Sequence Number (DTSN), etc. Non-root nodes must advertise the values in the DIO message unchanged, except for the Rank and DTSN fields, which they update.

Fig. 1: Format of DIO message

III-B Default Security Mechanisms in RPL Networks

RPL networks can adopt standard mechanisms to ensure message integrity and confidentiality at different layers of the protocol stack. For example, the standardized IEEE 802.15.4 security, lightweight 6LoWPAN-compressed IPsec [raza2011securing], and Datagram TLS (DTLS) are adopted to ensure security at the data link layer, IP layer and transport layer, respectively.

In addition, RPL adopts some simple repair mechanisms to recover from three kinds of network failure: routing topology failure, link failure and node failure. When a small number of failures happen, the local repair mechanism starts. The local repair mechanism allows the nodes that are affected by the failures to detach from the original DODAG, set their rank values to infinity, and then re-join the DODAG. After multiple local repairs have been initiated, the RPL protocol performs a global repair to rebuild the entire DODAG by increasing the DODAG version number. Please note that RPL uses a trickle timer to handle inconsistencies in the DODAG. While the RPL network is stable, the interval of the trickle timer increases exponentially. When a network inconsistency is detected, such as the formation of a loop, the trickle timer is reset.
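The trickle timer behaviour described above can be sketched as follows. This is a simplified illustration rather than a full Trickle implementation: it tracks only the interval length and omits the randomized transmission point and the redundancy counter. The class and parameter names are hypothetical.

```python
class TrickleInterval:
    """Tracks only the interval length of a trickle timer."""

    def __init__(self, i_min: float, doublings: int) -> None:
        self.i_min = i_min
        self.i_max = i_min * (2 ** doublings)  # upper bound on the interval
        self.interval = i_min

    def on_interval_expired(self) -> None:
        # Stable network: double the interval until the upper bound is reached.
        self.interval = min(self.interval * 2, self.i_max)

    def on_inconsistency(self) -> None:
        # Inconsistency (e.g. a detected loop): fall back to the minimum interval.
        self.interval = self.i_min
```

Because control traffic thins out exponentially in a stable network, an attack that never triggers an inconsistency also never causes the timer to reset.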

As briefly discussed in Section II-B, these basic defense mechanisms are far from adequate when advanced attacks are launched.

IV Proposed Selective Forwarding Attack

Among the diverse attacks against RPL networks, some aim to cause as much damage as possible to the network within a short time period, such as blackhole attacks [jiang2018root]. These attacks are often easily detected and isolated by the defense mechanisms due to the aggressiveness of the malicious nodes’ behaviors. Therefore, selective forwarding attacks, which can interrupt network communications in a flexible and stealthy way, are often launched to cause long-term network damage. In this study, we propose an advanced selective forwarding attack model with three different types of selective behaviors.

IV-1 Protocol-based Attack

We propose to selectively drop network packets according to their protocol types. Specifically, we propose to drop only data packets (i.e. non-ICMP packets) to achieve attack stealthiness. This is because in RPL networks, the loss of IPv6-based control messages (i.e. ICMP packets) will cause inconsistencies in the network routing topology and trigger the RPL repair mechanisms. However, since data messages are transmitted over UDP, the loss of such messages is difficult for RPL’s self-defense mechanisms to detect. In such cases, RPL’s self-recovery mechanisms will not be triggered [wallgren2013routing].

As shown in Algorithm 1, before the malicious node forwards a packet, it first determines whether the packet is a data packet. If so, this packet can be dropped. Please note that, to enable selective behaviors based on protocol types, the malicious node only needs to check the header of each message, which does not incur significant processing overhead.

Input: pkt (the struct of the IPv6 packet that needs to be forwarded)
if pkt is a data packet (i.e. a non-ICMP packet) then
     Drop pkt;
end if
Algorithm 1 Protocol-based Attack
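The header check in Algorithm 1 can be sketched in a few lines. This is a minimal illustration, assuming the decision is made on the IPv6 Next Header value alone (17 for UDP and 58 for ICMPv6, as assigned by IANA) and that no extension headers precede the upper-layer header. The function names are hypothetical.

```python
# IANA-assigned IPv6 Next Header values.
NEXT_HEADER_UDP = 17
NEXT_HEADER_ICMPV6 = 58


def is_data_packet(next_header: int) -> bool:
    """Treat every non-ICMPv6 packet as a data packet (assumption of this sketch)."""
    return next_header != NEXT_HEADER_ICMPV6


def should_drop_protocol_based(next_header: int) -> bool:
    """Drop data packets; always forward RPL control traffic carried in ICMPv6."""
    return is_data_packet(next_header)
```

Since only one header byte is inspected, the per-packet cost of the selection is negligible, which matches the low-overhead argument above.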

IV-2 Packet Forward Rate-based Attack

We assume that the data messages transmitted between nodes are encrypted, so that the attacker can control whether the malicious nodes discard a data packet but cannot change its content. The malicious nodes can dynamically adjust their packet forward rates (PFR) according to the network conditions. It is not easy to determine an appropriate PFR that causes non-trivial damage to the network while avoiding detection. In this study, we propose to achieve this goal by controlling the PFR to be slightly above the average PFR of the network. Specifically, the malicious node estimates the average PFR by monitoring all its neighbors’ incoming and outgoing packets, and ensures that its PFR is slightly above the average PFR of the network (i.e. by a small value $\delta$). Please note that this PFR will be dynamically updated based on the changes of the average network PFR. Consequently, the malicious node can hide itself while still causing long-term damage to the network. The equation to calculate a node $i$’s PFR is shown below

$$PFR_i(t) = \frac{F_i(t)}{R_i(t)} \qquad (1)$$

where $R_i(t)$ and $F_i(t)$ represent the number of packets received and forwarded by node $i$ within time duration $t$, respectively. In the proposed attack, a malicious node estimates the network average PFR (i.e. $\overline{PFR}(t)$) based on its neighbors’ PFR, as shown below

$$\overline{PFR}(t) = \frac{1}{n}\sum_{j=1}^{n} PFR_j(t) \qquad (2)$$

where $n$ represents the number of neighbors of the malicious node.

With a larger value of $n$ and a longer time duration $t$, the malicious node can achieve a more accurate estimation of $\overline{PFR}(t)$, which, however, will also cause extra energy consumption and time delay. The proposed attack can flexibly adjust the trade-off according to specific attack scenarios. As shown in Algorithm 2, before forwarding a packet, the malicious node determines whether to drop the packet by comparing its current PFR with the observed network average PFR plus $\delta$. The value of $\delta$ can be adjusted according to the aggressiveness of the attack.

Input: pkt (the struct of the IPv6 packet that needs to be forwarded), $PFR_m(t)$ (PFR of the malicious node $m$ within duration $t$), $\overline{PFR}(t)$ (estimated network average PFR within duration $t$)
if $PFR_m(t) > \overline{PFR}(t) + \delta$ then
      Drop this packet;
end if
Algorithm 2 Packet Forward Rate-based Attack
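The rate-based decision can be sketched as below, assuming a node's PFR is the ratio of forwarded to received packets in the observation window and the network average is the mean over the observed neighbors. The `Counters` type, the function names, the handling of empty windows and the sample margin are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Counters:
    received: int   # packets received for forwarding within the window
    forwarded: int  # packets actually forwarded within the window

    def pfr(self) -> float:
        # A node that received nothing has dropped nothing; report 1.0 (assumption).
        return self.forwarded / self.received if self.received else 1.0


def average_pfr(neighbors: Sequence[Counters]) -> float:
    """Mean PFR over the observed neighbors (must be non-empty)."""
    return sum(c.pfr() for c in neighbors) / len(neighbors)


def should_drop_rate_based(own: Counters, neighbors: Sequence[Counters],
                           delta: float) -> bool:
    """Drop only while the node's own PFR stays above the average plus the margin."""
    if not neighbors:
        return False  # no estimate available, so stay conservative (assumption)
    return own.pfr() > average_pfr(neighbors) + delta
```

For example, with neighbors at PFR 0.90 and 0.80 and a margin of 0.05, a node at 0.95 may still drop packets, while a node at 0.88 must forward to stay above the crowd.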

IV-3 Bad-mouthing Attack

The proposed attack can arbitrarily choose one or multiple children nodes as targets of a bad-mouthing attack. In a bad-mouthing attack, the malicious node frames the victim node (i.e. one of its children nodes) by discarding data packets that originate from the victim.

For this attack, the most challenging part is choosing the victim nodes and the attack strategy. Since blindly selecting children nodes to attack or attacking all children nodes increases the risk of the attacker being exposed to the detection mechanism, we propose to select only specific children nodes as victims. More importantly, the attacker can flexibly choose victim nodes that are either located at critical network positions or require minimum attack effort. For example, as shown in Algorithm 3, the malicious node identifies its children nodes with lower PFR as the victims, since bad-mouthing these victims requires dropping fewer packets (i.e. less attack effort).

After identifying the ideal victim node, the malicious node can dynamically discard the victim’s packets, misleading the root node into identifying the victim as a malicious node that drops packets. If multiple malicious nodes launch coordinated attacks at the same time, the false alarm rate will increase significantly for most trust-based defense solutions.

Please note that although we discuss these three attack behaviors independently for the sake of clarity, they can be flexibly combined to cause more damage.

Input: pkt (the struct of the IPv6 packet that needs to be forwarded), m (the struct of the malicious node), $\overline{PFR}$ (estimated average PFR of the neighbors), C (list of children nodes), k (number of victim children nodes)
sort C in ascending order of PFR;
for i = 1 to k do
      if pkt is a data packet and pkt is sent by C[i] and m.PFR > $\overline{PFR}$ then
           Drop packet;
            break;
      end if
end for
Algorithm 3 Bad-mouthing
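The victim selection step can be sketched as follows. The sketch assumes the malicious node keeps a per-child PFR estimate and picks the `k` children with the lowest values. The additional guard comparing the attacker's own PFR with the neighborhood average is an assumption made here to keep the attacker stealthy, not a detail confirmed by the text. All names are hypothetical.

```python
from typing import Dict, List


def select_victims(children_pfr: Dict[str, float], k: int) -> List[str]:
    """Return the k children with the lowest observed PFR (least attack effort)."""
    return sorted(children_pfr, key=children_pfr.get)[:k]


def should_drop_badmouth(src: str, victims: List[str],
                         own_pfr: float, avg_pfr: float) -> bool:
    """Drop a data packet only if it comes from a chosen victim and the attacker
    can afford another drop without falling below the average PFR."""
    return src in victims and own_pfr > avg_pfr
```

Choosing children that already lose packets means fewer additional drops are needed before their trust value falls below the detection threshold.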

V Lightweight Trust-based Defense Scheme

In this section, we propose a lightweight trust-based defense scheme against selective forwarding attacks, which is deployed on the root node, as shown in Figure 2. The input of the defense scheme is the data packets received from non-root nodes. The defense scheme includes three major modules. The detection module analyzes the trust value of each node based on the received data packets, whose propagation path is shown by the solid black line in Figure 2. After malicious nodes (e.g. node 4 in Figure 2) are identified, the notification module encapsulates this information in DIO packets and notifies all the nodes in the network, as indicated by the brown dashed line in Figure 2. In the isolation module, children nodes of the identified malicious nodes (e.g. node 7 in Figure 2) isolate the malicious nodes and re-select their parent nodes based on the received DIO messages. For example, the changed propagation path of data packets from node 7 is shown as the blue dashed line in Figure 2.

In the rest of this section, we first introduce the design of the trust model, which is the core of the proposed defense solution. Then, we discuss each module of the proposed scheme in detail.

V-A Trust Evaluation Model

An advanced trust model is designed to evaluate the anomaly of each node’s behavior. In an RPL network, the root node tracks the behaviors of each individual node $i$ and dynamically calculates its trust value, which is denoted as $T_i$. The trust value falls in the range from zero to one. When a node’s trust value is below a trust threshold $\theta$, it will be identified as a malicious node.

In the proposed trust model, the overall trust value of a node $i$ is composed of two parts: a self-trust value $T^{s}_i$, which captures failures of packet forwarding, and a tree-based descendant trust value $T^{d}_i$, which captures bad-mouthing attacks.

Fig. 2: Packet Propagation in the Proposed System

V-A1 Self-trust Value

The proposed scheme adopts the Beta trust model [josang2002beta] as its basis. The beta distribution is a family of continuous probability distributions that is often used to model binary events. In our case, we consider whether a node can successfully forward a packet or not as a random binary event. Then, based on the prior observations of the number of successful and failed packet forwarding events, the probability that this node successfully forwards the next packet can be estimated as the expected value of the beta distribution, as shown in equation (3), where $s_i$ and $f_i$ represent the number of packets successfully sent or lost by node $i$, respectively.

$$T^{s}_i = E\left[\mathrm{Beta}(s_i + 1,\, f_i + 1)\right] = \frac{s_i + 1}{s_i + f_i + 2} \qquad (3)$$

From equation (3), we can observe that a node’s self-trust value increases when more packets are successfully forwarded, and drops when more packets are lost.

However, the basic beta-based trust model cannot capture alternating-behavior attacks [labraoui2015off], where malicious nodes alternately perform good behaviors to accumulate high trust values and bad behaviors to interrupt network traffic. To prevent such attacks, we propose to introduce temporal information to discount a node’s packet forwarding behaviors performed a long time ago. Specifically, we introduce a discounting factor to gradually forget a node’s behavior over time, so that a behavior with a smaller discounting factor has a lower influence on the node’s trust value. The calculation of the discounting factor $d_j$ for the $j$-th behavior of a node is shown below.

$$d_j = \left(\frac{n - \Delta_j}{n}\right)^{\lambda} \qquad (4)$$

where $n$ represents the total number of behaviors performed by the node so far, including both successful and failed forwarding behaviors; and $\Delta_j$ represents the total number of behaviors performed after behavior $j$. In other words, each time a node performs a new behavior, all previous behaviors will have their $\Delta_j$ value increased by 1, resulting in a smaller discounting factor. Please note that the latest behavior will always have $\Delta_j = 0$, leading to a discounting factor of 1. In addition, the forgetting speed $\lambda$ is a constant value. A larger $\lambda$ value will result in a smaller discounting factor and thus a higher forgetting speed.

More importantly, to punish bad behaviors further, we propose an asymmetric trust model in which past good behaviors are forgotten quickly while past bad behaviors are remembered for a longer time. To achieve this goal, two different $\lambda$ values (i.e. $\lambda_g$ and $\lambda_b$) are adopted to discount past good and bad behaviors respectively, where $\lambda_g > \lambda_b$. Therefore, the self-trust value of node $i$ with $n$ behaviors (i.e. $T^{s}_i(n)$) is calculated as follows.

$$T^{s}_i(n) = \frac{\sum_{j=1}^{n} d^{g}_j\, b_j + 1}{\sum_{j=1}^{n} d^{g}_j\, b_j + \sum_{j=1}^{n} d^{b}_j\, (1 - b_j) + 2} \qquad (5)$$

Here $d^{g}_j$ and $d^{b}_j$ are the discounting factors of equation (4) computed with $\lambda_g$ and $\lambda_b$, respectively. In equation (5), the value $b_j$ is 1 if behavior $j$ is good, or 0 if behavior $j$ is bad.
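To make the asymmetric forgetting concrete, the sketch below computes a discounted beta-style self-trust value from a chronological list of forwarding outcomes. The power-law discount used here is only one form consistent with the properties stated above (it equals 1 for the latest behavior and shrinks as more behaviors follow or as the forgetting speed grows), so it should be read as an assumption. The default forgetting speeds are arbitrary illustrative values.

```python
from typing import Sequence


def discount(delta: int, n: int, lam: float) -> float:
    """Assumed discount form: ((n - delta) / n) ** lam, with delta behaviors
    performed after the one being discounted, out of n in total."""
    return ((n - delta) / n) ** lam


def self_trust(behaviors: Sequence[bool],
               lam_good: float = 2.0, lam_bad: float = 0.5) -> float:
    """behaviors: chronological outcomes, True = packet forwarded, False = lost.
    lam_good > lam_bad makes good behavior fade faster than bad behavior."""
    n = len(behaviors)
    good = bad = 0.0
    for j, ok in enumerate(behaviors):
        delta = n - 1 - j  # number of behaviors performed after behavior j
        if ok:
            good += discount(delta, n, lam_good)
        else:
            bad += discount(delta, n, lam_bad)
    # Expected value of a beta distribution over the discounted counts.
    return (good + 1.0) / (good + bad + 2.0)
```

With both forgetting speeds set to zero the discounts are all 1 and the function reduces to the plain beta expectation, e.g. two successes and one loss give 3/5. With the asymmetric defaults, a node that recently turned bad loses trust much faster than a node that recently turned good regains it.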

V-A2 Tree-based Descendant Trust Value

Due to the restricted hierarchical structure of RPL networks, only parent nodes forward data packets for their children nodes. It is therefore very easy for a malicious parent node to control the packet transmissions of one of its children nodes and launch bad-mouthing attacks against this child. If only the self-trust value is considered, the victim child node’s trust will drop while the malicious parent’s trust value remains the same. To defeat such attacks, we propose to also introduce a tree-based descendant trust value for each node, which considers the trust values of its direct descendants. The descendant trust value is defined as follows.

$$T^{d}_i = \frac{1}{N_i}\sum_{j \in C_i} w_j\, T^{s}_j \qquad (6)$$

where $T^{s}_j$ is the self-trust value of node $i$’s child node $j$, and $C_i$ is the set of children of node $i$. Parameter $w_j$ represents the weight of the child node $j$, which is determined by the number of data packets received by $j$ per time period. The greater the number of data packets received by $j$, the larger the weight assigned to node $j$. In addition, $N_i$ denotes the total number of packets received by node $i$.

Please note that in the proposed scheme, a node’s descendant trust value only depends on its children’s self-trust values. Since RPL is a tree-like network, if a node’s descendant trust value also considered its children’s descendant trust values, it would lead to recursive counting, where leaf nodes’ trust values are over-emphasized. Since the descendant trust value is mainly designed to prevent bad-mouthing attacks, which can only effectively target children nodes, we propose not to count it recursively.

When a malicious node bad-mouths any of its children nodes, its own descendant trust value decreases. Moreover, the descendant trust value decreases significantly as the number of victim children nodes increases.
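A packet-weighted descendant trust value can be sketched as below. As a simplifying assumption, each child's weight is its packet count normalized by the total over all children, so the result stays in the range from zero to one. The neutral value returned for nodes without traffic from children is also an assumption, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def descendant_trust(children: Sequence[Tuple[float, int]]) -> float:
    """children: (self_trust, packet_count) pairs for a node's direct children.
    Only direct children are used, so no trust value is counted recursively."""
    total = sum(packets for _, packets in children)
    if total == 0:
        return 1.0  # no evidence from children; neutral default (assumption)
    return sum(trust * packets for trust, packets in children) / total
```

For a parent with two equally active children whose self-trust values are 0.9 and 0.3, the descendant trust is 0.6. Bad-mouthing the second child therefore also pulls down the parent's own score.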

V-A3 Aggregated Trust Value

Finally, the aggregated trust value $T_i$ of node $i$ is the combination of the self-trust value $T^{s}_i$ and the descendant trust value $T^{d}_i$. The calculation of the aggregated trust value is shown below.

$$T_i = \omega_s\, T^{s}_i + \omega_d\, T^{d}_i \qquad (7)$$

where $\omega_s$ and $\omega_d$ are the weights for the self-trust value and the descendant trust value. This aggregated trust value serves as the major criterion for identifying suspicious nodes in the network.
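The aggregation and the threshold test can be sketched as a weighted sum. The equal weights, the assumption that they sum to one, and the threshold of 0.5 are illustrative choices rather than values taken from this work.

```python
def aggregated_trust(self_t: float, desc_t: float,
                     w_self: float = 0.5, w_desc: float = 0.5) -> float:
    """Weighted combination of self-trust and descendant trust.
    The weights are assumed to sum to one so the result stays in [0, 1]."""
    return w_self * self_t + w_desc * desc_t


def is_suspicious(trust: float, threshold: float = 0.5) -> bool:
    """A node whose aggregated trust falls below the threshold is flagged."""
    return trust < threshold
```

A bad-mouthing parent that forwards its own traffic faithfully (self-trust 0.9) but whose children appear to lose most packets (descendant trust 0.05) ends up with an aggregated value below 0.5 under these sample settings, even though self-trust alone would not flag it.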

V-B Detection Module

In this section, we present the detection module, which involves the above proposed trust model as its core. Specifically, we make two assumptions. First, the data messages transmitted between nodes are encrypted, meaning that malicious nodes on the routing path cannot change the content of the transmitted data information. Second, all data messages generated by non-root nodes are transmitted through the root node to the external network (e.g. the Internet). This is a reasonable assumption for most RPL networks [wallgren2013routing, shreenivas2017intrusion].

The current RPL protocol does not let the root node record and track the packets sent by non-root nodes. To address this challenge, we introduce a sequence number, which is stored in the first byte of the data payload sent by each node. In particular, the source node increases the sequence number by one each time it sends out a packet. The root node estimates the packet forward rate (PFR) for each source node based on the number of received data packets and the corresponding sequence numbers.
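The following Python sketch illustrates one way the root node could turn sequence numbers into a packet forward rate. It is an assumption-laden illustration, not the authors' implementation: the class and method names are hypothetical, the one-byte counter is assumed to wrap around at 256, and duplicate or late packets are simply ignored.

```python
class ForwardRateEstimator:
    """Estimates per-source packet forward rate from one-byte sequence numbers."""

    SEQ_MOD = 256  # the sequence number occupies a single payload byte

    def __init__(self):
        self.last_seq = {}   # source -> last sequence number seen
        self.received = {}   # source -> packets received in the current window
        self.expected = {}   # source -> packets the source sent in the current window

    def on_packet(self, source, payload):
        seq = payload[0]  # first byte of the data payload
        prev = self.last_seq.get(source)
        if prev is None:
            advance = 1  # first packet ever seen from this source
        else:
            advance = (seq - prev) % self.SEQ_MOD  # handles wrap-around
            if advance == 0 or advance > self.SEQ_MOD // 2:
                return  # duplicate or late packet: ignored in this sketch
        self.last_seq[source] = seq
        self.received[source] = self.received.get(source, 0) + 1
        self.expected[source] = self.expected.get(source, 0) + advance

    def close_window(self):
        """Returns {source: forward rate} and starts a new time window."""
        rates = {s: self.received[s] / self.expected[s] for s in self.expected}
        self.received.clear()
        self.expected.clear()
        return rates


estimator = ForwardRateEstimator()
for seq in [0, 1, 2, 5, 9]:           # packets 3, 4, 6, 7 and 8 were dropped en route
    estimator.on_packet(7, bytes([seq, 0xAA]))
print(estimator.close_window())        # 5 of 10 packets arrived -> {7: 0.5}
```

A source whose packets are all dropped within a window produces no entry here; a fuller version would need a timeout to cover that case.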

In addition, since frequently calculating the trust value for each node greatly increases the workload of the root node, we introduce a sliding time window. The root node only calculates the trust value of each node once in each sliding time window. The length of the window can be determined according to the network status, such as the battery capacity of the root node and the sensitivity of the trust value.

By calculating the trust value of each node, the root node can identify possible malicious nodes according to Algorithm 4. Confirmed malicious nodes are added to the “blacklist”. However, because nodes may suffer from bad-mouthing attacks launched by their parent nodes, directly adding all nodes with low trust values to the “blacklist” may lead to a high false alarm rate.

To reduce the false alarm rate caused by bad-mouthing attacks, we propose to add a “watchlist” and a trust recovery time period. When the trust value of a node falls below the threshold for the first time, the node is added to the “watchlist” as a suspicious node. If a suspicious node is required to change its parent for further investigation, the root node resets a recovery timer and tracks whether the suspicious node’s trust value recovers after the parent change. Please note that the length of the recovery timer can be determined according to the specific network status. A longer timer leads to a longer detection delay but a lower false alarm rate. If the trust value of the suspicious node recovers to the threshold within the recovery time period, indicating that changing its parent node stops the anomaly, the node is considered a normal node, and its previous parent is identified as the malicious node and added to the “blacklist”. Otherwise, the suspicious node is identified as a malicious node and moved to the “blacklist”.

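Input: trust threshold, ParentChangingList (the nodes in the list have changed parent)
Output: blacklist (containing malicious node IDs), watchlist (containing suspicious node IDs)
INITIAL: blacklist = empty, watchlist = empty, window timer = constant value, recovery timer = constant value;
while window timer expires do
      foreach node do
            if trust value of node < trust threshold then
                  if node not in watchlist then
                        watchlist.add(node);
                  else
                        if parent-changed flag of node = True and recovery timer expires then
                              watchlist.remove(node);
                              blacklist.add(node); parent-changed flag of node = False;
                        end if
                  end if
            else
                  if parent-changed flag of node = True and node in watchlist then
                        blacklist.add(node.old_parent);
                        watchlist.remove(node);
                        parent-changed flag of node = False;
                  end if
            end if
      end foreach
end while
Algorithm 4 Malicious nodes detection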
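The watchlist and blacklist logic described for the detection module can be sketched in Python as follows. This is a reading of the prose description rather than the authors' code: the function name, the dictionary and set arguments, and the way timer expiry is passed in as a set of node IDs are all assumptions made for illustration.

```python
def detection_pass(trust, threshold, watchlist, blacklist,
                   parent_changed, old_parent, recovery_expired):
    """One detection pass, run by the root once per sliding time window.

    trust:            {node_id: aggregated trust value}
    watchlist:        set of suspicious node IDs (updated in place)
    blacklist:        set of malicious node IDs (updated in place)
    parent_changed:   {node_id: True if the node was told to change parent}
    old_parent:       {node_id: parent ID before the change}
    recovery_expired: set of node IDs whose recovery timer has run out
    """
    for node, value in trust.items():
        if value < threshold:
            if node not in watchlist:
                watchlist.add(node)            # first offence: only suspicious
            elif parent_changed.get(node) and node in recovery_expired:
                watchlist.discard(node)        # no recovery after parent change
                blacklist.add(node)
                parent_changed[node] = False
        elif parent_changed.get(node) and node in watchlist:
            blacklist.add(old_parent[node])    # trust recovered: old parent framed it
            watchlist.discard(node)
            parent_changed[node] = False


watchlist, blacklist = set(), set()
changed, previous = {}, {}

# Window 1: node 5 drops below the threshold and becomes suspicious.
detection_pass({5: 0.3, 6: 0.9}, 0.5, watchlist, blacklist, changed, previous, set())

# The root asks node 5 to leave its parent (node 12); its trust then recovers.
changed[5], previous[5] = True, 12
detection_pass({5: 0.8, 6: 0.9}, 0.5, watchlist, blacklist, changed, previous, set())
print(watchlist, blacklist)  # node 5 is cleared and node 12 is blacklisted
```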

V-C Notification Module

After a malicious node is identified, the root node needs a reliable way to notify all the children nodes of the malicious node while avoiding information storms. This is challenging since the RPL network follows a strict tree-like topology for data packet forwarding, and a node can only receive data packets from its parent. It means that if data packets are used to disseminate the notifications, these packets will be simply dropped by malicious nodes and never reach their children nodes.

In this work, we propose to use control messages (i.e. ICMPv6 messages) to disseminate these notifications, since they can reach the children of a malicious node through other neighbor nodes. In particular, we recommend using the first byte in the payload of the ICMPv6 control message to store the node ID, as shown in Figure 3. The first bit distinguishes a suspicious node (bit set to 0) from an identified malicious node (bit set to 1).

Fig. 3: The proposed notification message
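As an illustration of this one-byte notification field, the Python sketch below packs and unpacks the flag and the node ID. It assumes that the "first bit" is the most significant bit of the byte and that the remaining seven bits carry the node ID (so IDs 0 to 127); both the layout and the function names are assumptions, since the exact format is given only in Figure 3.

```python
MALICIOUS_FLAG = 0x80   # assumed: most significant bit marks a confirmed malicious node
NODE_ID_MASK = 0x7F     # assumed: remaining seven bits carry the node ID


def encode_notification(node_id, malicious):
    """Builds the first payload byte of the notification message."""
    if not 0 <= node_id <= NODE_ID_MASK:
        raise ValueError("node ID does not fit in seven bits")
    flag = MALICIOUS_FLAG if malicious else 0
    return bytes([flag | node_id])


def decode_notification(payload):
    """Returns (node_id, is_malicious) from the first payload byte."""
    first = payload[0]
    return first & NODE_ID_MASK, bool(first & MALICIOUS_FLAG)


print(encode_notification(12, malicious=True))   # b'\x8c'
print(decode_notification(b"\x05"))              # (5, False)
```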

In order to reduce the energy consumption of the network, the root node executes the notification module right after the detection module. As shown in Algorithm 5, there are two lists: a “blacklist” and a “ParentChangingList”. The root node marks the node information in the two lists separately. The “blacklist” is used to notify all the children nodes, whose parents are identified malicious nodes in the “blacklist”, to re-select their parent node. The suspicious nodes, which are required to change parents, are added into the “ParentChangingList”. This distinction can avoid unnecessary parent selection process for non-relevant nodes in the network.

In addition, the root node finds sub-trees, which can cover minimum number of nodes in the “watchlist”, based on RPL tree-like topology. Then, the root node adds the nodes with the highest rank value in each sub-tree to “ParentChangingList”, which is the function “FindMaxRankNode” in Algorithm 5.

Input: ,
1 INITIAL: = empty, = False;
2 while ( not empty or not empty) and  do
3       if  not empty then
4             .add
5            (FindMaxRankNode());
6             broadcast();
7             foreach  do
8                   = True;
9             end foreach
11       end if
12      if  not empty then
13             broadcast();
14       end if
16 end while
Algorithm 5 Malicious nodes notification

V-D Isolation Module

As shown in Algorithm 6, when a non-root node in the network receives the notification message, it checks whether the node information in notification is its parent or itself. If any of two cases is true, this node removes the current parent from its parent list. Then, it re-selects its preferred parent and broadcasts this ICMPv6 control message to its neighbors. If not, the node broadcasts the ICMPv6 control message directly to all its neighbors.

Input: msg (received ICMPv6 control message, which contains a malicious node ID or a suspicious node ID), parent ID, node ID
if msg.nodeID = parent ID or msg.nodeID = node ID then
      RplRemoveParent();
      PreferredParent = RplSelectParent();
      broadcast(msg);
else
      broadcast(msg);
end if
Algorithm 6 Isolation of malicious nodes
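The isolation step can be sketched in Python as follows. The `Node` class, its ordered `parents` list, and the `sent` log that stands in for a radio broadcast are hypothetical constructs for illustration; in a Contiki deployment the parent removal and re-selection would be done by the RPL stack itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    parents: list                      # candidate parents, most preferred first (assumed)
    sent: list = field(default_factory=list)   # stands in for broadcast messages

    @property
    def preferred_parent(self):
        return self.parents[0] if self.parents else None

    def on_notification(self, notified_id):
        """Handles a notification that carries a malicious or suspicious node ID."""
        if notified_id in (self.preferred_parent, self.node_id) and self.parents:
            # Drop the current parent; the next candidate becomes preferred.
            self.parents.pop(0)
        # In both cases the message is passed on to the neighbors.
        self.sent.append(notified_id)


node = Node(node_id=5, parents=[12, 3])
node.on_notification(12)     # the current parent was blacklisted
print(node.preferred_parent, node.sent)
```

The sketch rebroadcasts every notification unconditionally, as the description does; suppressing duplicates would be a separate design decision.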

After these three modules are completed, the malicious nodes launching selective forwarding attacks will be abandoned and isolated from the network.

VI Experiment and Result Analysis

VI-A Experiment Set Up

This work adopts Cooja, the network simulator of Contiki OS [zikria2018survey], as the experimental platform. Specifically, fifteen nodes are randomly deployed in the experiments as shown in Figure 4, including a root node (i.e. node 1), eleven legitimate non-root nodes (i.e. node 2 to node 11) and three malicious nodes (i.e. node 12 to node 14). Minimum Rank with Hysteresis Objective Function (MRHOF) is selected as the objective function, where children nodes select their preferred parents according to ranks and Expected Transmission Count (ETX) values. The simulation parameters are summarized in TABLE I.

Fig. 4: Example of RPL Network Topology

Based on the experiments, the performance of the proposed attack and defense schemes is tested and then compared with other state-of-the-art works. Specifically, this work adopts the receiver operating characteristic (ROC) curve as the major performance metric because it effectively reflects the trade-off between the detection rate and the false alarm rate when different thresholds are adopted. In each ROC curve, the x-axis and y-axis represent the false alarm rate and the detection rate, respectively. The area under the ROC curve (AUC) represents the accuracy. A larger area under the curve (i.e. a higher AUC value) indicates better performance.
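To make the metric concrete, the Python sketch below builds ROC points by sweeping the trust threshold and computes the AUC with the trapezoidal rule. It assumes the conventional orientation, with the false alarm rate on the horizontal axis and the detection rate on the vertical axis; the function names and the toy trust values are illustrative, not data from the experiments.

```python
def roc_points(trust, malicious, thresholds):
    """Returns sorted (false_alarm_rate, detection_rate) points.

    A node is flagged when its trust value is below the threshold.
    """
    malicious = set(malicious)
    normal = set(trust) - malicious
    points = {(0.0, 0.0), (1.0, 1.0)}          # end points of every ROC curve
    for threshold in thresholds:
        flagged = {node for node, value in trust.items() if value < threshold}
        detection_rate = len(flagged & malicious) / len(malicious)
        false_alarm_rate = len(flagged & normal) / len(normal)
        points.add((false_alarm_rate, detection_rate))
    return sorted(points)


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


trust = {1: 0.9, 2: 0.8, 3: 0.2, 4: 0.1}       # nodes 3 and 4 are malicious
points = roc_points(trust, {3, 4}, [0.15, 0.5, 0.85])
print(auc(points))                              # perfectly separable -> 1.0
```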

VI-B Performance of Proposed Attacks

In this experiment, malicious nodes may either selectively drop the victim node’s data packets based on their perceived average network PFR, or launch bad-mouthing attacks against a specific child node. To illustrate the impact of different attacks, the ratio of bad-mouthing attacks to the total number of attacks in the network is divided into four cases, 25%, 50%, 75% and 100%.

The performance of the proposed attacks is evaluated against three defense schemes. In the first two schemes, the root node identifies the malicious nodes by comparing the PFR of each non-root node with a threshold value. The nodes with lower PFR are identified as malicious. The difference is that the first scheme (i.e. avg scheme) uses a node’s average PFR, while the second scheme (i.e. rec scheme) uses only a node’s most recent PFR. In the third scheme (i.e. def scheme), since the default RPL security scheme allows a node to re-select its parent node when network failures (e.g. link failure and node failure) are detected, the root node identifies malicious nodes by checking whether a node is discarded by its children.

| Parameter | Value |
| --- | --- |
| Simulation platform | Contiki/Cooja 3.0 |
| Transport | UDP/IPv6 |
| Emulated nodes | Z1 mote |
| Simulation coverage area | 130 m × 130 m |
| Total number of nodes | 15 |
| Malicious nodes | 3 |
| TX range | 50 m |
| Interference range | 100 m |
| Packet size | 46 bytes |
| Data packet period | 60 seconds |
| Routing protocol | RPL |
| Network protocol | IP based |
| Simulation time | 150 minutes |
| Link failure model | UDGM with distance |

TABLE I: Simulation Parameters

The effectiveness of the proposed attack model is illustrated in Figure 5. In all sub-figures of Figure 5, the def scheme shows the lowest performance, with an average AUC of 0.44. This is because the proposed attacks only drop data packets, which rarely causes failures in network routing. Although the avg scheme and the rec scheme show slightly better performance (with average AUCs of 0.70 and 0.65, respectively, in Figure 5 (a)-(b)), their performance drops significantly (to an average AUC of 0.36) when the proportion of bad-mouthing attacks increases to 75% or higher, as shown in Figure 5 (c)-(d). This is because these two schemes consider a node with low PFR (i.e. either the average PFR or the most recent PFR) to be malicious, which bad-mouthing attacks can exploit to frame the victim nodes.

(a) 25% bad mouthing attacks
(b) 50% bad mouthing attacks
(c) 75% bad mouthing attacks
(d) 100% bad mouthing attacks
Fig. 5: Performance of the proposed attack model against three defense schemes. and represents avg scheme, def scheme and rec scheme, respectively.

VI-C Performance of Proposed Defense Modules

In this sub-section, we evaluate the effectiveness of each critical strategy proposed for the defense scheme. In particular, these strategies include (1) discounting factor, (2) asymmetric forgetting speed, and (3) integration of self-trust and descendant trust. Furthermore, the proposed attack models are launched with 50% selective forwarding behaviors and 50% bad-mouthing behaviors.

VI-C1 Effectiveness of Discounting Factor

In this subsection, and are set as the same value . By changing the values of , Figure 6 illustrates its impact on the performance of the proposed defense scheme. Specifically, four different values (i.e. 0, 0.3, 0.8, 100) are applied so that the discounting factor ranges in the interval . From Figure 6, it can be observed that the defense scheme shows the best performance (e.g. AUC = 0.630) when .

Fig. 6: Effectiveness of various forgetting speeds

Specifically, when , , which is approximately 0 for any . It indicates that the defense scheme only remembers the most recent behavior for trust evaluation. In such settings, the defense scheme can mistakenly identify normal nodes with accidental packet losses as malicious nodes and therefore results in high false alarm rates, as shown by the green dotted curve in Figure 6. On the other hand, when the value of is 0, the defense scheme, which remembers all previous behaviors, also performs worse, as shown by the blue dash dot line in Figure 6. This is because malicious nodes can easily mislead the defense scheme by accumulating high trust values through good behaviors performed long time ago.

With an appropriate value, the proposed defense scheme can achieve high performance. Particularly, as shown in Figure 5 (b), when , the AUC of the scheme with is 4% and 11% higher than that of the avg and rec schemes respectively, validating the effectiveness of the proposed defense scheme.

VI-C2 Effectiveness of Asymmetric Forgetting Speeds

Next, we evaluate the effectiveness of asymmetric forgetting speeds in Figure 7. Observed from Figure 6, the defense scheme with shows the best performance. Therefore, in this subsection, the value range of is set from 0 to 0.4. As we propose to forget bad behavior slower, the values are set to be 0.1, 0.2, 0.3 and 0.4 higher than . The results are shown in Figure 7.

(a) offset = 0.1
(b) offset = 0.2
(c) offset = 0.3
(d) offset = 0.4
Fig. 7: Effectiveness of asymmetric forgetting speeds. Offset represents the difference between forgetting speeds and .

Comparing the schemes in Figure 6 and Figure 7, it can be observed that 79% of the schemes with asymmetric forgetting speeds in Figure 7 perform better than the best scheme (i.e. ) in Figure 6. In particular, the AUC of the best scheme (i.e. and ) in Figure 7 (b) is 20% higher than that of the best scheme in Figure 6. This observation validates the effectiveness of the asymmetric forgetting speed.

Furthermore, an appropriate offset (e.g. 0.2) between the two forgetting speeds makes the scheme perform better. Specifically, the scheme with and in Figure 7 (b) performs the best with the AUC as 0.758, which is 13%, 0.5%, and 3% higher than the highest AUCs in Figure 7 (a), (c), and (d). When the offset is too large, the scheme will be too sensitive to bad behaviors, which may increase the false alarm rate. When the offset is too small, the performance will be very similar to that of the scheme with identical and , which is less effective.

VI-C3 Effectiveness of Integrating Self-Trust and Descendant Trust

To detect bad-mouthing attacks, where the parent node frames its children nodes by discarding their data packets, we propose to integrate self-trust () and descendant trust (). This subsection aims to evaluate the trade-off by adjusting the weights of these two aspects. In particular, and represent the weights assigned to and , respectively, where the sum of and equals one. The values of and are in the range of with the step interval as 0.1. The values of and are fixed as 0 and 0.2 (i.e. the optimal values from Figure 7), respectively.

From Figure 8, the proposed defense scheme with and achieves the best performance. Furthermore, when and , the AUC of the schemes are higher than 0.746, which validates the robustness of the proposed defense scheme.

Fig. 8: Comparison of effectiveness of combination of two trust factors

More importantly, the AUC of the best scheme in Figure 8 is 26% and 34% higher than the avg scheme and rec scheme in Figure 5 respectively. This observation validates that the introduction of descendant trust value () enables the proposed defense scheme to effectively detect bad-mouthing attacks in the network.

Some of the schemes in Figure 8 have an inflection point in the detection rate range of , indicating that the increment of detection rate slows down. This is because this experiment places a limited number of nodes, including three malicious nodes, to prevent exceeding the capacity of the simulation platform (e.g. memory overflow).

VI-D Overall Performance Comparison

This section compares the proposed defense scheme with the state-of-the-art defense schemes, including: (1) the RPL default recovering scheme: MRHOF [winter2012rpl], (2) a centralized scheme: Heartbeat protocol (HP) [wallgren2013routing], and (3) a distributed scheme: Trust-Aware RPL Routing Protocol (TPRP) [airehrour2017trust]. All the schemes are applied on the same network topology and settings. In addition, the proposed defense scheme with , , and is launched.

VI-D1 Detection Accuracy

As shown in Figure 9, the proposed scheme yields the best overall performance among the four defense schemes. Its AUC is 76%, 48% and 10% higher than that of MRHOF, HP and TPRP, respectively.

In particular, the HP and MRHOF schemes fail to differentiate malicious nodes from normal ones, because these two schemes detect malicious nodes based on replies to control messages and thus cannot effectively capture the loss of data packets. Furthermore, when the false alarm rate is lower than 0.4, the detection rate of the TPRP scheme stays high. This is because the TPRP scheme can effectively defend against bad-mouthing attacks by requiring each node in the network to monitor the packets sent and received by its neighbor nodes. However, the detection rate of the TPRP scheme cannot be significantly improved, because it adopts the average PFR as the trust value, which cannot effectively capture attacks that alternate between good and bad behaviors.

Fig. 9: Comparison of performance of different schemes based on proposed attack model

VI-D2 Energy Consumption and Detection Delay

Figure 10 compares the detection delay and energy consumption of legitimate non-root nodes for different defense schemes. Specifically, the proposed attacks are launched against these defense schemes. Each defense scheme has two bars, representing its energy consumption and detection delay respectively.

Fig. 10: Power consumption and detection delay comparison based on proposed attack model

From Figure 10, we can observe that the proposed defense scheme consumes very limited power, only 3.4% more than the default RPL recovery scheme (i.e. MRHOF). This is because the proposed scheme adopts a centralized design, where non-root nodes do not need to monitor neighbor nodes’ activities, but only forward notification packets about detected malicious nodes.

On the other hand, the HP and TPRP schemes consume 50% and 957% more power, respectively, than the default MRHOF scheme. Although the HP scheme also adopts a centralized defense mechanism, it frequently launches a “request and reply” process between the root node and non-root nodes, which increases power consumption. Furthermore, the power consumption of the TPRP scheme is the highest due to its distributed design, which requires each non-root node in the network to monitor, analyze, and share the activities of its neighbors. These requirements significantly increase the work time and computational costs of each node, leading to much higher power consumption.

The detection delay is calculated based only on the successful detections of each scheme. In other words, if a detection scheme can only detect two malicious nodes out of three, the detection delay is the average delay of the two successful detections. As shown in Figure 10, the MRHOF and HP schemes show small detection delays, which, however, are calculated based on the very few malicious nodes that they can detect. Furthermore, there is a relatively large variation in the detection delay of the HP scheme. This is because the HP scheme relies on the exchange of “request-reply” messages to detect anomalies. The detection delay may vary based on the frequency of the request messages. A higher frequency may lead to a smaller detection delay, but higher power consumption.

In addition, the proposed scheme yields a detection delay similar to that of the TPRP scheme, but with a larger variation. This is because in the TPRP scheme, rather than relying on notifications from the root node, each node directly monitors its neighbors’ behaviors, resulting in a relatively stable delay. The proposed scheme, however, can only detect anomalies when the data packets arriving at the root node show abnormal patterns, which may vary according to the network’s data rate and the source node’s network location.

In summary, compared to the state-of-the-art defense schemes, the proposed scheme yields much higher detection accuracy (i.e. 10% higher than the second highest one) and lower energy consumption (i.e. 31% lower than that of the second lowest one). Although its detection delay is higher than that of some other schemes (i.e. the MRHOF and HP schemes), it remains practical for low-power, low-data-rate RPL networks.
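The 31% figure can be checked against the relative power numbers reported in this section. The short Python calculation below normalizes the MRHOF baseline to 1.0 and assumes that the "second lowest" scheme refers to HP, i.e. the lowest-power scheme among the compared defenses other than the proposed one and the default baseline.

```python
mrhof = 1.0                       # default RPL recovery scheme, used as the baseline
proposed = mrhof * (1 + 0.034)    # 3.4% more than MRHOF
hp = mrhof * (1 + 0.50)           # 50% more than MRHOF
tprp = mrhof * (1 + 9.57)         # 957% more than MRHOF

saving_vs_hp = 1 - proposed / hp
print(round(saving_vs_hp * 100, 1))   # 31.1
```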

VII Conclusion

In RPL networks, malicious nodes can damage routing paths by selectively dropping packets. In this paper, we propose an advanced selective forwarding attack with three flexible attack behaviors: protocol selection, packet forward rate selection, and bad-mouthing with children nodes selection. This flexibility enables the malicious nodes to hide their attack behaviors and maximize the long-term attack impact. Furthermore, we propose a new centralized trust-based defense scheme, which consists of a self-trust value based on the beta trust model with asymmetric forgetting speeds, and a descendant trust value based on the RPL tree-like topology. Experimental results show that, compared to the state-of-the-art defense solutions, the proposed defense scheme can effectively detect the proposed advanced attacks with very limited energy consumption.