SADAN: Scalable Adversary Detection in Autonomous Networks

10/11/2019 ∙ by Tigist Abera, et al.

Autonomous collaborative networks of devices are emerging in numerous domains, such as self-driving cars, smart factories and critical infrastructure, generally referred to as IoT. Their autonomy and self-organization make them especially vulnerable to attacks. Thus, such networks need a dependable mechanism to detect and identify attackers and enable appropriate reactions. However, current mechanisms to identify adversaries either require a trusted central entity or scale poorly. In this paper, we present SADAN, the first scheme to efficiently identify malicious devices within large networks of collaborating entities. SADAN is designed to function in truly autonomous environments, i.e., without a central trusted entity. Our scheme combines random elections with strong but potentially expensive integrity validation schemes, providing a highly scalable solution that supports very large networks with tens of thousands of devices. SADAN is designed as a flexible scheme with interchangeable components, making it adaptable to a wide range of scenarios and use cases. We implemented an instance of SADAN for an automotive use case and simulated it on large-scale networks. Our results show that SADAN scales very efficiently for large networks, and thus enables novel use cases in such environments. Further, we provide an extensive evaluation of key parameters, which allows SADAN to be adapted to many scenarios.






1 Introduction

The relentless trend towards network-connection of appliances and devices in all spheres of our lives allows systems to collaborate, becoming more efficient and enabling completely new use cases; this is often referred to as the Internet of Things (IoT). In particular, the increase in connectivity of devices that can affect the physical world brings new opportunities and, at the same time, many new risks. Some domains are seemingly uncritical, like smart homes, where many household devices, e.g., light bulbs [83] and refrigerators [76], become network-connected for the sake of convenience. Other domains are obviously highly critical. Various industries are motivated by higher efficiency and increased flexibility, which can be achieved by connecting devices within individual factories as well as by interconnecting facilities collaborating within a supply-chain [86]. Other industry branches, like the automotive industry and associated industries, strive to increase safety through connection and collaboration, e.g., cars sharing information about potential hazards [51]. In these high-stakes scenarios, malicious devices can cause tremendous damage and even jeopardize human life.

Traditional security solutions that rely on a central authority are not applicable to these new systems, as they require the entire system to have continuous and reliable connectivity to the central authority. This is hard to guarantee in many practical systems, e.g., with freely moving nodes. Moreover, when multiple (mutually distrusting) stakeholders are involved, it is difficult to jointly agree on a party that acts as the trusted authority. For instance, different car manufacturers or cellular network equipment providers, which in many cases do not inherently trust each other, will not easily agree on an overarching authority with the power to control all devices.


In order to achieve the desired improvements that the increased connectivity promises, the connected devices need to collaborate and share information in a broadly autonomous fashion. However, interdependencies within the network increase the threat malicious devices pose to the entire system. In particular, a single malicious device could cause other devices to deviate from the correct behavior; e.g., influencing the routing of other cars by transmitting false traffic information [87]. Hence, such large networks must identify faulty or malicious entities in order to react to attacks and prevent a partial compromise of the network from impairing the correct function of the overall system.

Increasingly, network-connected devices, including modern vehicles [59, 23, 70], industrial facilities [35, 25, 19], critical infrastructure [34, 52, 79], and even medical devices [80] are targeted by (remote) software attacks [81, 24, 43, 22, 36, 28, 85].

Existing defense strategies.

Traditional security solutions [3, 62, 64, 26, 78, 53, 96, 75, 42, 27, 14] typically cannot recover an attacked device, i.e., it usually crashes with potentially catastrophic consequences.

Attack detection methods can uncover ongoing attacks, enabling more sophisticated reaction policies, like recovery of a compromised device [67]. Approaches like Intrusion Detection Systems (IDS) [65, 44, 88, 82] or outlier detection as used in Wireless Sensor Networks (WSN) [98, 15] suffer from inaccuracies and rely on assumptions about the attack. However, to be able to detect sophisticated attacks like code-reuse attacks [81, 24, 43, 22, 36, 28, 85], powerful security services, like remote control-flow attestation, are required [4, 5].

To be able to leverage powerful integrity validation schemes, like remote attestation, in large networks of collaborating devices, a number of challenges have to be tackled. To enable the overall system to tolerate and handle compromised devices, a dependable global view or consensus among all devices in the system must be found. This means all devices in the network must be informed about the state of the other devices. The simplest approach to achieve this is all-to-all attestation; however, this approach does not scale for large systems. Approaches to attest multiple devices collectively, also known as swarm attestation, provide an integrity proof of the network to a single verifier [13, 6, 45, 20], rendering them inapplicable in autonomous systems without central authority.

For an autonomous decentralized system to detect, identify and react to attacks, all attestations would need to be distributed and verified across the entire network. While consensus protocols, e.g., Byzantine fault tolerance [21], in general enable such agreement, they are only applicable to (very) small systems. Even the most efficient consensus protocols do not scale to systems with thousands or millions of devices.

Goals and Contributions.

To overcome the limitations of existing defense approaches we developed a novel scheme to identify compromised members within a network, called Scalable Adversary Detection in Autonomous Networks (SADAN). To achieve this, the devices in the network monitor one another; however, no single device in the network is trusted to decide whether or not another device is compromised. In fact, if a single device could denounce another device as compromised, an attacker-controlled device could easily use this to destabilize the entire network by falsely accusing benign devices. We are the first to combine random elections with Byzantine fault tolerant consensus to efficiently and conclusively determine, in a highly scalable manner, whether an accused device is compromised or whether a device is making false accusations. This allows the system to quickly identify compromised devices and react, e.g., by excluding the compromised device from the network.


Our main contributions include:

  • We present SADAN, the first efficient and dependable scheme to identify malicious devices within very large networks of collaborating entities. SADAN is:

    • The first to combine random election, Byzantine fault tolerant consensus and attestation to allow a partially compromised system to detect and identify its compromised parts (Section 4).

    • A flexible scheme with pluggable components that supports various integrity validation schemes (e.g., different attestation schemes), random election schemes as well as different consensus schemes (Section 5).

  • We implemented a highly efficient instantiation of SADAN based on Proof-of-Elapsed-Time [1], Practical Byzantine Fault Tolerance (PBFT) [21] and run-time attestation [5] (Section 6).

  • We evaluate SADAN’s security and we determine the best choice for key parameters (Section 7). Further, we developed a large-scale network simulation for SADAN with tens of thousands of devices and demonstrate its scalability through extensive evaluation (Section 8).

2 Background

In this section we provide background on the core mechanisms used in our solution.

2.1 Remote Attestation

Remote attestation is a security primitive that enables a remote party, called the verifier, to validate the state of a device, called the prover. Remote attestation is typically realized as a challenge-response protocol allowing a verifier to obtain a fresh and authentic report about the prover’s software or hardware state. Early remote attestation approaches were limited to the static software state of a device, i.e., a cryptographic hash calculated over the binary code loaded into memory [33, 38, 84]. This allows the verifier to detect modifications of a device’s software binaries, e.g., due to malware infection. More sophisticated approaches capture the run-time behavior of the prover’s device [4, 30, 29, 9, 97, 5], which allows the verifier to detect code-reuse attacks like Return-Oriented Programming (ROP) [81, 22, 36, 17, 28] and even non-control data attacks [24, 43].

In order for an attestation to be unforgeable by the adversary, the prover is typically assumed to have a trust anchor that is trusted by the verifier. A prominent instantiation of a trust anchor is the Trusted Platform Module (TPM) [94], which is a dedicated hardware-secured microprocessor designed for remote attestation [93]. The TPM can securely store the computer’s state to be reported as well as the cryptographic keys necessary to authenticate the remote attestation report. Remote attestation functionality can also be provided by a Trusted Execution Environment (TEE) [91, 55, 16]. A TEE is an isolated execution environment that provides security features such as isolated execution, integrity of applications running in the TEE and confidentiality. Examples of commonly available TEE implementations are ARM TrustZone [10] for ARM-based platforms and SGX [68, 8, 47] from Intel.
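The challenge-response pattern behind static attestation can be sketched in a few lines. This is a minimal illustration, not one of the cited schemes: it assumes a symmetric key shared between the verifier and the prover's trust anchor (real deployments typically use asymmetric keys held in a TPM or TEE), and the names `prover_attest`, `verifier_check` and `ATTESTATION_KEY` are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumption: a symmetric key shared by verifier and prover. On the prover it
# would live inside the trust anchor and never be exposed to normal software.
ATTESTATION_KEY = secrets.token_bytes(32)


def prover_attest(binary: bytes, nonce: bytes, key: bytes) -> tuple[bytes, bytes]:
    """Measure the loaded code and authenticate the measurement for this nonce."""
    measurement = hashlib.sha256(binary).digest()
    tag = hmac.new(key, nonce + measurement, hashlib.sha256).digest()
    return measurement, tag


def verifier_check(
    report: tuple[bytes, bytes], nonce: bytes, key: bytes, expected: bytes
) -> bool:
    """Accept only an authentic report, bound to our nonce, with the known-good hash."""
    measurement, tag = report
    expected_tag = hmac.new(key, nonce + measurement, hashlib.sha256).digest()
    authentic = hmac.compare_digest(tag, expected_tag)
    return authentic and hmac.compare_digest(measurement, expected)
```

The fresh nonce provides freshness: a report recorded for an earlier challenge does not verify under a new one. The hash comparison detects modified binaries, but says nothing about run-time attacks.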

2.2 Byzantine Fault Tolerance

Byzantine Fault Tolerance (BFT) algorithms solve the Byzantine Generals Problem [63] of finding consensus among parties where some might be faulty or adversarial, i.e., act Byzantine. Many BFT works consider the use case of a distributed system where nodes are connected via a network. Nodes collectively perform operations in response to requests sent by external clients. There are two important properties for every BFT algorithm. (1) Safety requires that all operations are executed identically (i.e., identical requests in an identical order) from the perspective of all nodes [60]. (2) Liveness requires that every request will eventually complete [21]. Solutions fulfilling only one of these two properties are trivial; however, to achieve BFT both need to be considered, requiring at least $3f + 1$ total nodes to endure $f$ Byzantine nodes [63].

PBFT. The first practical solution, guaranteeing both safety and liveness, was introduced with the PBFT algorithm [21]. PBFT is the de-facto baseline in the BFT literature, and has been modified and extended in various ways [60, 31, 66].

PBFT handles the ordering, which is important for BFT, by selecting a so-called primary among all nodes, chosen in round-robin fashion, which decides on the order of all requests. If the other nodes notice that the primary acts slowly or inconsistently, a “view-change” is executed, replacing the Byzantine primary with another node. In normal operation PBFT requires two rounds of all-to-all broadcasts, implying a message complexity of $O(n^2)$ for $n$ nodes, which limits scalability.
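To make these bounds concrete, the following sketch computes how many Byzantine nodes a group of a given size can endure and roughly how many messages two all-to-all rounds cost. The function names are illustrative, and the message count is a simplification that ignores the primary's initial broadcast and any view-changes.

```python
def max_byzantine(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1."""
    return (n - 1) // 3


def all_to_all_messages(n: int, rounds: int = 2) -> int:
    """Messages sent when every node messages every other node in each round."""
    return rounds * n * (n - 1)


for size in (4, 10, 100, 1000):
    print(size, max_byzantine(size), all_to_all_messages(size))
```

The quadratic growth is the reason a small elected jury, rather than the whole network, runs the agreement in SADAN.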

Figure 1: An exemplary round of SADAN with one adversary (red) and three jurors (green).

3 System Model

We consider large distributed autonomous systems formed of networks of connected devices. These devices collaborate with one another to perform complex tasks. For instance, autonomous cars that exchange information with other cars and traffic facilities can achieve safer and more efficient mobility. In order to collaborate by coordinating their actions, the individual entities of the overall system need to exchange information, such as status updates and sensor readings. However, such exchanged information is often critical for the correct behavior of the overall system. In autonomous traffic, for example, cars report their position and trajectories to each other to predict (and avoid) collisions, and false information can have catastrophic consequences.

We consider systems where all devices can, directly or indirectly, communicate with each other, even in the presence of malicious nodes within the network. This can be realized through various network technologies, e.g., meshed networks with robust routing [41, 69], or upcoming technologies like 5G [2] and satellite-based networks [90, 73], where malicious network clients have very limited means to disturb the network communication of other nodes. (Footnote 1: The adversary is limited to disturbing the communication of nodes, with other nodes, base-stations or satellites, within physical proximity via jamming.)

All devices are mutually distrusting and there is no trusted central verifier or external coordinating operator on which the network can rely. While SADAN generally does not require any security framework or security hardware, we present an instance that utilizes TEEs for random election and sophisticated integrity validation in Section 6.

3.1 Adversary Model

We assume an adversary who has compromised a subset of devices in the system and is able to coordinate them. The size of the adversarial subset our scheme can endure is adjustable based on different parameters, which can be chosen accordingly for the corresponding scenario. We describe and extensively evaluate these parameters in Section 7.2. Compromising new devices takes non-negligible time for the adversary. (Footnote 2: Assuming basic security like memory layout randomization, exploiting devices requires many attempts [78, 53, 96, 75, 42, 27, 14]. In heterogeneous networks the adversary needs to develop new exploits to compromise devices.) We assume that the adversary cannot compromise additional devices while the network performs a round of SADAN. The adversary’s goal is to influence the collaboration between honest nodes by manipulating the data sent to other devices. We further assume that the adversary can eavesdrop and manipulate messages between devices. However, the adversary can control only a subset of all network links. We consider full adversarial control over the network out of scope, as this would allow the adversary to bring down the entire system independent of SADAN. Furthermore, devices that participate in denial-of-service (DoS) attacks are considered malicious in our system. (Footnote 3: As a result those devices will be handled by the recovery mechanism, e.g., by expelling them.)

We inherit the security guarantees and assumptions of the components used by SADAN. For instance, when using remote attestation as the integrity validation scheme, its assumptions also apply to SADAN, i.e., the trust anchor is secure and physical attacks are out of scope. Similarly, SADAN inherits the detection capabilities of the used components. For instance, depending on the integrity validation scheme, different types of software attacks can be detected, including code injection [37], code-reuse attacks [81, 22, 17, 28] or non-control data attacks [24, 43]. We discuss several variants in Section 5.1.

3.2 Requirements

A scalable and flexible adversary detection scheme for collaborative autonomous networks shall fulfill the following properties:

  • Adversary Detection and Identification: An adversary actively trying to manipulate the scheme shall not go unnoticed and shall be identified.

  • Efficiency: The scheme is significantly more efficient than verifying each device-to-device pair individually.

  • Scalability: The scheme scales well to a large number of devices. The computational effort and communication complexity grow sublinearly with respect to the number of devices.

  • Interchangeable Components: The scheme’s individual components have clearly separated roles and objectives, making them easily replaceable, i.e., pluggable.

4 SADAN Design

SADAN provides a scalable solution to deal with adversaries in truly autonomous networks, i.e., without external supervision from a central entity. It works in three steps:

  • A node announces a potentially adversarial node,

  • the potentially malicious node is then verified by a randomly elected jury acting on behalf of the whole network,

  • and the jury agrees, representatively for the whole network, on whether or not the suspected node is malicious. (Footnote 4: The jury can also decide the action to take on the malicious node.)

Figure 1 shows an example and illustrates the high-level working of SADAN. The network consists of many devices; the first five are shown individually, and the rest of the network is condensed for brevity. First, one device notices some suspicious behavior of the adversary. This device then announces the adversary as suspicious to the whole network. Next, the network randomly elects jury members, three in this case. The elected jurors then individually validate the adversary to verify the claim of the announcing device, and find a consensus about the integrity of the adversary as well as the decision on how the network shall react. Finally, the jury decision is announced to the whole network. This example solely illustrates one round of SADAN, i.e., one processed suspicion.
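The round described above can be sketched as a toy simulation. This is an illustration under simplifying assumptions, not the protocol itself: a seeded `random.Random` stands in for the verifiable election, a plain majority vote stands in for the consensus scheme, and the blamer and the suspect are excluded from the jury, which the text does not prescribe. The function `sadan_round` and the `validate` callback are hypothetical names.

```python
import random
from typing import Callable, Sequence


def sadan_round(
    nodes: Sequence[str],
    blamer: str,
    suspect: str,
    validate: Callable[[str, str], bool],
    jury_size: int,
    rng: random.Random,
) -> tuple[list[str], bool]:
    """Process one blame; return (jury, verdict), where True means 'malicious'."""
    # Step 1: the blamer announces the suspect (modelled by the arguments).
    # Step 2: elect a random jury. Excluding blamer and suspect is our assumption.
    candidates = [n for n in nodes if n not in (blamer, suspect)]
    jury = rng.sample(candidates, jury_size)
    # Step 3: every juror validates the suspect on its own; a simple majority
    # vote stands in for the consensus scheme here.
    findings = [validate(juror, suspect) for juror in jury]
    verdict = sum(findings) > jury_size // 2
    return jury, verdict


devices = [f"dev{i}" for i in range(1, 11)]
compromised = {"dev5"}
jury, verdict = sadan_round(
    devices, "dev1", "dev5", lambda juror, s: s in compromised, 3, random.Random(0)
)
print(jury, verdict)
```

Only the three jurors validate the suspect, so the validation cost per blame depends on the jury size and not on the size of the network.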

We introduce the concept of blame. In SADAN all nodes individually look for suspicious behavior of potential adversaries; nodes that act suspiciously are then examined thoroughly. We achieve this by giving each node the ability to blame another node, i.e., to announce to the whole network that the other node acts suspiciously and may be adversarial. This way, we do not need to verify the whole network.

Once a node is blamed, the network has to reach a decision about it. In a naive solution, each node in the network would need to individually verify the blamed node, potentially inducing a significant overhead on the whole network. Further, for a sustainable autonomous network it is important to have a consistent view across the network. Thus, a form of consensus is needed. However, consensus protocols do not scale, as explained in Section 2.2.

To avoid these scalability issues, SADAN randomly elects a jury that representatively makes a decision for the whole network. While this improves scalability, it comes at the expense of the safety property. As the election of the jury is random, there is a chance that enough adversaries are elected that they can enforce an adversarial decision within the jury. However, we can adjust the consensus so that it stalls rather than fails, as stalling can be rectified by a re-election. In Section 7.2 we analyze how these probabilities behave with regard to SADAN’s configurable parameters. We show that these parameters can be chosen so that the probability of electing an adversarial jury is negligible.

SADAN is designed to be modular; hence, individual components have to be selected. Selecting these interchangeable components depends on the targeted use case and its requirements. However, each component has requirements independent of use cases, which we discuss in the following. In Section 5 we thoroughly discuss possible instances for each component, and we describe a concrete instantiation in Section 6.

Integrity Validation Scheme.

Both for noticing the initial suspicious behavior of a potential adversary and the subsequent validation by the jurors, an integrity validation scheme is required. Such a scheme shall fulfill the following requirements:

  • Correctness: The scheme correctly conveys evidence representing the state of the prover to the verifier, even if the prover device is under the control of the adversary.

  • Freshness: The result of an individual validation shall only be legitimate for the respective request, e.g., past results shall not allow an adversary to illegitimately pass subsequent validations.

  • Immediate: The scheme shall work directly between two devices without the need for any third party.

  • Lightweight: The scheme shall be executable between devices with limited computational resources.

Furthermore, it is also possible to use two distinct validation schemes for different phases of SADAN. As shown in Figure 1, there is the initial validation raising the suspicion, which can be done with one scheme (e.g., a lightweight and superficial scheme). Then the validation used by the jury can be a different scheme (e.g., a thorough and complex scheme).

Random Jury Election.

After a node has been blamed, the network has to randomly elect a jury. A scheme accomplishing this has the following requirements:

  • Verifiable: The randomness used in the scheme is verifiable by all nodes.

  • Fairness: Every node has the same chance of winning the election.


Consensus.

After all jurors have performed their individual integrity validation of the blamed node, they need a consensus scheme to agree on the result and the reaction to it. The following requirements shall be met by such a scheme:

  • Safety: Assuming an honest quorum, the consensus itself ensures consistency, including the order of processed blames across all honest nodes.

  • Liveness: The scheme shall eventually make progress on all blames.

Multi-Round SADAN.

Multiple rounds are necessary to identify nodes that try to exploit the SADAN protocol. SADAN automatically triggers blames of nodes that abused the protocol, e.g., by falsely blaming a benign node or dissenting with the jury. Hence, misbehaving nodes that interfere with SADAN are uncovered. More specifically, an adversarial node may blame an honest node multiple times to increase the chances of electing enough accomplices to successfully seize the jury, or simply try to use the blaming mechanism to overload the system with requests. Therefore, an unsuccessful blame will lead to an automatic blame of the blamer by the jury of the current round. Further, clear violations in the underlying components may trigger automatic blames as well. For example, when the validation process is deterministic, correct jurors can safely blame a juror that reaches a different conclusion from the same data. These automatic blaming approaches prevent adversarial nodes from turning the chances in their favor over time, as attempts to manipulate the protocol put them at risk of being blamed themselves.
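The two automatic-blame rules just described can be written down as a small function. This is a sketch of the rules as stated in the text; the function name `automatic_blames`, its parameters and the representation of juror findings as booleans are assumptions made for illustration.

```python
def automatic_blames(
    blamer: str,
    verdict_malicious: bool,
    juror_findings: dict[str, bool],
    deterministic_validation: bool = True,
) -> list[str]:
    """Return the nodes that are blamed automatically after a round."""
    blames = []
    # Rule 1: an unsuccessful blame leads to a blame of the blamer.
    if not verdict_malicious:
        blames.append(blamer)
    # Rule 2: with deterministic validation, a juror whose finding differs
    # from the agreed verdict reached a different result from the same data.
    if deterministic_validation:
        blames.extend(
            juror
            for juror, finding in sorted(juror_findings.items())
            if finding != verdict_malicious
        )
    return blames


print(automatic_blames("dev7", False, {"j1": False, "j2": False, "j3": True}))
# ['dev7', 'j3']
```

Each automatic blame is then processed as its own round, so repeated false accusations expose the accuser to examination by a fresh jury.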

Additionally, efficiency of SADAN can be improved by streamlining the election process itself over multiple rounds. It may be beneficial to skip re-electing the jury in every round. For example, it might suffice to do a re-election every ten rounds and keep the same jury in between. However, the consensus may fail, either due to too many conflicting nodes in the jury or non-security issues, like connection problems. In such a case, a re-election is triggered. This strategy reduces the overall overhead for the random jury election over multiple rounds, as we will show in Section 8.4.

5 SADAN Design Space

In Section 4 we introduced the individual components of SADAN and discussed their requirements. This section provides an overview of possible options for each of the components (cf. requirement Interchangeable Components in Section 3.2), including those we use for our instantiation (see Section 6). We consider these options in the context of our targeted automotive use case and system model (see Section 3); thus, we do not claim that the following list is complete.

5.1 Integrity Validation Scheme

Integrity validation of a device can be done in very different ways, for instance, using dedicated validation functionality like remote attestation, or inferring device integrity by indirectly observing the device’s behavior.

5.1.1 Attestation

To validate the integrity of a device, remotely attesting its software can provide strong security guarantees, as it enables nodes to directly prove that their software is not altered. Depending on the used attestation schemes, different classes of software attacks can be detected. In this respect, there are two possible attestation approaches, static attestation and dynamic/run-time attestation, as described in Section 2.1. Static attestation is efficient and ensures the integrity of the binary program. (Footnote 5: Static attestation provides similar guarantees as secure boot [11].) However, it does not ensure integrity of program execution, which can be compromised by run-time attacks. Dynamic (run-time) attestation schemes, such as control-flow attestation, ensure run-time integrity by attesting a device’s code execution paths [4, 30, 29, 97, 5]. Depending on the required security guarantees, SADAN can employ static as well as dynamic attestation.

5.1.2 Sensor Data Outlier Detection

In Wireless Sensor Networks the prevalent method of validation is unsupervised outlier detection on sensor data [98, 15]. Outliers are measurements that significantly deviate from the normal pattern of sensed data. These outliers are caused by noise, unusual events or malicious attacks [98]. Regardless of their source, certain approaches can be used to reliably detect them, such as statistical techniques, classification algorithms or techniques designed for specific types of sensor data [98].
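As one example of a statistical technique, the sketch below flags outliers with the modified z-score based on the median absolute deviation (MAD). The choice of this particular statistic, the threshold of 3.5 and the sample readings are ours for illustration; the surveyed works cover many other techniques.

```python
import statistics


def mad_outliers(readings: list[float], threshold: float = 3.5) -> list[int]:
    """Return the indices of readings whose modified z-score exceeds the threshold."""
    median = statistics.median(readings)
    deviations = [abs(x - median) for x in readings]
    mad = statistics.median(deviations)
    if mad == 0:
        # More than half of the readings are identical: flag everything else.
        return [i for i, d in enumerate(deviations) if d > 0]
    # 0.6745 scales the MAD so that it is comparable to a standard deviation.
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]


temperatures = [20.1, 20.3, 19.9, 20.0, 35.0, 20.2]
print(mad_outliers(temperatures))  # [4]
```

Such a lightweight check only raises a suspicion. It cannot tell noise or an unusual event from an attack, which is why it fits the initial blame phase better than the jury's validation.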

5.1.3 Anomaly Detection

One way to detect adversarial behavior is anomaly detection based on machine learning. With this approach a model is trained reflecting the nominal operation in the network. If an adversary misbehaves it will produce outliers in this model, which will be detected. This has been successfully applied to TCP/IP traffic [89]. Here, the model is trained on the metadata found in network frame headers, such as IP addresses, packet sizes and session data. Another work combines this technique with federated learning for IoT devices [72]. This shift from a central entity handling the training to a distributed approach makes these systems applicable to non-centralized systems.

5.2 Random Jury Election

Schemes for secure random elections can be found in the blockchain space. Many cryptocurrencies employ a scheme based on Proof-of-Work to randomly elect the proposer for the next block [71, 56, 18]. Instead of determining a new block proposer, SADAN can use these schemes to elect its jury. In these schemes the voting power is directly tied to computing power to prevent Sybil attacks [32], i.e., a node assuming multiple identities to unfairly increase its influence. The security of these schemes is based on game-theoretical arguments tied to incentives, i.e., an adversary will only attack if the monetary investment is worth it. Requiring inherent value in this way limits the applicable use cases; the following two schemes from the blockchain space, however, are not directly tied to incentives.

5.2.1 Algorand

In Algorand [40] a delegation group is randomly elected to propose new blocks. Here, each node draws a number based on a Verifiable Random Function (VRF). The lowest numbers win the election, and thus the delegation group is elected. The VRF works as a deterministic source of randomness and as such is publicly verifiable. Put simply, each node’s individual random number is the hash of the concatenation of its identity, i.e., its public key, and the last block’s hash. This results in a random number that is verifiable by all participants, as only public information is necessary to calculate it. However, Algorand also needs to protect against Sybil attacks, as the membership is open. Each node has stake, e.g., the amount of money it owns in the system, which is used to assign weight to its random number. Thus, the more stake a node has, the higher its chances of being elected, and vice versa. However, if all participants are known, the election itself can be executed without Sybil attack resistance.
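The simplified description above translates directly into code. Note that this sketch follows the "put simply" version in the text, i.e., a plain hash over public information. It is not Algorand's actual VRF construction, and it ignores stake weighting because a known membership is assumed. The names `draw` and `elect` are illustrative.

```python
import hashlib


def draw(public_key: bytes, last_block_hash: bytes) -> int:
    """A node's election number: hash of its identity and the last block's hash."""
    digest = hashlib.sha256(public_key + last_block_hash).digest()
    return int.from_bytes(digest, "big")


def elect(public_keys: list[bytes], last_block_hash: bytes, group_size: int) -> list[bytes]:
    """The nodes with the lowest numbers win; anyone can recompute the result."""
    ranked = sorted(public_keys, key=lambda pk: draw(pk, last_block_hash))
    return ranked[:group_size]


members = [f"node-{i}".encode() for i in range(20)]
print(elect(members, b"hash-of-last-block", 3))
```

Because every input is public, each node can verify the election result locally without exchanging any additional messages.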

5.2.2 Proof-of-Elapsed-Time

An alternative approach is used by Intel’s Proof-of-Elapsed-Time (PoET) [1] to elect block proposers. It leverages TEEs and a registration process based on linkable attestation, i.e., attestation directly tied to a specific processor, to prevent Sybil attacks. PoET enforces a waiting time based on the random number, effectively simulating a Proof-of-Work without the corresponding power consumption. For this feature the TEE is used as well, which attests that the respective node has indeed waited for its assigned amount of time, i.e., generating a waiting certificate. The waiting approach reduces message complexity. Instead of having all nodes announce their respective number virtually simultaneously, each node has to first wait for that amount of time. This way, honest nodes with a comparatively high number will also wait longer and may observe lower-valued waiting certificates meanwhile. In this case, the node can decide not to announce its own wait time, saving overhead as only a minor part of the network needs to announce their respective wait certificates.
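The message saving from waiting can be illustrated with a small simulation. It only models the timing argument, under assumptions of ours: wait times are given as plain numbers, every announced certificate reaches all nodes after a fixed `network_delay`, and a node stays silent once it has observed at least `jury_size` lower certificates. TEE attestation and certificate verification are not modelled, and `poet_announcements` is a hypothetical name.

```python
def poet_announcements(
    wait_times: dict[str, float], jury_size: int, network_delay: float
) -> list[str]:
    """Return the nodes that announce a wait certificate, in order of wait time."""
    ordered = sorted(wait_times.items(), key=lambda item: item[1])
    announcers: list[str] = []
    announced_at: list[float] = []
    for node, wait in ordered:
        # Certificates this node has already observed when its own wait elapses.
        seen = sum(1 for t in announced_at if t + network_delay <= wait)
        if seen < jury_size:
            announcers.append(node)
            announced_at.append(wait)
    return announcers


waits = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}
print(poet_announcements(waits, jury_size=2, network_delay=0.0))  # ['a', 'b']
print(poet_announcements(waits, jury_size=2, network_delay=1.5))  # ['a', 'b', 'c']
```

In this model the jury consists of the `jury_size` lowest wait times; a larger network delay lets a few extra nodes announce before they learn that they have lost.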

5.2.3 Deterministic Random Jury

Another simplified approach is to use a single verifiable source of randomness instead of many. Thus, the drawn number elects the whole jury. This way, instead of having all nodes announce their number individually, the whole network would deterministically know who is part of the next jury. For example, a counter of the SADAN round could be concatenated with all of the identities of the previous jury and subsequently hashed as the source of randomness. This assumes a known list of participants and is prone to errors as it may be the case that elected nodes are currently not available, e.g., crashed.
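A minimal sketch of this idea follows, assuming SHA-256 as the hash and a known participant list. Ranking the participants by a per-node hash of the seed is our choice for turning one seed into a jury; the text does not specify this step. The names `round_seed` and `deterministic_jury` are hypothetical.

```python
import hashlib


def round_seed(round_counter: int, previous_jury: list[str]) -> bytes:
    """Hash the round counter concatenated with the previous jury's identities."""
    h = hashlib.sha256()
    h.update(round_counter.to_bytes(8, "big"))
    for identity in previous_jury:
        encoded = identity.encode()
        # Length prefix keeps the concatenation unambiguous.
        h.update(len(encoded).to_bytes(2, "big") + encoded)
    return h.digest()


def deterministic_jury(
    participants: list[str], round_counter: int, previous_jury: list[str], jury_size: int
) -> list[str]:
    """Every node derives the same jury from the same public inputs."""
    seed = round_seed(round_counter, previous_jury)
    ranked = sorted(participants, key=lambda p: hashlib.sha256(seed + p.encode()).digest())
    return ranked[:jury_size]


cars = [f"car-{i}" for i in range(12)]
print(deterministic_jury(cars, 7, ["car-1", "car-4"], 3))
```

No election messages are needed at all, but a juror that is offline still counts as elected, which is the availability drawback mentioned above.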

5.3 Consensus

5.3.1 Simple Majority

Keeping a consistent order of the jury decisions is crucial, as multiple simultaneous blame requests may occur that depend on each other. For example, one round may elect a juror that is expelled in another. However, if we assume that a new jury is elected every round and that every juror has a random number, we can extract an inherent order of requests. For two conflicting requests, there will be two separate elections with two separate juries. In such a case, the juries decide which request is executed first by comparing their election results. With the order being ensured, a simple majority vote among the jury suffices.

5.3.2 Byzantine Fault Tolerance (BFT)

BFT is used to find an ordered consensus among a group (see Section 2.2). As BFT will ensure consistency regarding the order of processed requests, an election of a new jury every round is not necessary. This way, we can keep an elected jury for a selectable time window. While this reduces the overhead involved with the election over multiple rounds, it induces overhead of BFT in every round.

Executing BFT with a random subset of the group is unique compared to traditional schemes. It is usually assumed every node participates in the agreement process; thus, if there are more Byzantine nodes than the scheme can formally endure, it is impossible for the process to succeed. However, in our case every election will have a diverse agreement group and may succeed where the previous jury failed. Therefore, a failed BFT agreement does not prevent progress in SADAN, as the failure to return a result can trigger a new election resulting in a new jury that is likely to proceed. However, we consequently need to consider an additional negative case that we label the total fail case, in which more than two-thirds of the jury are Byzantine, allowing for results that are incorrect. In such a case, the adversarial jurors are able to collude and enforce a malicious decision on the whole system. In Section 7.2 we examine the probabilities of both negative events.
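Both negative events can be quantified with the hypergeometric distribution if the jury is assumed to be drawn uniformly at random without replacement. The sketch below is our own back-of-the-envelope model, not the analysis of Section 7.2: a jury of size $k$ works as intended with up to $\lfloor (k-1)/3 \rfloor$ Byzantine jurors, fails totally with more than two-thirds Byzantine jurors, and stalls (triggering a re-election) in between. The example numbers are arbitrary.

```python
from math import comb


def jury_outcome_probabilities(
    network_size: int, adversaries: int, jury_size: int
) -> tuple[float, float, float]:
    """Return (P[ok], P[stall], P[total fail]) for one random jury."""
    total = comb(network_size, jury_size)
    tolerated = (jury_size - 1) // 3  # Byzantine jurors that BFT can endure
    ok = stall = total_fail = 0
    for x in range(min(adversaries, jury_size) + 1):
        # Number of juries containing exactly x adversarial jurors.
        ways = comb(adversaries, x) * comb(network_size - adversaries, jury_size - x)
        if 3 * x > 2 * jury_size:
            total_fail += ways
        elif x > tolerated:
            stall += ways
        else:
            ok += ways
    return ok / total, stall / total, total_fail / total


for k in (10, 20, 40):
    print(k, jury_outcome_probabilities(10_000, 1_000, k))
```

In this model, growing the jury shrinks the total-fail probability quickly, at the price of the higher consensus overhead discussed next.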

5.3.3 BFT with Enhancements

In Section 7.2 we show that with increasing jury size, the probability of failure decreases significantly. However, a larger jury also implies a larger overhead due to the message overhead of BFT. To counter this, different enhancements of PBFT can be employed to reduce complexity, for example the speculative case [60], which skips PBFT phases, or the optimistic case [31], which halves the consensus group. However, both reduce overhead only for the benign case. Another approach is to involve a message aggregation scheme [56] to significantly reduce message complexity. Thus, if we expect Byzantine events to be rare, it may be feasible to consider larger jury sizes with inherently better security guarantees. If we have a trusted component available, we can also employ a trusted monotonic counter, which removes the need for BFT’s prepare phase entirely and reduces the required quorum to half plus one nodes [95, 66].

6 Our Instantiation of SADAN

In this section we spawn an instance of SADAN with specific components. For this, we consider a smart traffic scenario [92] where individual vehicles collaborate by sharing sensor data. We specifically focus on GPS data, which is crucial for the vehicles to exchange, e.g., to avoid collisions. We use SADAN in this scenario to identify adversarial vehicles that send altered GPS data and endanger other vehicles. The vehicles themselves have numerous Electronic Control Units (ECUs), which are responsible for providing sensor data. We use remote attestation as the integrity validation scheme on the responsible ECU, to ensure tamper-free GPS data. Typically, the infotainment system is the most powerful computing device in the vehicle, but only responsible for uncritical tasks. The infotainment system also provides the interface to the outside world and acts as a proxy for the ECUs; they are connected via an in-vehicle bus like CAN [49] or FlexRay [50]. Therefore, it is well suited to execute both our random election scheme as well as the consensus protocol. We will expand on the specific platform we used for our evaluation in Section 8.1.

We chose components based on the analysis of options in Section 5. In the following we will describe our rationale for choosing the individual components and how they work in our instance.

6.1 Device Run-time Attestation

For integrity validation in our SADAN instance we chose Data Integrity Attestation (DIAT) [5], which targets trustworthy data exchange for collaborative autonomous vehicles. DIAT provides strong security guarantees, as it can detect static code modification as well as code-reuse attacks (including some non-control data attacks [43]), while working efficiently on embedded systems.

Within a vehicle, critical sensor data (in our scenario GPS data) is processed by various software modules on an ECU. To ensure that the GPS data has not been manipulated by compromised software on the ECU, DIAT tracks the execution path of all software modules that access the GPS data. Hence, any unintended modification of the GPS data is recorded and included in an attestation report. DIAT employs a security architecture, which can be software-based [54], hardware-based [12] or hybrid [16], to isolate all software modules and ensure the authenticity and integrity of the attestation report. Before the GPS data leaves the ECU it is augmented with the attestation report. The resulting message is integrity protected and authenticated using digital signatures.

Other vehicles that receive the GPS data can use the attestation report to validate that all modules that accessed the data only performed benign manipulations on it. If the validation of the report fails, the receiver will announce the failed attestation to the network, i.e., blame the sender. For this, the receiver constructs a blame message that includes the data it received, the associated attestation report and the identity of the sender device.

6.2 Random Jury Election via PoET

We decided to use an approach based on Intel’s PoET [1] for the random jury election. The scheme has two primary advantages over its alternatives. First, the election mechanism itself is not directly tied to any incentives. Second, the waiting concept of PoET allows us to significantly reduce the message complexity of the election.

While based on PoET, our scheme elects multiple jurors instead of one. For a jury of a chosen size, it works as follows:

  1. As soon as a node receives a blame message containing the attestation report, it will generate a random waiting time chosen from an exponential distribution and wait for the generated amount of time. The waiting time lies in the range between a chosen minimum and maximum.

  2. Each node will receive waiting certificates from other nodes and compile a Jury Election Leaderboard, a sorted list containing the waiting certificates with the lowest observed waiting times, with at most as many entries as there are jurors to elect. When receiving a waiting certificate, the node will first check if the certificate has merit, i.e., if its waiting time is lower than the largest entry in the leaderboard or if the leaderboard is not yet full. Nodes refrain from validating and forwarding certificates that do not have merit at that time. Otherwise, the node will check the validity of the certificate and add it at the corresponding position in its leaderboard. If the leaderboard then holds more entries than there are jurors to elect, the last entry is removed so that it is exactly full again.

  3. After a node has finished waiting, it will check if its leaderboard contains fewer waiting certificates from other nodes than there are jurors to elect, or if its own waiting time is smaller than the largest entry in its leaderboard. If its own certificate has merit, it will announce it to the network and add it to its leaderboard. If not, it will discard its certificate.

  4. After a node has additionally waited for a pre-defined time threshold, it will assume its leaderboard to be mostly complete. If the node’s own certificate is still in the leaderboard, it will assume that it is part of the jury. If so, it will start the agreement process with the other jurors in its leaderboard.

As long as the adversary cannot partition the network for long periods, all nodes will converge towards an identical Jury Election Leaderboard consisting of the waiting certificates with the smallest waiting time (we elaborate more in Section 7.1).
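The leaderboard handling in the steps above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the names `WaitCertificate`, `Leaderboard`, `offer` and `draw_wait_time` are hypothetical, certificate validation is reduced to a caller-supplied predicate, and redrawing until the sample fits the allowed range is only one assumed way to bound the exponential waiting time.

```python
import bisect
import random
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class WaitCertificate:
    wait_time: float   # waiting time attested by the certificate
    node_id: str       # hypothetical node identifier


def draw_wait_time(rng, t_min, t_max, mean):
    """Draw an exponentially distributed wait, redrawing until it fits the range."""
    while True:
        t = t_min + rng.expovariate(1.0 / mean)
        if t <= t_max:
            return t


@dataclass
class Leaderboard:
    jury_size: int
    entries: list = field(default_factory=list)  # kept sorted, shortest wait first

    def has_merit(self, cert):
        if cert in self.entries:
            return False  # already known, nothing to add or forward
        if len(self.entries) < self.jury_size:
            return True   # leaderboard not yet full
        return cert.wait_time < self.entries[-1].wait_time

    def offer(self, cert, is_valid=lambda c: True):
        """Return True if the certificate was added (and should be forwarded)."""
        if not self.has_merit(cert) or not is_valid(cert):
            return False
        bisect.insort(self.entries, cert)
        if len(self.entries) > self.jury_size:
            self.entries.pop()  # drop the entry with the longest wait
        return True

    def is_juror(self, node_id):
        return any(c.node_id == node_id for c in self.entries)
```

Checking merit before validity mirrors the order described in step 2, so that certificates that cannot enter the leaderboard are never validated or forwarded.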

6.3 Byzantine Agreement

To find a consensus among the jurors about a blamed node, we decided to use BFT. This eliminates the need to do an election on every blame, significantly reducing the overhead of the elections over multiple rounds. Especially if multiple adversaries are detected in quick succession, BFT can have a higher throughput. We decided to implement plain PBFT [21] and refrained from implementing any BFT enhancements. Other works in the BFT literature compare their performance against PBFT, so it can be considered the baseline. We decided to use this baseline rather than introducing any further assumptions or performance considerations of the alternatives.

Our scheme works as follows, using PBFT as a subprotocol:

  1. After the election, the Jury Election Leaderboard has a sorted list of the lowest wait times for each juror. The juror with the lowest wait time will be the primary.

  2. On conflicting blame requests and elections, the jury containing the overall shortest waiting time is selected for the next round. Thus, the initial round for a new jury can skip the prepare phase entirely.

  3. If the primary acts unambiguously incorrectly, jurors will blame the primary in addition to triggering a view-change.

  4. After a successful prepare phase all jurors will proceed to individually validate the attestation report included in the agreed-on blame message.

  5. The jurors will execute the commit phase to agree on the attestation result and the measures to be taken, i.e., the jury decision (see Section 9).

  6. The reply from each juror is broadcast to the entire network, containing at least two thirds of all jurors’ signatures. The rest of the network can consider each valid and consistent decision message on the same blame to be equivalent. This avoids separately spreading a distinct but inconsequentially different decision message for every juror.

As we execute the protocol only on a subset of nodes, the protocol can fail with respect to either safety or liveness. However, in our application a safety violation is far more problematic than a liveness violation, since a failure to reach consensus can be rectified by requesting a new jury. We will discuss these probabilities in Section 7.2.
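Three of the rules listed above lend themselves to a short sketch: choosing between competing juries, determining the primary, and deciding whether a broadcast decision carries enough juror signatures. The function names and the representation of a jury as a list of `(wait_time, node_id)` pairs are assumptions made for illustration; signature verification is assumed to have happened before `signers` is passed in, and the PBFT phases themselves are not modelled.

```python
def select_jury(juries):
    """Among competing juries, pick the one holding the overall shortest wait time."""
    return min(juries, key=lambda jury: min(wait for wait, _ in jury))


def primary_of(jury):
    """The juror with the lowest wait time acts as primary."""
    return min(jury)[1]


def decision_accepted(signers, jury):
    """Accept a decision only if at least two thirds of the jurors signed it."""
    members = {node for _, node in jury}
    valid = set(signers) & members          # ignore signatures from non-jurors
    return 3 * len(valid) >= 2 * len(members)
```

Comparing `3 * len(valid)` against `2 * len(members)` avoids floating-point rounding when the jury size is not divisible by three.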

6.4 Communication Aspects

Two communication aspects have implications for the performance of our instance, and thus for the evaluation in Section 8. As this is not a primary focus of this work, we discuss alternatives on these aspects briefly in Section 9. The first aspect is how we broadcast messages. We implemented a flooding-based broadcasting protocol. Every node forwards broadcast messages to all neighbors, except the one from which the message was originally received. This way, a message will take the optimal paths, and thus flooding is optimal regarding run-time.
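To make the cost of flooding concrete, the following sketch floods one message through a small square mesh, the topology used in the simulation setup of Section 8.2, and reports the time until the last node is reached together with the number of transmissions. It is an illustrative model only: the function names are hypothetical, every hop is assumed to take the same fixed delay, and processing time is ignored.

```python
from collections import deque


def mesh_neighbors(node, side):
    """Yield the up-to-four neighbors of a node in a side x side mesh."""
    row, col = node
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        n_row, n_col = row + d_row, col + d_col
        if 0 <= n_row < side and 0 <= n_col < side:
            yield (n_row, n_col)


def flood(side, source, hop_delay_ms=5):
    """Return (latency in ms until the last node is reached, messages sent)."""
    hops = {source: 0}
    queue = deque([(source, None)])   # (node, neighbor the message came from)
    messages = 0
    while queue:
        node, came_from = queue.popleft()
        for neighbor in mesh_neighbors(node, side):
            if neighbor == came_from:
                continue              # never send back to the sender
            messages += 1
            if neighbor not in hops:  # first copy: remember and forward later
                hops[neighbor] = hops[node] + 1
                queue.append((neighbor, node))
    return max(hops.values()) * hop_delay_ms, messages


latency_ms, sent = flood(side=3, source=(0, 0))
print(latency_ms, sent)
```

Each node forwards only the first copy it receives, so the latency follows the shortest path while duplicate copies still count towards the message overhead.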

Further, to reduce message overhead in terms of bytes actually sent, we used a collective signature scheme. In the consensus phase all jurors have to individually consent by providing their own signatures. As we evaluate different jury sizes, we decided to implement the Schnorr signature scheme. This way, adding further signatures to a message does not result in increasingly larger BFT messages.

7 Security Evaluation

In this section we evaluate SADAN’s security and analyze the probability that the random jury election fails. We will show how different parameters affect SADAN and provide the foundation for selecting a feasible configuration.

7.1 Security Consideration

The adversary’s goal is to either evade being identified (detected) by SADAN, or to misuse SADAN to manipulate the overall system, e.g., by having benign devices considered malicious by the system and sanctioned, or both simultaneously. Subsequently, we will individually explain each goal and why it cannot be achieved by the adversary.

Evade identification.

To evade the identification of nodes controlled by the adversary, the adversary can follow different strategies: (1) try to prevent being detected initially, (2) prevent being blamed, or (3) prevent that an agreement is found identifying the adversary-controlled node. (Footnote 6: The adversary can also try to prevent being sanctioned by the system; however, the overall system’s reaction is not the focus of this work.)

Strategy 1: To avoid initial detection the adversary can (a) stop interacting with the overall system and not participate in the integrity validation, or (b) behave correctly according to the integrity validation scheme used. By not interacting with the system the adversary isolates itself, while not answering integrity validation requests will ultimately lead to the conclusion that the node is not behaving correctly. Given an appropriate integrity validation scheme, the adversary will only pass validation by behaving correctly, in which case the system is not endangered by it. Otherwise, (c) the adversary has to break the integrity validation scheme.

This means the adversary only succeeds when one of the assumptions of the integrity validation scheme is violated, i.e., SADAN is secure with respect to the first attack strategy as long as the assumptions of the used integrity validation scheme hold.

Strategy 2: Once an adversary-controlled node has been recognized by another node, this node will send out a blame message to inform the network. To prevent this, the adversary (a) can compromise the blamer node, (b) suppress the communication from the blamer node, or (c) vilify the blamer node.

The adversary would need to compromise the blamer before it is able to send out the blame message, which we consider out of scope (cf. Section 3.1). However, even if the adversary manages to compromise the blamer node, this node will eventually be verified itself and reported to be compromised.

In order to suppress the communication of the blamer node the adversary needs to control all communication channels of the blamer node. (Footnote 7: According to our assumptions (cf. Section 3) the adversary cannot prevent the broadcast of messages.)

Lastly, the adversary might try to discredit the blamer so other nodes will not believe the blame, i.e., the compromised node will broadcast a blame message accusing the blamer node. In this situation both nodes will be examined by a jury uncovering the real adversary.

Hence, the adversary only succeeds by preventing the broadcast of blame messages, i.e., SADAN is secure against the second attack strategy as long as the assumptions hold that the adversary does not have complete control over the network (cf. Section 3).

Strategy 3: Finally, the adversary can try to prevent that the network finds agreement regarding the compromise of a node. The adversary can (a) try to sabotage the election/forming of a jury, (b) control sufficiently many members of the jury, (c) prevent interaction between jury members, or (d) prevent the proclamation of the result to the network. Finally, (e) the adversary can distort the random jury election process to cause inconsistencies within the network that will affect the decision making in the subsequent consensus phase.

To sabotage the jury election and forming, the adversary needs to prevent communication in the network, which we deem out of scope (cf. Section 3.1). The adversary could also subvert the nodes to be part of the jury, e.g., to shut them off. However, with high probability (cf. Section 7.2), some nodes will be elected that are not compromised by the adversary.

In order to control a sufficiently large number of jury members the adversary can compromise the jury members on-demand once they are elected. This, however, requires the adversary to be able to rapidly compromise many nodes of the network, which contradicts our adversary model (cf. Section 3.1). Otherwise, the adversary has to break the random jury election scheme to reliably get nodes that are under its control to be elected as jurors. Since the jury is randomly selected there is a chance that the adversary-controlled nodes get elected. As we show in the subsequent section (Section 7.2) this chance is negligible with the right choice of parameters.

To prevent the benign jurors from finding an agreement the adversary can disturb their communication, which again, means that the adversary would need to control large parts of the network, violating our networking assumptions.

Furthermore, the adversary could try to prevent the jury from announcing the agreed-on result to the network, which also means that the adversary needs to control the network communication.

Finally, the adversary could try to manipulate the jury election process in order to prevent devices in the network from learning the correct list of jury members. As a consequence, these devices would not accept the decision of the jury, leading to inconsistencies between different nodes of the network. However, this would require an adversary that can permanently prevent the wait certificates of legitimate jury members from arriving at selected devices. Given that the random jury election scheme provides the guarantee that the elected jury is eventually known to the entire network, all devices will eventually accept the decision made by the legitimate jurors as soon as they learn the list of legitimate jurors and receive the consistent decisions of a sufficiently large number of those jurors. Even if some devices do not learn the decision of the jury, i.e., have differences in their Jury Election Leaderboards due to waiting certificates being withheld by the adversary, this will have the effect of reducing the fault-tolerance of the Byzantine agreement to follow: nodes with the shortest waiting times that are missing from a node’s leaderboard are replaced by other nodes from outside this set, essentially manifesting as an additional fault. Thus, the security of SADAN depends on the security provided by the used schemes.

Hence, in order for the adversary to succeed with strategy 3 it has to break one of the used schemes (random jury election, integrity validation or consensus finding), control the network communication of the entire (or at least large parts of the) system, or be able to quickly compromise all jury members. Each of these attacker capabilities violates our system and adversary model.

Manipulate system.

The adversary can also try to manipulate the system by misusing SADAN. In particular, by blaming benign nodes the adversary can try to get them sanctioned, e.g., excluded from the network to increase its own share of the network. However, to achieve this the adversary has to alter the integrity validation report of a benign node to convince the jury that the node is compromised. This means the adversary has to break the integrity validation scheme, in particular the authentication method used, by either (a) extracting a secret from a shielded location or (b) breaking a cryptographic primitive like signatures.

Alternatively, the adversary can aim to gain control over a decision making majority of the jury to come to a malicious agreement that will be accepted by the entire network. Here the same arguments hold as discussed above for strategy 3b and the probability of success by chance analyzed below (Section 7.2).

In summary, the adversary can only misuse SADAN when breaking one of the underlying schemes or with negligible probability, and thus SADAN fulfills the requirement for Adversary Detection and Identification (cf. Section 3.2).

7.2 Probabilistic Analysis

SADAN includes the random election of a jury of fixed size that has temporary authority to jointly make a decision over a blamed node. The joint decision is based on a Byzantine agreement, which means it fails if too many jurors are adversarial; PBFT tolerates fewer than one third of faulty participants [21]. As our scheme randomly elects a small set of nodes as jurors, it may happen that enough adversarial nodes are elected for the Byzantine agreement to fail. This section discusses the probabilities with respect to the adversary share in the network as well as the chosen jury size. While a larger jury reduces the chances of a failed election, BFT also induces a message complexity that grows quadratically with the jury size. In Section 8.6 we will evaluate this effect in our simulation.

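If we have $n$ total nodes in our system, with $a$ of them being adversaries, and elect $j$ jurors, the probability of electing at least $k$ adversarial nodes is:

$$P(X \ge k) = \sum_{i=k}^{j} \frac{\binom{a}{i}\binom{n-a}{j-i}}{\binom{n}{j}} \qquad (1)$$

Equation 1 is the upper tail of the cumulative distribution function of the hypergeometric distribution; it can equivalently be written in closed form using the generalized hypergeometric function.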
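The probability of electing at least a given number of adversarial jurors can also be evaluated numerically by summing the hypergeometric probabilities directly. The following is a minimal sketch using exact integer binomials; the function and parameter names are hypothetical, and the numbers in the usage line are arbitrary illustration values, not parameters taken from the evaluation.

```python
from math import comb


def prob_at_least(total, adversaries, jury, k):
    """P(at least k of `jury` nodes drawn without replacement are adversarial)."""
    upper = min(jury, adversaries)
    ways = sum(
        comb(adversaries, i) * comb(total - adversaries, jury - i)
        for i in range(max(k, 0), upper + 1)
    )
    return ways / comb(total, jury)


# Illustration only: 10% adversaries, 22 jurors, at least 8 of them adversarial.
print(prob_at_least(total=10_000, adversaries=1_000, jury=22, k=8))
```

Because `math.comb` works on arbitrary-precision integers, the sum stays exact even for large networks and is only converted to a floating-point number in the final division.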

While this equation models the probability for the Byzantine agreement to fail, we can rectify a liveness violation by re-election, as described in Section 6.3. In some applications, it may make sense to accept reduced fault-tolerance by increasing the message-count threshold used by PBFT beyond its default value. Then, fewer faults are sufficient to cause a liveness violation, but a safety violation requires a greater number of faults. If the protocol reaches an impasse, another consensus round, including a new jury, is started that may succeed. This can be modelled as a Markov chain: we begin in an initial “undecided” state and transition to a “success” state if few enough adversarial nodes are elected to guarantee agreement, and to a “failure” state if enough adversarial nodes are elected to allow a safety violation. Writing $p_s$ and $p_f$ for the per-election probabilities of these two transitions, the failure state will eventually be reached with probability

$$\frac{p_f}{p_s + p_f}$$

and it will take on average $1/(p_s + p_f)$ elections to leave the “undecided” state.
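The re-election chain described above can be evaluated with a few lines of code. In this sketch the two thresholds are passed in directly, since they depend on the chosen PBFT message-count threshold: `max_ok` is assumed to be the largest number of adversarial jurors for which agreement is still guaranteed, and `min_fail` the smallest number that allows a safety violation. All names are hypothetical.

```python
from math import comb


def prob_between(total, adversaries, jury, lo, hi):
    """P(lo <= X <= hi), where X is the number of adversarial jurors elected."""
    lo, hi = max(lo, 0), min(hi, jury, adversaries)
    ways = sum(
        comb(adversaries, i) * comb(total - adversaries, jury - i)
        for i in range(lo, hi + 1)
    )
    return ways / comb(total, jury)


def eventual_outcome(total, adversaries, jury, max_ok, min_fail):
    """Return (probability of eventual failure, mean number of elections)."""
    p_success = prob_between(total, adversaries, jury, 0, max_ok)
    p_failure = prob_between(total, adversaries, jury, min_fail, jury)
    p_leave = p_success + p_failure   # chance to leave "undecided" per election
    if p_leave == 0:
        raise ValueError("no election can ever leave the undecided state")
    return p_failure / p_leave, 1.0 / p_leave
```

When `min_fail` equals `max_ok + 1` there is no undecided outcome, so the mean number of elections is one; widening the gap lowers the eventual failure probability at the price of more re-elections.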

Figure 2: The probability of eventual safety violation of Byzantine agreement with a population size of given a threshold of (a) and (b) , as well as the mean number of juries needed before agreement terminates, whether in success or total failure, for in Figure (c). The distinctly colored graphs depict the probability development for different jury sizes . Note that the case depicted in (a) will always either terminate or suffer a safety violation with a single jury election, unlike that in (b) and (c) where several juries may be necessary.

Besides the threshold, a primary factor affecting the probability of an eventual safety violation is the jury size. The more jurors are elected per round, the lower the probability for the Byzantine agreement to fail. We illustrate the influence of the jury size and BFT threshold in Figure 2. Increasing the threshold drastically reduces the probability of eventual failure; however, when the adversarial share is large, it increases the number of consensus failures before agreement is finally reached. The choice of jury size and threshold is therefore application-dependent, depending upon the appropriate trade-off between failure probability, time to reach agreement, and performance. We evaluate the latter of these considerations in Section 8.6.

8 Performance Evaluation

In this section we evaluate our SADAN instance for the smart traffic scenario (cf. Section 6). After describing our reference system, we describe the setup for our large-scale simulation campaigns. The results of these campaigns will be examined in the subsequent sections. We first analyze the effects of differently chosen wait time parameters. These parameters need to be chosen carefully to ensure the random election is consistent. Afterwards, we examine the scalability of our SADAN instance for large networks regarding run-time and messaging overhead, showing sub-linear run-time growth with regard to network size. Finally, we show how this performance is affected by choosing different jury sizes.

8.1 Reference System Measurements

As explained in Section 6 we use a vehicle’s infotainment system as the central platform for SADAN. We also chose to use the original PoET implementation. Intel offers special processors for infotainment systems in vehicles, namely the Intel Atom A3900 Series [48]. However, this line of processors is not openly purchasable. We therefore used a processor of the Intel N5000 series, which is based on the same architecture as the A3900, as our reference system. It is thus representative of the processing power found in a car. To evaluate the attestation scheme, we rely on the performance measurements reported for Data Integrity Attestation (DIAT) [5], specifically the attestation of the GPS module.

Step     | wait certificate generation | attestation generation | attestation validation | BFT process + Schnorr-sign
Run-time | 89 ms                       | 835 ms                 | 849 ms                 | 1 - 4 ms

Message  | blame message | waiting certificate | decision message | BFT message
Size     | 4276 Bytes    | 192 Bytes           | 184 Bytes        | 188 Bytes

Table 1: The measured run-times and message sizes of individual processing steps

The top half of Table 1 shows our run-time measurements; the attestation numbers are taken from DIAT [5]. The run-time of BFT processing and Schnorr signing fluctuates depending on jury size, so for the simulation described in Section 8.2 we chose to use the worst case (5 ms). The bottom half of Table 1 shows the sizes of the messages being sent out. The blame message contains the attestation report, 4096 Bytes in size, as reported in DIAT [5]. The wait certificate itself is 140 Bytes in size. The BFT and decision messages also contain a 128 Bytes long Schnorr signature. All message sizes include 40 Bytes of TCP and IP headers.

8.2 Simulation

To evaluate the performance of SADAN for large numbers of devices, we used the OMNeT++ network simulator [74]. We implemented SADAN at the application layer and used the measurements described in Section 8.1 to set the processing times for the individual steps taken by each node. The communication delay between any two devices was set to 5 ms. We argue this is reasonable, as 5 GHz Wi-Fi can already provide such latencies today, and the upcoming 5G cellular communication technology promises latencies below 1 ms [39, 77].

The network is configured in a square mesh topology, with roughly the same height and width. Every node has four links to its neighbors, except the nodes at the edge of the network. There is a simple on-demand routing algorithm in place, but we consider the overhead for calculating routes out of scope; therefore, it does not contribute to the run-time measurements.

We evaluate SADAN for different network sizes, from 1000 to 100000 nodes. We also split measurements into the different phases. We simulate the first round of our SADAN instance with the following phases (Footnote 8: Subsequent rounds will be faster, as no election is needed.):

  1. Initial Attestation: Generation of the initial attestation report by the blamed node and the integrity validation of this report by the blamer.

  2. Blame: Broadcasting the initial blame message.

  3. Election: The election process to elect jurors.

  4. BFT: The Byzantine Fault Tolerance scheme, including the validation of the attestation report by each juror.

  5. Decision: Broadcasting the outcome of the BFT.

Notice that this represents the worst case, i.e., the upper bound regarding run-time and message overhead, as it includes both the election and the complete BFT. Further, to minimize variation of individual simulation runs, due to the random nature of our scheme, we average every individual parameter configuration over 30 runs with different random numbers.

8.3 Election Wait Time

We evaluate the time parameters and , as they contribute significantly to the performance characteristics of SADAN. is the maximum wait time regarding the randomly chosen wait time for each node. After a node is done waiting, it will wait an additional time while collecting other waiting numbers. Afterwards, it will assume the election to be mostly complete, i.e., to have a mostly matching . If is chosen very small, the election itself will be faster; however, the individual s may also be still inconsistent among the nodes. To measure this effect we executed a parameter study for differently chosen time parameters with , and .

Figure 3 shows the results. In (a) we can see the effect on the execution time of one round. It is primarily tied to , as can be seen if comparing different that result in the same . The graph (b) in turn shows how many unjustified BFT messages were received. While a lower reduces the execution time, it also increases the number of nodes falsely assuming to be jurors.

Figure 3: (a) The time for one round, (b) the total number of non-juror BFT messages, (c) the average number of election messages per node and (d) the total number of reached decisions, all for differently chosen and . Simulated with , .
Figure 4: (a) The completion time of each phase, (b) the average number of messages sent per node and (c) the average data sent per node, all split into the individual phases for increasing . Simulated with .

Figure 3 (c) shows the average number of messages per node for the election phase. This shows how many nodes actively participate in the election, i.e., nodes assuming their wait time still has merit after they have finished waiting. Graph (d), in turn, shows how many jurors get to the point of sending out a decision. This measurement should ideally match the chosen jury size, so 22 in this case. Yet, it can be seen that if the time parameters are chosen too small, only part of the jury, or none of it, can reach a decision, as the nodes’ leaderboards will diverge to the point where no quorum can be established among the jury.

With these measurements in mind, we chose a time parameter configuration for our further evaluation, keeping the discovered trade-offs in mind: and for . We used a dynamically adjusted configuration per network size, based on the worst route a message can take. As we employ a square mesh network with a communication delay of 5ms for one hop, we use and .

8.4 Per-Phase Performance for Large Networks

In this section we examine the Efficiency and Scalability of SADAN, two main requirements (cf. Section 3.2). Figure 4 (a) shows the run-time measurements. Note that the measurements per phase are denoted as the absolute simulation time at the last processed message of the respective phase—phases overlap as progress is made in parallel. The top purple line represents the time of the last received decision message in the network, and thus the total time for one entire SADAN round. A network of takes .

A naive and simplified solution to the problem would be to let all devices attest every other device individually. The time this case takes for nodes can be expressed as . This does not take communication delay into account and assumes perfect parallelization between the nodes. This naive case would take over 14 minutes for and over hours for .
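As a rough plausibility check of the naive all-pairs baseline, the following sketch estimates its duration from the attestation timings in Table 1. The exact expression used in the paper is not reproduced here; the model below is an assumption: every node generates its own report once (835 ms) and then validates the reports of all other nodes one after another (849 ms each), with all nodes working in parallel and no communication delay.

```python
ATTEST_GENERATION_S = 0.835   # attestation generation, Table 1
ATTEST_VALIDATION_S = 0.849   # attestation validation, Table 1


def naive_all_pairs_seconds(n):
    """Assumed model: one report generation plus n - 1 sequential validations."""
    return ATTEST_GENERATION_S + (n - 1) * ATTEST_VALIDATION_S


for n in (1_000, 100_000):    # smallest and largest simulated network sizes
    seconds = naive_all_pairs_seconds(n)
    print(f"{n} nodes: {seconds / 60:.1f} min ({seconds / 3600:.1f} h)")
```

Under this assumed model the cost grows linearly with the network size, and a network of 1000 nodes already needs more than 14 minutes.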

Figure 4 (a) also shows the individual measurements per phase. The third line in green shows how long the election takes. The time for the election overlaps with the blame broadcast, implying that the election requires the most time of the scheme. The red line, second from the top, shows when the BFT is finished. Figure 4 (b) and (c) show the message overhead per phase both in terms of count and sent out bytes. Note that we consider all messages for these measurements, i.e., including forwarded messages by nodes in between the route. Both graphs reveal that the election phase (green bars second from the left respectively) generates the most overhead both in message count and size in bytes. The total message overhead for is 305.69 messages (71.58kB) per node. However, assuming a subsequent round with a jury already in place, the total message overhead without the election phase is reduced to 9.84 messages (16.10kB) per node. We also simulated malicious blames in the same manner and evaluate it subsequently.

8.5 Malicious Blame Evaluation

Figure 5: The completion time of each phase on a malicious blame. Simulated with

In this section we examine how a malicious, i.e., unjustified, blame affects our simulation. In every round our blamer is an adversary, and thus will be blamed itself when its original blame turns out to be unjustified. For this, we extend the definition of the individual phases defined in Section 8.2.

After the initial BFT phase, we additionally define:

  1. BlamerAtt: The primary requesting an attestation report from the blamer, generation of the blamer’s attestation report and distribution of this report to the other jurors.

  2. BlamerBFT: The second BFT round, including the validation of the blamer’s attestation report by each juror.

  3. Decision: Broadcasting the outcome of the second BFT.

Figure 5 shows the runtime measurements. A network of takes for a malicious blame compared to for a benign blame. Considering the first BFT round takes around 4 seconds, the attestation report generation as well as validation takes around 2 seconds and some routing overhead, these results do not deviate from our expectations.

8.6 Jury Size

The following examines the effects of differently chosen jury sizes on our instance of SADAN. Figure 6 (a) shows the run-time for one round. Even for large juries, like , in a large network, like , the difference on the run-time compared to is only (or 13.1%). This is due to the individual BFT steps being able to execute in parallel.

Figure 6: (a) The time for one round and (b) the average number of BFT messages sent per node for different .

The second graph (b) shows the average message count per node of the BFT phase. The message complexity for two BFT phases is apparent. Nevertheless, the closest case we could find to compare the election overhead against the BFT overhead is and . Here the average message overhead per node for the election is 632.21 against the 417.95 for the BFT message overhead. Thus, one BFT round is more efficient than an election in overall terms. However, BFT also concentrates the overhead on the jurors and the routes between them, compared to the more uniformly distributed overhead by the election.

9 Discussion

This section discusses possible extensions to SADAN.


Our instance of SADAN uses Flooding for broadcasting. However, in terms of message complexity a Gossip protocol may be advantageous in many use cases. Gossip protocols randomly send broadcast messages to a set number of neighbors, which in turn do the same [61]. These protocols are probabilistic in nature, yet, perform well on average and significantly reduce overhead compared to flooding [61].
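A push-style gossip broadcast of the kind described above can be sketched as follows. This is a generic illustration and not a protocol taken from [61]: the function name, the adjacency-list representation and the rule that each node forwards only the first copy it receives are assumptions.

```python
import random


def gossip(neighbors, source, fanout, rng):
    """Push gossip: every newly reached node forwards to `fanout` random neighbors.

    Returns the set of reached nodes and the number of messages sent.
    """
    reached = {source}
    frontier = [source]
    messages = 0
    while frontier:
        next_frontier = []
        for node in frontier:
            peers = neighbors[node]
            for peer in rng.sample(peers, min(fanout, len(peers))):
                messages += 1
                if peer not in reached:
                    reached.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
    return reached, messages


# Example: a ring of six nodes, each forwarding to both of its neighbors.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(gossip(ring, source=0, fanout=2, rng=random.Random(0)))
```

With a fanout smaller than the node degree the number of messages drops, but delivery to every node is no longer guaranteed, which is the probabilistic trade-off mentioned above.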


An aspect that could be changed is the on-demand nature of the integrity validation. For example, one could have all nodes regularly check all their neighbors instead. This way, after a set time span every node would be validated, so even a passive adversary cannot hide.

Dynamic Jury.

Further, the jury size does not have to be fixed over the life span of a SADAN instance. It might be advantageous to dynamically adjust the jury size when required, e.g., to increase the jury size when many blames occur in a short time frame. This would adapt the security probabilities to the network's needs at the time.
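One conceivable policy for such an adjustment is sketched below. It is purely illustrative: the sliding window, the threshold, the step size and the class name `DynamicJurySizer` are assumptions of this example and not part of SADAN. The sketch also assumes that blames are recorded in chronological order.

```python
from collections import deque


class DynamicJurySizer:
    """Hypothetical policy: grow the jury while blames are frequent."""

    def __init__(self, base_size, max_size, window, threshold, step):
        self.base_size = base_size  # jury size under normal conditions
        self.max_size = max_size    # upper bound on the jury size
        self.window = window        # length of the sliding window in seconds
        self.threshold = threshold  # blames tolerated before the jury grows
        self.step = step            # extra jurors per blame above the threshold
        self._blames = deque()      # timestamps of recent blames, oldest first

    def record_blame(self, timestamp):
        """Remember when a blame occurred."""
        self._blames.append(timestamp)

    def jury_size(self, now):
        """Return the jury size to use for an election held at time `now`."""
        # Forget blames that fell out of the sliding window.
        while self._blames and self._blames[0] < now - self.window:
            self._blames.popleft()
        excess = max(0, len(self._blames) - self.threshold)
        return min(self.max_size, self.base_size + self.step * excess)


# Assumed example values: four blames within 30 seconds.
sizer = DynamicJurySizer(base_size=7, max_size=31, window=60.0, threshold=2, step=4)
for t in (0.0, 10.0, 20.0, 30.0):
    sizer.record_blame(t)

size_during_burst = sizer.jury_size(now=30.0)   # two blames above the threshold
size_after_burst = sizer.jury_size(now=200.0)   # all blames have expired
print(size_during_burst, size_after_burst)
```

The jury grows only while the burst of blames lasts and falls back to the base size once the window has passed, so the additional BFT overhead is paid only when the network appears to be under attack.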

Final Decision.

Part of a practical instantiation of SADAN is the jury's reaction to a confirmed adversarial node. This is itself a complex topic and highly dependent on the use case. Nevertheless, this section sketches some possible approaches for practical implementation. Simply expelling the adversary upon the jury's decision is not feasible for many use cases, e.g., autonomous cars.

An interesting approach would be a form of self-healing mechanism. If a node is deemed adversarial, it could be repaired, validated again and reintegrated into the network. For example, every node could be equipped with a dead man's switch, i.e., a device that requires regular interaction and otherwise triggers some recovery functionality. This dead man's switch would be implemented as a trusted component that needs regular approval from peers in the network. If a node is deemed adversarial by a jury, all benign nodes would refrain from sending the regular keep-alive signal, which in turn activates the dead man's switch and forces a secure reset of the device. Additionally, a form of software hardening is conceivable, e.g., randomization of the memory layout of a node's running software. With a confirmed adversary in the network, all nodes could also preventively execute such a hardening.
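The dead man's switch logic can be sketched as a small state machine. This is a simplified illustration under stated assumptions: the class and method names, the timeout and the number of required approvals are hypothetical, time is passed in explicitly instead of being read from a trusted clock, and the authentication of keep-alive signals (which a real trusted component would need) is omitted.

```python
class DeadMansSwitch:
    """Hypothetical trusted component that resets the device without peer approval."""

    def __init__(self, timeout, required_approvals):
        self.timeout = timeout                        # maximum time without approval
        self.required_approvals = required_approvals  # distinct peers needed per approval
        self.last_approved = 0.0                      # time of the last full approval
        self.pending = set()                          # peers that approved since then
        self.reset_count = 0                          # number of forced resets so far

    def keep_alive(self, peer_id, now):
        """Record a keep-alive signal; signature checks are omitted in this sketch."""
        self.pending.add(peer_id)
        if len(self.pending) >= self.required_approvals:
            self.last_approved = now
            self.pending.clear()

    def tick(self, now):
        """Return True if the timeout expired and a secure reset was triggered."""
        if now - self.last_approved > self.timeout:
            self.reset_count += 1
            self.last_approved = now  # the device restarts with a fresh timer
            self.pending.clear()
            return True
        return False


switch = DeadMansSwitch(timeout=10.0, required_approvals=2)
switch.keep_alive("peer-a", now=3.0)
switch.keep_alive("peer-b", now=4.0)      # second approval renews the timer at t=4
still_running = not switch.tick(now=12.0)  # 8 time units since approval: no reset
# After a jury verdict, benign peers stop sending keep-alive signals.
was_reset = switch.tick(now=15.0)          # 11 time units since approval: reset
print(still_running, was_reset)
```

Requiring approvals from several distinct peers means a single colluding neighbor cannot keep a compromised device alive on its own.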

10 Related Work

Collective Attestation.

The first step towards scalable attestation of large groups of interconnected devices, i.e., collective attestation, was made by SEDA [13]. It proposes a scheme that spans a tree over the network topology to enable efficient aggregation of static attestation reports. Building on this approach, numerous enhancements have been proposed. SANA [6] enables anyone to verify the attestation reports while removing the need for trusted hardware for the aggregation of reports. SEED [46] proposes a non-interactive trigger for the attestation, based on a secure timer, to prevent DoS attacks. WISE [7] proposes to reduce overhead after the initial complete attestation by attesting only a subset of the network. SALAD [58] targets highly dynamic networks and works under frequent network partitions. DARPA [45] additionally mitigates physical attacks by introducing an unforgeable heartbeat. It assumes that the physical attacker has to take devices offline for a minimum amount of time. SCAPI [57] extends this approach by regularly updating session keys.

However, all collective attestation protocols rely on a central authority, the so-called verifier, which has to supervise the attestation and needs to be trusted by the entire network.

Anomaly Detection.

Another approach to identify adversaries in a network is to detect abnormal behavior. In the field of Wireless Sensor Networks this is achieved by recognizing outliers in the aggregated sensor data of all sensors in the network [98, 15]. Among others, one source of these outliers can be malicious attacks [98]. This, however, assumes comparable data among the nodes.

Other works have focused on detecting anomalies in a system via statistics, pattern recognition and machine learning. This approach is thoroughly examined in the field of Intrusion Detection Systems, which have gained much attention in recent years [65, 44, 88, 82]. However, these approaches assume a central entity conducting the analysis. Further, they target detection of specific attacks, such as a Botnet participating in DDoS attacks.

Byzantine Fault Tolerance (BFT) with Random Committee.

Most notable in this domain are Algorand [40] and Byzcoin [56]. They use a similar strategy to our work by selecting a subset of nodes in the network as the consensus group. This allows BFT to be used in a scalable way to construct a cryptocurrency. We already thoroughly examined Algorand in Section 5.2.1. To select the consensus group, Byzcoin requires nodes to mine blocks via Proof-of-Work (PoW). Then, a chosen number of the most recent successful miners form the group executing BFT. However, this approach is not feasible for our purposes. For example, in a heterogeneous network, less powerful nodes have a significant disadvantage in the election. Further, an adversary can use a powerful external machine to exceed the processing power of the entire network.
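The last objection can be made concrete with a short calculation. It assumes, as is standard for PoW, that a participant's chance of mining the next block is proportional to its share of the total hash rate. The node names and hash rates below are hypothetical.

```python
def pow_win_probabilities(hash_rates):
    """Return each participant's chance of mining the next block."""
    total = sum(hash_rates.values())
    return {name: rate / total for name, rate in hash_rates.items()}


# Hypothetical network: 100 constrained devices with one unit of hash rate each,
# plus an adversary that offloads mining to an external machine with 150 units.
hash_rates = {f"device-{i}": 1.0 for i in range(100)}
hash_rates["adversary"] = 150.0

probabilities = pow_win_probabilities(hash_rates)
print(f"adversary share: {probabilities['adversary']:.2f}")
print(f"single device share: {probabilities['device-0']:.3f}")
```

In this example the adversary mines 60% of the blocks in expectation and would therefore be expected to hold a majority of a committee formed from recent miners, well above the fraction of less than one third that BFT protocols tolerate.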

11 Conclusion

In this work, we presented SADAN, the first scheme to efficiently identify adversaries in large networks consisting of autonomous collaborating devices. SADAN combines random elections, consensus and integrity validation methods in a flexible scheme, where each of these components is interchangeable. We have demonstrated the scalability of an exemplary instance of SADAN and provided the basis to construct use-case-specific instances of SADAN. In future work, we aim to improve SADAN's flexibility (as outlined in Section 9) and examine enhancements to tolerate network partitions.


We thank N. Asokan (University of Waterloo) for his useful feedback. This work has been supported by the German Research Foundation (DFG) as part of projects HWSec and S2 within the CRC 1119 CROSSING, by the German Federal Ministry of Education and Research (BMBF) and the Hessen State Ministry for Higher Education, Research and the Arts (HMWK) within CRISP, by BMBF within the project iBlockchain, by the Academy of Finland (grant 309195), and by the Intel Collaborative Research Institute for Collaborative Autonomous & Resilient Systems (ICRI-CARS).


  • [1] Hyperledger sawtooth documentation on proof-of-elapsed-time., 2019.
  • [2] 3GPP. 5G Standard – Release 16., 2019.
  • [3] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. Control-flow Integrity Principles, Implementations, and Applications. ACM Trans. Inf. Syst. Secur., 13(1), 2009.
  • [4] Tigist Abera, N Asokan, Lucas Davi, Jan-Erik Ekberg, Thomas Nyman, Andrew Paverd, Ahmad-Reza Sadeghi, and Gene Tsudik. C-flat: control-flow attestation for embedded systems software. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 743–754. ACM, 2016.
  • [5] Tigist Abera, Raad Bahmani, Ferdinand Brasser, Ahmad Ibrahim, Ahmad-Reza Sadeghi, and Matthias Schunter. Diat: Data integrity attestation for resilient collaboration of autonomous systems.
  • [6] Moreno Ambrosin, Mauro Conti, Ahmad Ibrahim, Gregory Neven, Ahmad-Reza Sadeghi, and Matthias Schunter. SANA: Secure and Scalable Aggregate Network Attestation. In ACM SIGSAC Conference on Computer and Communications Security, 2016.
  • [7] Mahmoud Ammar, Mahdi Washha, and Bruno Crispo. Wise: Lightweight intelligent swarm attestation scheme for iot (the verifier’s perspective). In 2018 14th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), pages 1–8. IEEE, 2018.
  • [8] Ittai Anati, Shay Gueron, Simon P. Johnson, and Vincent R. Scarlata. Innovative Technology for CPU Based Attestation and Sealing. In Workshop on Hardware and Architectural Support for Security and Privacy, 2013.
  • [9] Orlando Arias, Lucas Davi, Matthias Hanreich, Yier Jin, Patrick Koeberl, Debayan Paul, Ahmad-Reza Sadeghi, and Dean Sullivan. HAFIX: Hardware-Assisted Flow Integrity Extension. In 52nd Design Automation Conference (DAC), June 2015.
  • [10] ARM Limited. ARM Security Technology: Building a Secure System using TrustZone Technology., 2008.
  • [11] ARM Limited. Security technology: building a secure system using TrustZone technology., 2008.
  • [12] ARM Limited. TrustZone Technology for ARMv8-M Architecture., 2017.
  • [13] N. Asokan, Ferdinand Brasser, Ahmad Ibrahim, Ahmad-Reza Sadeghi, Matthias Schunter, Gene Tsudik, and Christian Wachsmann. SEDA: Scalable Embedded Device Attestation. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS, 2015.
  • [14] Kjell Braden, Stephen Crane, Lucas Davi, Michael Franz, Per Larsen, Christopher Liebchen, and Ahmad-Reza Sadeghi. Leakage-Resilient Layout Randomization for Mobile Devices. In Annual Network and Distributed System Security Symposium, 2016.
  • [15] Joel W Branch, Chris Giannella, Boleslaw Szymanski, Ran Wolff, and Hillol Kargupta. In-network outlier detection in wireless sensor networks. Knowledge and information systems, 34(1):23–54, 2013.
  • [16] Ferdinand Brasser, Patrick Koeberl, Brahim El Mahjoub, Ahmad-Reza Sadeghi, and Christian Wachsmann. TyTAN: Tiny Trust Anchor for Tiny Devices. In IEEE/ACM Design Automation Conference, 2015.
  • [17] Erik Buchanan, Ryan Roemer, Hovav Shacham, and Stefan Savage. When Good Instructions Go Bad: Generalizing Return-oriented Programming to RISC. In Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS, 2008.
  • [18] Vitalik Buterin, Gavin Wood, and Joseph Lubin. A next-generation smart contract and decentralized application platform., 2015.
  • [19] Eric Byres and Justin Lowe. The Myths and Facts behind Cyber Security Risks for Industrial Control Systems. Technical report, PA Consulting Group, 2004.
  • [20] Xavier Carpent, Karim ElDefrawy, Norrathep Rattanavipanon, and Gene Tsudik. Lightweight Swarm Attestation: A Tale of Two LISA-s. In ACM Symposium on Information, Computer and Communications Security, 2017.
  • [21] Miguel Castro and Barbara Liskov. Practical Byzantine Fault Tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI, 1999.
  • [22] Stephen Checkoway, Lucas Davi, Alexandra Dmitrienko, Ahmad-Reza Sadeghi, Hovav Shacham, and Marcel Winandy. Return-oriented Programming Without Returns. In Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS, 2010.
  • [23] Stephen Checkoway, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, and Stefan Savage. Comprehensive Experimental Analyses of Automotive Attack Surfaces. In USENIX Security Symposium, 2011.
  • [24] Shuo Chen, Jun Xu, Emre C. Sezer, Prachi Gauriar, and Ravishankar K. Iyer. Non-control-data attacks are realistic threats. In Proceedings of the 14th Conference on USENIX Security Symposium - Volume 14, SSYM, 2005.
  • [25] Eric Chien, Liam OMurchu, and Nicolas Falliere. W32.Duqu - The precursor to the next Stuxnet. Technical report, Symantec Security Response, 2011.
  • [26] Frederick B. Cohen. Operating system protection through program evolution. Comput. Secur., 12(6), 1993.
  • [27] Stephen Crane, Christopher Liebchen, Andrei Homescu, Lucas Davi, Per Larsen, Ahmad-Reza Sadeghi, Stefan Brunthaler, and Michael Franz. Readactor: Practical Code Randomization Resilient to Memory Disclosure. In IEEE Symposium on Security and Privacy, 2015.
  • [28] Lucas Davi, Ahmad-Reza Sadeghi, Daniel Lehmann, and Fabian Monrose. Stitching the gadgets: On the ineffectiveness of coarse-grained control-flow integrity protection. In 23rd USENIX Security Symposium (USENIX Security 14), pages 401–416, San Diego, CA, 2014. USENIX Association.
  • [29] Ghada Dessouky, Tigist Abera, Ahmad Ibrahim, and Ahmad-Reza Sadeghi. Litehax: Lightweight hardware-assisted attestation of program execution. In 2018 International Conference On Computer Aided Design (ICCAD’18), November 2018.
  • [30] Ghada Dessouky, Shaza Zeitouni, Thomas Nyman, Andrew Paverd, Lucas Davi, Patrick Koeberl, N. Asokan, and Ahmad-Reza Sadeghi. Lo-fat: Low-overhead control flow attestation in hardware. In 54th Design Automation Conference (DAC’17), June 2017.
  • [31] Tobias Distler, Christian Cachin, and Rüdiger Kapitza. Resource-efficient byzantine fault tolerance. IEEE Transactions on Computers, 65(9):2807–2819, 2016.
  • [32] John R Douceur. The sybil attack. In International workshop on peer-to-peer systems, pages 251–260. Springer, 2002.
  • [33] Karim Eldefrawy, Gene Tsudik, Aurélien Francillon, and Daniele Perito. SMART: secure and minimal architecture for (establishing dynamic) root of trust. In Proceedings of the 19th Annual Network and Distributed System Security Symposium, NDSS’12, 2012.
  • [34] F-Secure Labs. BLACKENERGY and QUEDAGH: The convergence of crimeware and APT attacks, 2016.
  • [35] Nicolas Falliere, Liam O. Murchu, and Eric Chien. W32.Stuxnet Dossier., 2010.
  • [36] Aurélien Francillon and Claude Castelluccia. Code Injection Attacks on Harvard-architecture Devices. In Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS, 2008.
  • [37] Aurélien Francillon and Claude Castelluccia. Code injection attacks on harvard-architecture devices. In Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08, pages 15–26, New York, NY, USA, 2008. ACM.
  • [38] Aurélien Francillon, Quan Nguyen, Kasper B. Rasmussen, and Gene Tsudik. A minimalist approach to remote attestation. In Design, Automation & Test in Europe, 2014.
  • [39] Gemalto. Introducing 5G networks – characteristics and usages., 2019.
  • [40] Yossi Gilad, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. Algorand: Scaling byzantine agreements for cryptocurrencies. In Proceedings of the 26th Symposium on Operating Systems Principles - SOSP ’17, page 51–68. ACM Press, 2017.
  • [41] Tal Grinshpoun, Amnon Meisels, and Eyal Felstaine. Avoidance of misbehaving nodes in wireless mesh networks. Security and Communication Networks, 7(7), 2014.
  • [42] Jason D. Hiser, Anh Nguyen-Tuong, Michele Co, Matthew Hall, and Jack W. Davidson. ILR: Where’d My Gadgets Go? In IEEE Symposium on Security and Privacy, 2012.
  • [43] H. Hu, S. Shinde, S. Adrian, Z. L. Chua, P. Saxena, and Z. Liang. Data-oriented programming: On the expressiveness of non-control data attacks. In IEEE Symposium on Security and Privacy, 2016.
  • [44] Yi-an Huang and Wenke Lee. A cooperative intrusion detection system for ad hoc networks. In Proceedings of the 1st ACM workshop on Security of ad hoc and sensor networks, pages 135–147. ACM, 2003.
  • [45] Ahmad Ibrahim, Ahmad-Reza Sadeghi, and Gene Tsudik. DARPA: Device Attestation Resilient against Physical Attacks. In ACM Conference on Security & Privacy in Wireless and Mobile Networks (WiSec), 2016.
  • [46] Ahmad Ibrahim, Ahmad-Reza Sadeghi, and Shaza Zeitouni. Seed: secure non-interactive attestation for embedded devices. In Proceedings of the 10th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pages 64–74. ACM, 2017.
  • [47] Intel. Intel Software Guard Extensions Programming Reference., 2014.
  • [48] Intel. Intel atom automotive processors., 2019.
  • [49] International Organization for Standardization. ISO 11898-1:2003., 2003.
  • [50] International Organization for Standardization. ISO 17458-1:2013., 2013.
  • [51] A. Jaeger and S. A. Huss. The weather hazard warning in simTD: A design for road weather related warnings in a large scale Car-to-X field operational test. In 11th International Conference on ITS Telecommunications, 2011.
  • [52] Michel E. Kabay. Attacks on Power Systems: Hackers, Malware., 2010.
  • [53] Chongkyung Kil, Jinsuk Jun, Christopher Bookholt, Jun Xu, and Peng Ning. Address Space Layout Permutation (ASLP): Towards Fine-Grained Randomization of Commodity Software. In Annual Computer Security Applications Conference, 2006.
  • [54] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. seL4: Formal Verification of an OS Kernel. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, 2009.
  • [55] Patrick Koeberl, Steffen Schulz, Ahmad-Reza Sadeghi, and Vijay Varadharajan. TrustLite: A Security Architecture for Tiny Embedded Devices. In European Conference on Computer Systems (EuroSys), 2014.
  • [56] Eleftherios Kokoris Kogias, Philipp Jovanovic, Nicolas Gailly, Ismail Khoffi, Linus Gasser, and Bryan Ford. Enhancing bitcoin security and performance with strong consistency via collective signing. In 25th USENIX Security Symposium (USENIX Security 16), pages 279–296, 2016.
  • [57] Florian Kohnhäuser, Niklas Büscher, Sebastian Gabmeyer, and Stefan Katzenbeisser. Scapi: a scalable attestation protocol to detect software and physical attacks. In Proceedings of the 10th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pages 75–86. ACM, 2017.
  • [58] Florian Kohnhäuser, Niklas Büscher, and Stefan Katzenbeisser. SALAD: Secure and Lightweight Attestation of Highly Dynamic and Disruptive Networks. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, ASIACCS, 2018.
  • [59] Karl Koscher, Alexei Czeskis, Franziska Roesner, Shwetak Patel, Tadayoshi Kohno, Stephen Checkoway, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, and Stefan Savage. Experimental Security Analysis of a Modern Automobile. In IEEE Symposium on Security and Privacy, 2010.
  • [60] Ramakrishna Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. Zyzzyva: Speculative byzantine fault tolerance. In Proceedings of Twenty-first ACM SIGOPS Symposium on Operating Systems Principles, SOSP ’07, page 45–58. ACM, 2007.
  • [61] Joanna Kulik, Wendi Heinzelman, and Hari Balakrishnan. Negotiation-based protocols for disseminating information in wireless sensor networks. Wireless networks, 8(2/3):169–185, 2002.
  • [62] Volodymyr Kuznetsov, Laszlo Szekeres, Mathias Payer, George Candea, R. Sekar, and Dawn Song. Code-Pointer Integrity. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2014.
  • [63] Leslie Lamport, Robert Shostak, and Marshall Pease. The byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(3):382–401, 1982.
  • [64] Per Larsen, Andrei Homescu, Stefan Brunthaler, and Michael Franz. SoK: Automated Software Diversity. In Proceedings of the 2014 IEEE Symposium on Security and Privacy, S & P, 2014.
  • [65] Hung-Jen Liao, Chun-Hung Richard Lin, Ying-Chih Lin, and Kuang-Yuan Tung. Intrusion detection system: A comprehensive review. Journal of Network and Computer Applications, 36(1):16–24, 2013.
  • [66] Jian Liu, Wenting Li, Ghassan O. Karame, and N. Asokan. Scalable Byzantine Consensus via Hardware-assisted Secret Sharing. IEEE Transactions on Computers, 2018.
  • [67] Prince Mahajan, Ramakrishna Kotla, Catherine C. Marshall, Venugopalan Ramasubramanian, Thomas L. Rodeheffer, Douglas B. Terry, and Ted Wobber. Effective and Efficient Compromise Recovery for Weakly Consistent Replication. In Proceedings of the 4th ACM European Conference on Computer Systems, 2009.
  • [68] Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos V. Rozas, Hisham Shafi, Vedvyas Shanbhogue, and Uday R. Savagaonkar. Innovative Instructions and Software Model for Isolated Execution. In Workshop on Hardware and Architectural Support for Security and Privacy, 2013.
  • [69] Navamani Thandava Meganathan and Yogesh Palanichamy. Privacy Preserved and Secured Reliable Routing Protocol for Wireless Mesh Networks. The Scientific World Journal, 2014.
  • [70] Charlie Miller and Christopher Valasek. A Survey of Remote Automotive Attack Surfaces. In Blackhat USA, 2014.
  • [71] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system., 2008.
  • [72] Thien Duc Nguyen, Samuel Marchal, Markus Miettinen, Hossein Fereidooni, N. Asokan, and Ahmad-Reza Sadeghi. DÏoT: A federated self-learning anomaly detection system for IoT. In The 39th IEEE International Conference on Distributed Computing Systems (ICDCS 2019), March 2019.
  • [73] OneWeb. OneWeb Home Page., 2019.
  • [74] OpenSim Ltd. OMNeT++ discrete event simulator., 2019.
  • [75] Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis. Smashing the Gadgets: Hindering Return-Oriented Programming Using In-Place Code Randomization. In IEEE Symposium on Security and Privacy, 2012.
  • [76] Sang Hyun Park, So Hee Won, Jong Bong Lee, and Sung Woo Kim. Smart Home – Digitally Engineered Domestic Life. Personal Ubiquitous Comput., 2003.
  • [77] Imtiaz Parvez, Ali Rahmati, Ismail Guvenc, Arif I Sarwat, and Huaiyu Dai. A survey on low latency towards 5g: Ran, core network and caching solutions. IEEE Communications Surveys & Tutorials, 20(4):3098–3130, 2018.
  • [78] PaX Team. PaX address space layout randomization (ASLR)., 2001.
  • [79] Jonathan Pollet and Joe Cummins. Electricity for Free? The Dirty Underbelly of SCADA and Smart Meters. In Blackhat USA, 2010.
  • [80] Jerome Radcliffe. Hacking Medical Devices for Fun and Insulin: Breaking the Human SCADA System. In Blackhat USA, 2011.
  • [81] Ryan Roemer, Erik Buchanan, Hovav Shacham, and Stefan Savage. Return-Oriented Programming: Systems, Languages, and Applications. ACM Transactions on Information and System Security, 15(1), 2012.
  • [82] Martin Roesch et al. Snort: Lightweight intrusion detection for networks. In Lisa, volume 99, pages 229–238, 1999.
  • [83] Stefan Schmid, Theodoros Bourchas, Stefan Mangold, and Thomas R. Gross. Linux Light Bulbs: Enabling Internet Protocol Connectivity for Light Bulb Networks. In Proceedings of the 2nd International Workshop on Visible Light Communications Systems, 2015.
  • [84] Arvind Seshadri, Mark Luk, and Adrian Perrig. SAKE: Software attestation for key establishment in sensor networks. In Distributed Computing in Sensor Systems. 2008.
  • [85] Hovav Shacham. The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86). In ACM SIGSAC Conference on Computer and Communications Security, 2007.
  • [86] F. Shrouf, J. Ordieres, and G. Miragliotta. Smart factories in Industry 4.0: A review of the concept and of energy management approached in production based on the Internet of Things paradigm. In IEEE International Conference on Industrial Engineering and Engineering Management, 2014.
  • [87] Meital Ben Sinai, Nimrod Partush, Shir Yadid, and Eran Yahav. Exploiting Social Navigation. In Blackhat Asia, 2015.
  • [88] Steven R Snapp, James Brentano, Gihan Dias, Terrance L Goan, L Todd Heberlein, Che-Lin Ho, and Karl N Levitt. Dids (distributed intrusion detection system)-motivation, architecture, and an early prototype. 2017.
  • [89] Robin Sommer and Vern Paxson. Outside the closed world: On using machine learning for network intrusion detection. In 2010 IEEE symposium on security and privacy, pages 305–316. IEEE, 2010.
  • [90] SpaceX. Starlink Mission., 2019.
  • [91] Raoul Strackx, Frank Piessens, and Bart Preneel. Efficient Isolation of Trusted Subsystems in Embedded Systems. In Security and Privacy in Communication Networks (SecureComm), 2010.
  • [92] The Conversation. Connected cars can lie, posing a new threat to smart cities., 2018.
  • [93] Trusted Computing Group (TCG). TPM Main Specification Level 2 Version 1.2., 2007.
  • [94] Trusted Computing Group (TCG). Trusted platform module., 2011.
  • [95] G. S. Veronese, M. Correia, A. N. Bessani, L. C. Lung, and P. Verissimo. Efficient byzantine fault-tolerance. IEEE Transactions on Computers, 62(1):16–30, Jan 2013.
  • [96] Richard Wartell, Vishwath Mohan, Kevin W. Hamlen, and Zhiqiang Lin. Binary Stirring: Self-randomizing Instruction Addresses of Legacy x86 Binary Code. In ACM SIGSAC Conference on Computer and Communications Security, 2012.
  • [97] Shaza Zeitouni, Ghada Dessouky, Orlando Arias, Dean Sullivan, Ahmad Ibrahim, Yier Jin, and Ahmad-Reza Sadeghi. Atrium: Runtime attestation resilient under memory attacks. In 2017 International Conference On Computer Aided Design (ICCAD’17), November 2017.
  • [98] Yang Zhang, Nirvana Meratnia, and Paul JM Havinga. Outlier detection techniques for wireless sensor networks: A survey. IEEE Communications Surveys and Tutorials, 12(2):159–170, 2010.