The widespread deployment of IoT devices is enabling the automation of many aspects of our daily lives. One of the success stories of IoT has been the automation of homes and cities, which are referred to as smart homes and smart cities. A significant obstacle to realizing the true vision of smart homes is securing the data communicated and stored in the home network against malicious acts intended to wreak havoc on someone’s home.
Although many competing technologies try to protect data in smart homes against attacks, blockchain has emerged as probably the most promising for both i) securing the home network against manipulation attacks on stored data and ii) providing a secure platform for all the devices in the network to communicate with each other [2, 3]. In a blockchain, the data is immutable because of the underlying consensus protocol, a process by which all transactions are validated by all the nodes. Therefore, manipulation of transmitted or stored data is not feasible through a single compromised node; a majority of the nodes would have to be compromised for an attack to succeed. Different consensus protocols and their applicability to IoT networks are surveyed in [6, 1].
Using blockchain for the IoT devices in a smart home is non-trivial, primarily because IoT devices are resource-constrained and might not be able to perform the extensive computations required to achieve consensus. For example, a blockchain consensus protocol such as Proof of Work requires solving compute-intensive hash-based puzzles, substantial storage, and low-latency communication links. Resource-constrained devices cannot fulfill these requirements. Recently, several novel consensus approaches have been proposed to overcome these limitations. One project that tries to address these challenges is the Hyperledger project, which we chose as our evaluation platform in this research.
We implement a blockchain-based IoT smart home network on Hyperledger Fabric, whose low computational requirements and fast network response time make it desirable for IoT applications [6, 8]. Hyperledger Fabric uses the practical Byzantine fault tolerance (PBFT) method, which can successfully reach consensus over new data if the ratio of malicious devices is less than 1/3. However, most of the devices in a home network are resource-constrained and very vulnerable to different types of cyber-attacks. Therefore, we need a mechanism that detects malicious activities and compromised devices and disconnects them from the rest of the network, so that they are excluded from participating in the PBFT consensus protocol.
In this paper, we propose the AI-enabled blockchain (AIBC) network with a 2-step consensus protocol that uses an outlier detection algorithm as the first step and PBFT as the second. The outlier detection algorithm discovers anomalies in the multimodal data captured by the various IoT devices in a smart home network. We use multimodal data fusion to map the different data types received from different devices to an intermediate domain. Since the data captured by different sensors are not independent of each other, the outlier detection algorithm aims to learn the intrinsic structure of the data and exploits the inter-dependencies among data from different devices. Outlier data is rejected at the end of the first step, and the corresponding device is barred from proceeding to the second consensus step (PBFT). Thus, the overall fault tolerance of the blockchain network is enhanced in comparison to a naive Hyperledger Fabric implementation. Fig. 1 shows the intuition behind our proposed 2-step consensus protocol.
To the best of our knowledge, this is the first research work that tries to practically apply machine learning to reach consensus in a blockchain network. Gupta et al. have suggested applying machine learning techniques to the blockchain consensus process as future research, without any investigation [10]. Dinh et al. have discussed the benefits of integrating blockchain and artificial intelligence (AI) [11]. They suggest that a blockchain governed by a machine learning algorithm might be able to detect attacks and invoke proper defence mechanisms or isolate the compromised component. We have successfully formulated and implemented this idea, which we call the AI-enabled blockchain (AIBC). Dey has proposed a utility function similar to the function used in  to detect anomalies [12]. He then claims that this value can be fed to a supervised machine learning algorithm to measure the likelihood of an attack and prevent the blockchain confirmation of that transaction in the consensus protocol. However, he does not propose an algorithm or an implementation for a practical consensus protocol.
In order to test the validity of our 2-step consensus protocol for IoT networks, we implement a 3-layer architecture using Hyperledger Fabric. The first layer is the application layer, containing the different IoT devices. The second and third layers are the edge blockchain layer and the core blockchain layer, which contain the different components of the AIBC.
We measure the latency of the different parts of our implementation and the number of outlier devices in each time unit using synthetic data represented as a matrix. We compare our proposed architecture with the conventional Hyperledger Fabric implementation and investigate the effect of the outlier detection algorithm on fault tolerance. Results reveal that our proposed AIBC network can reach consensus over new data in milliseconds with better fault tolerance than a naive Hyperledger Fabric implementation. This is achieved by detecting the malicious devices in the first consensus step and prohibiting them from participating in the PBFT step.
II Proposed System Architecture
In this section, we discuss the proposed 3-layer architecture and the communication protocol between the different components of the implemented AIBC network.
II-A Three-layer Architecture
The topology of our implemented 3-layer network is illustrated in Fig. 2. The first layer is the application layer, containing the smart devices, smart meters, and sensors within a smart home network. The second layer is the edge blockchain layer, which includes endorsing peers and data aggregators. This layer endorses the new transactions from the applications and is partially responsible for the 2-step consensus protocol. The third layer is the core blockchain layer, consisting of an orderer and regular peers. This layer creates a block out of the endorsed transactions (the transaction results together with their corresponding endorsements) received from the application layer and is also responsible for the 2-step consensus protocol.
The AIBC network is divided into organizations, each of which contains one application, at least one endorsing peer, and zero or more regular peers. For ease of illustration, only one regular peer and one endorsing peer are shown in each organization in Fig. 2. Endorsing peers have an aggregator to receive data (transactions) from the application layer, a copy of the chaincode (also known as a smart contract, shown as C) to endorse new transactions, a copy of the ledger (shown as L), and the detector (shown as Det) to participate in the 2-step consensus protocol. Regular peers have only a copy of the ledger and the detector; they participate in the 2-step consensus protocol to enhance the security of the blockchain network. All peers (both endorsing and regular) are involved in the 2-step consensus protocol. Consequently, as the number of endorsing and regular peers increases, more of them must be compromised in order to prevent a successful consensus.
II-B Communication Protocol in the AIBC Network
There exist two different communication protocols in the AIBC network: query process and invoke process.
II-B1 Query Process
In the query process, an application connects to an arbitrary endorsing or regular peer to learn the current state of the ledger. This process requires only three steps: (i) the application connects to a peer directly (without an aggregator), (ii) the application sends a query request to the peer, and (iii) the peer accesses its copy of the ledger and sends the result back to the application.
II-B2 Invoke Process
During this process, an application connects to its corresponding endorsing peers to request a change in the ledger based on its new data. A simplified version of our implementation, with just one application and its corresponding endorsing peer, is shown in Fig. 3 to clarify the communication protocol of an invoke process. Note that there are other applications, endorsing peers, and regular peers, as shown in Fig. 2. The other endorsing peers communicate with their corresponding applications and simultaneously traverse all the communication steps shown in Fig. 3 (steps 1 to 7), whereas the regular peers go through steps 5 and 6 only. The invoke process has three phases, distinguished in Fig. 3 by line style: (i) the endorsement phase (solid lines), (ii) the 2-step consensus protocol (dashed lines), and (iii) the ledger update (dotted lines).
Endorsement Phase: This phase has four steps:
1. Each application connects to one or more endorsing peers as dictated by the endorsement policy, not arbitrarily. The endorsement policy specifies which endorsing peers from which organizations are required to endorse a new transaction proposed by an application. In our implementation, each application communicates with only one endorsing peer, as shown in Fig. 2.
2. The application reads from a smart device and sends its new readings to its corresponding endorsing peer in the AIBC network.
2.1. The endorsing peer executes the new transaction (data) in its copy of the chaincode to endorse it.
2.2. If the transaction is endorsed successfully, the chaincode temporarily updates that endorsing peer’s copy of the ledger as a proposed update.
3. The endorsing peer sends the result of the proposed transaction, along with its endorsement (obtained from the endorsing peer’s digital signature), to the corresponding application.
4. The application checks that all the results received from the corresponding endorsing peers are identical and carry valid endorsements. If these requirements are met, the application sends the result with its corresponding endorsements to the orderer to be added to the new block.
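The check in step 4 can be sketched in a few lines of Python. This is an illustration only, not the paper's Go chaincode or the Hyperledger Fabric API: the names `sign` and `check_endorsements` are hypothetical, and an HMAC over the result string is used as a simple stand-in for a peer's digital signature.

```python
import hashlib
import hmac


def sign(peer_key: bytes, result: str) -> str:
    """Stand-in for a peer's digital signature over a transaction result.

    Assumption: a keyed HMAC replaces the real public-key signature.
    """
    return hmac.new(peer_key, result.encode(), hashlib.sha256).hexdigest()


def check_endorsements(responses, peer_keys, policy):
    """Return the agreed result if every peer named in `policy` returned the
    same result with a valid endorsement; otherwise return None.

    responses: dict peer -> (result, endorsement)
    peer_keys: dict peer -> key used to verify that peer's endorsement
    policy:    list of peers whose endorsement is required
    """
    results = set()
    for peer in policy:
        if peer not in responses:
            return None  # a required endorsement is missing
        result, endorsement = responses[peer]
        expected = sign(peer_keys[peer], result)
        if not hmac.compare_digest(expected, endorsement):
            return None  # the endorsement does not verify
        results.add(result)
    # All required peers must agree on exactly one result.
    return results.pop() if len(results) == 1 else None


keys = {"E1": b"key-1", "E2": b"key-2"}
good = {p: ("temp=21.5", sign(k, "temp=21.5")) for p, k in keys.items()}
print(check_endorsements(good, keys, ["E1", "E2"]))
```

Only a result that passes this check would be forwarded to the orderer; a mismatch between peers or a bad endorsement stops the transaction at the application.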
2-step Consensus Protocol: The two steps of the consensus protocol consist of the detector (which detects the outliers) and execution of PBFT.
5. The orderer combines all the received transaction results and their corresponding endorsements in a new block and sends it to all the peers (both endorsing and regular) to initiate the 2-step consensus protocol.
Step 1: Detector.
5.1(a) Each peer checks all the transactions within the block via its detector that uses the outlier detection algorithm discussed in Section III to find outlier data and reject it.
5.1(b) The detector excludes the peers of the organization whose application generated the outlier data from participating in the second consensus step. To do so, the detector updates the relay switch, which tells its corresponding peer which set of peers (both endorsing and regular) it must connect to for the second consensus step. To some extent, this step allows the network to cope with more than 1/3 of the nodes being compromised by keeping them out of the PBFT consensus protocol; the exact fraction depends on the accuracy of the detector (discussed in Section III-C).
Step 2: PBFT.
5.2 Each peer verifies whether each of the transactions in the new block is endorsed by all the required peers specified in the endorsement policy. In addition, it checks whether the result of a specific transaction is the same from all the required endorsing peers. This ensures that the application is not compromised and has not sent an incorrect result for the transaction. After the endorsements have been verified by each of the peers, all the transactions in the block are labeled as valid or invalid. Then, the peers connect to their trusted set of peers, according to the relay switch obtained in the previous step, to reach a consensus over the new block using the PBFT method.
Ledger Update: After the completion of the consensus protocol, the invoke process is concluded with the following steps:
6. Each peer updates its copy of the ledger.
7. Applications get notified about the ledger update.
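Steps 5.1(b) and 5.2 can be illustrated with a minimal Python sketch. It assumes that the detector has already produced the set of organizations whose data was flagged as outlier; the function names and the organization and peer labels are hypothetical. The bounds in `pbft_limits` are the classical PBFT ones (n >= 3f + 1 peers, 2f + 1 matching votes).

```python
def relay_switch(peers_by_org, outlier_orgs):
    """Peers allowed into the PBFT step: every peer whose organization did
    not produce outlier data in the current block."""
    return sorted(
        peer
        for org, peers in peers_by_org.items()
        if org not in outlier_orgs
        for peer in peers
    )


def pbft_limits(n):
    """Classical PBFT bounds for n participating peers.

    Returns (f, quorum): the number of faulty peers that can be tolerated,
    f = floor((n - 1) / 3), and the 2f + 1 matching votes needed.
    """
    f = max(0, (n - 1) // 3)
    return f, 2 * f + 1


# Hypothetical network: four organizations, one endorsing (E) and one
# regular (P) peer each; the detector flagged data from Org3.
peers_by_org = {
    "Org1": ["E1", "P1"],
    "Org2": ["E2", "P2"],
    "Org3": ["E3", "P3"],
    "Org4": ["E4", "P4"],
}
trusted = relay_switch(peers_by_org, {"Org3"})
print(trusted, pbft_limits(len(trusted)))
```

With Org3 excluded, six peers remain, so the second step tolerates one faulty peer among them and needs three matching votes.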
III Outlier Detection
In order to detect malicious devices, we propose to employ machine learning for outlier detection as the first step of consensus. To that end, we adopt the low-rank assumption, a popular assumption in machine learning, as the core model. We implement the outlier detector as a secondary chaincode installed on all the peers in the blockchain network (shown as Det in Fig. 2). To apply the outlier detection algorithm to the multimodal data gathered from different sensors and devices, we first need to fuse the data into a unified representation.
III-A Multimodal Data Fusion
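Sensors and devices in a smart home deal with different types of data. To impose a model on the ensemble of data in AI-enabled systems, we need to map the data to a meaningful intermediate domain. The data from device $n$ at time slot $t$ is denoted by $\mathbf{x}_n^{(t)} \in \mathbb{R}^{d_n}$, where $d_n$ is the dimension of the data received from sensor $n$. Let $\mathbf{x}^{(t)} \in \mathbb{R}^{D}$ represent the fused data at time slot $t$, where $D = \sum_{n=1}^{N} d_n$. Matrix $\mathbf{X} \in \mathbb{R}^{D \times T}$ represents the data collected from $N$ devices over $T$ time slots. In other words, $\mathbf{X} = [\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(T)}]$.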
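As a small illustration of the data layout, the following Python sketch builds a fused vector per time slot and stacks the slots as the columns of a data matrix. It assumes the simplest possible fusion, plain concatenation of the per-device readings in a fixed device order; the mapping to the intermediate domain used in practice may differ, and the device names are hypothetical.

```python
def fuse(readings, device_order):
    """Concatenate the per-device readings of one time slot into a single
    fused vector (assumption: fusion is plain concatenation)."""
    fused = []
    for device in device_order:
        fused.extend(float(v) for v in readings[device])
    return fused


def data_matrix(slots, device_order):
    """Stack the fused vectors as columns: matrix[row][t] is fused feature
    `row` at time slot t."""
    columns = [fuse(readings, device_order) for readings in slots]
    if len({len(c) for c in columns}) > 1:
        raise ValueError("every time slot must yield the same fused dimension")
    return [list(row) for row in zip(*columns)]


devices = ["thermostat", "meter"]  # the meter reports two values per slot
slots = [
    {"thermostat": [21.5], "meter": [1.2, 230.0]},
    {"thermostat": [21.7], "meter": [1.3, 229.0]},
]
X = data_matrix(slots, devices)
print(len(X), "rows x", len(X[0]), "columns")
```

Here the fused dimension is 3 (one thermostat value plus two meter values), and each new time slot adds one column.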
III-B Outlier Detection Algorithm
For detecting inconsistent data, we use a low-rank data structure, which is a typical assumption for many real-life data sets. This model is one of the most well-known data structures in signal processing and data mining. However, considering the recent advances in AI, deep learning methods and non-linear models can also be imposed [17, 18]. Without loss of generality, the low-rank model is assumed in the present paper.
Table I: Parameters of the employed low-rank model.
|Parameter|Description|
|Span|Span of the model (regular patterns are linear combinations of these bases).|
|Estimated rank|The number of basic patterns in the span.|
|Threshold|Threshold for the margin between trustworthy and outlier data.|
Matrix $\mathbf{X}$ contains the training data used to design the detector via the rank decomposition model. The rank decomposition model can be constructed via the singular value decomposition (SVD) of $\mathbf{X}$ as follows,
$$\mathbf{X} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^{T} = \sum_{i=1}^{R} \sigma_i \mathbf{u}_i \mathbf{v}_i^{T}, \tag{1}$$
where the diagonal matrix $\boldsymbol{\Sigma}$ contains the singular values $\sigma_i$ of $\mathbf{X}$. The transpose of $\mathbf{V}$ is indicated by $\mathbf{V}^{T}$. Moreover, the columns of $\mathbf{V}$ and $\mathbf{U}$ are the right singular vectors and left singular vectors, respectively. The rank of $\mathbf{X}$ is upper bounded by $R = \min(D, T)$. The rank of a matrix is equal to the minimum number of rank-one components for which Eq. (1) holds. However, the actual rank of typical data can be much smaller than $R$, as the measurements from the devices depend on each other. The goal of the outlier detection algorithm is to learn the possible dependencies and detect those patterns that do not agree with the rest of the measurements. By rejecting the inconsistent data, the blockchain network becomes AI-enabled. In AIBC, the devices associated with inconsistent data are excluded from the next consensus step. The low-rank approximation of the measurements can be written as follows,
$$\mathbf{X}_r = \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^{T}, \tag{2}$$
where $\mathbf{X}_r$ is the best rank-$r$ approximation of $\mathbf{X}$. Moreover, the parameter $r$ can be estimated by analyzing the singular values. The rank $r$ is the minimum number of learned patterns such that the data of all devices in a block can be represented as a linear combination of those patterns. Using the training data and the estimated rank, a margin around the model can be identified. A straightforward criterion for rank estimation is based on the Frobenius norm of the residual:
$$r = \min \left\{ r' : \| \mathbf{X} - \mathbf{X}_{r'} \|_F \le \alpha \, \| \mathbf{X} \|_F \right\}. \tag{3}$$
Here, $\alpha$ is a constant between $0$ and $1$. As $\alpha$ increases, the required rank decreases. Let us define matrix $\mathbf{E}$ as the difference of $\mathbf{X}$ and $\mathbf{X}_r$, i.e., $\mathbf{E} = \mathbf{X} - \mathbf{X}_r$. Each row of $\mathbf{E}$ corresponds to the estimated perturbation of a device. The threshold for the margin of each device is denoted by $\tau_n$, which is a function of the desired false alarm probability of the detector. The parameters of the employed low-rank model are summarized in Table I. The learned detector is characterized by $(\mathbf{U}_r, r, \boldsymbol{\tau})$, where the columns of $\mathbf{U}_r$ are the first $r$ left singular vectors. Vector $\boldsymbol{\tau}$ is the concatenation of the thresholds $\tau_n$.
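The rank-estimation criterion can be evaluated from the singular values alone, because the squared Frobenius norm of the residual of a rank-r truncation equals the sum of the squared discarded singular values. The Python sketch below is a hedged illustration: it assumes the criterion compares the relative (unsquared) Frobenius residual against a constant `alpha`, and the function name and the example singular values are hypothetical.

```python
def estimate_rank(singular_values, alpha):
    """Smallest r such that the rank-r truncation leaves a relative
    Frobenius residual of at most alpha.

    singular_values must be sorted in non-increasing order. The comparison
    is done on squared norms, using the fact that the squared residual is
    the sum of the squared discarded singular values.
    """
    squares = [s * s for s in singular_values]
    total = sum(squares)
    for r in range(len(squares) + 1):
        tail = sum(squares[r:])  # squared residual of the rank-r truncation
        if tail <= (alpha ** 2) * total:
            return r
    return len(squares)


sigma = [10.0, 5.0, 1.0, 0.5]  # hypothetical singular values
for alpha in (0.05, 0.1, 0.5):
    print(alpha, estimate_rank(sigma, alpha))
```

For these values the estimated rank is 3, 2, and 1 for `alpha` equal to 0.05, 0.1, and 0.5, which matches the statement that a larger constant requires a smaller rank.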
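The input data at time slot $t$, $\mathbf{x}^{(t)}$, must be analyzed according to the detector learned from the training data $\mathbf{X}$. First, $\mathbf{x}^{(t)}$ should be projected onto the span of the learned detector as:
$$\hat{\mathbf{x}}^{(t)} = \mathbf{U}_r \mathbf{U}_r^{T} \mathbf{x}^{(t)}, \tag{4}$$
where $\hat{\mathbf{x}}^{(t)}$ is the projection of $\mathbf{x}^{(t)}$ on the low-rank model and the columns of $\mathbf{U}_r$ are the first $r$ left singular vectors of $\mathbf{X}$. The residual of the projection is defined as the difference between the measured data and its projection on the model, i.e., $\mathbf{e}^{(t)} = \mathbf{x}^{(t)} - \hat{\mathbf{x}}^{(t)}$. This residual vector contains the mismatch of the devices at time $t$. The value of $|e_n^{(t)}|$ is the metric for deciding whether device $n$ is an outlier: if it is greater than the threshold $\tau_n$, the measured data violates the margin of the model.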
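The decision rule can be sketched in plain Python once an orthonormal basis of the learned span is available. The example below is illustrative only: it assumes a rank-one model in which three sensors normally report the same value, and the basis, thresholds, and function names are hypothetical.

```python
import math


def project(basis, x):
    """Project x onto the span of the orthonormal vectors in `basis`."""
    projection = [0.0] * len(x)
    for u in basis:
        coeff = sum(ui * xi for ui, xi in zip(u, x))
        for i, ui in enumerate(u):
            projection[i] += coeff * ui
    return projection


def find_outliers(basis, x, thresholds):
    """Indices whose absolute residual exceeds the per-entry threshold."""
    x_hat = project(basis, x)
    residual = [xi - pi for xi, pi in zip(x, x_hat)]
    return [i for i, (e, tau) in enumerate(zip(residual, thresholds)) if abs(e) > tau]


# Hypothetical rank-one model: all three sensors move together.
basis = [[1 / math.sqrt(3)] * 3]
thresholds = [6.0, 6.0, 6.0]

print(find_outliers(basis, [20.0, 20.0, 20.0], thresholds))  # consistent reading
print(find_outliers(basis, [20.0, 20.0, 35.0], thresholds))  # third sensor deviates
```

In the second call the projection is close to 25 for every sensor, so the residuals are roughly -5, -5, and 10, and only the third entry crosses its threshold.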
III-C Performance Analysis
A conventional Hyperledger Fabric network can tolerate up to 1/3 malicious devices. However, the proposed AIBC network can significantly increase this threshold with a carefully designed detector. There are two probabilities associated with an outlier detection algorithm: the probability of detection ($P_D$) and the probability of false alarm ($P_{FA}$). The probability of missed detection (the complement of the probability of detection) corresponds to the case in which our algorithm fails to detect a malicious node. The probability of false alarm is associated with the scenario in which the algorithm identifies an intact node as malicious. Depending on these two values, the fault tolerance of our proposed architecture can be improved in comparison with the naive PBFT consensus protocol. However, the fault tolerance of our architecture can also fall below 1/3; this is only possible if the detector is poorly designed, with a very high probability of false alarm or a very low probability of detection.
The impact of the designed detector on the fault tolerance of the AIBC network is illustrated in the following inequality:
In this inequality, denotes the fault tolerance of the AIBC network over the filtered data using the designed detector. For successful execution of PBFT consensus in the second step consensus, should be less than . is the initial fault tolerance of our network in the first step before performing outlier detection which can be greater than . However, in a conventional hyperledger fabric, there is no detector and the fault tolerance (equivalent to in our AIBC network) should be less than . The effect of the detector on the threshold is investigated via implementation.
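One way to make this trade-off concrete is the following Python sketch. It is a simplified model under stated assumptions and not necessarily the inequality used above: every malicious node is assumed to be caught independently with probability `p_d`, every intact node is wrongly rejected with probability `p_fa`, and expected fractions are used in place of random counts. The function names are hypothetical.

```python
def malicious_fraction_after_detector(f, p_d, p_fa):
    """Expected fraction of malicious nodes among those that pass the
    detector, given an initial malicious fraction f."""
    passed_bad = f * (1 - p_d)          # malicious nodes that are missed
    passed_good = (1 - f) * (1 - p_fa)  # intact nodes that are not rejected
    if passed_bad + passed_good == 0:
        raise ValueError("no node passes the detector")
    return passed_bad / (passed_bad + passed_good)


def max_tolerable_fraction(p_d, p_fa):
    """Supremum of the initial malicious fraction f for which the filtered
    fraction stays below the PBFT bound of 1/3.

    Derivation: f(1-p_d) / (f(1-p_d) + (1-f)(1-p_fa)) < 1/3
                <=>  f < (1-p_fa) / (2(1-p_d) + (1-p_fa)).
    """
    denominator = 2 * (1 - p_d) + (1 - p_fa)
    if denominator == 0:
        raise ValueError("no node passes the detector")
    return (1 - p_fa) / denominator


print(max_tolerable_fraction(0.0, 0.0))  # no detector: the plain PBFT bound
print(max_tolerable_fraction(0.9, 0.1))  # a reasonably accurate detector
print(max_tolerable_fraction(0.1, 0.5))  # a poor detector
```

Under this model, a detector that misses most malicious nodes while rejecting many intact ones pushes the tolerable fraction below 1/3, which corresponds to the poorly designed case discussed in this section.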
IV Implementation Details and Results
We implemented the proposed 3-layer AIBC network using the Hyperledger Fabric framework, version 1.1.0. The chaincode is written in Golang. The two laptops we used have the following specifications: Core i7-6500U processor, CPU 2.5 GHz × 4, Ubuntu 18.04 LTS. First, we explain the architecture of the implemented network and the dataset we used. Then, we present the performance of the implemented AIBC network in terms of latency and the accuracy of the outlier detection algorithm.
IV-A Network Architecture and Dataset
We have simulated the application layer, with its sensors and devices, using MATLAB. These devices send their data to the AIBC network in each time unit. The edge blockchain and core blockchain layers are simulated on two separate but similar laptops. The different components of the AIBC network, including all the peers and the orderer, are defined in separate Docker containers. These containers communicate with each other through a channel. The containers on the two laptops are connected using Docker Swarm.
The implemented chaincode has three main functions: init, invoke, and query. The init function initializes the number of sensors in layer 1 and their names, and initializes the blockchain with the first input data from the devices. The invoke function receives new data from the devices in layer 1. Each time an application sends new data to an aggregator, the invoke function is executed and, after endorsement, sends the result back to that application. The query function returns the current value of a device in the IoT network.
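The division of labour between the three functions can be mimicked with a small in-memory Python class. This is only a sketch of the interface described above: it is not the Go chaincode of the implementation and does not use the Hyperledger Fabric API, and the class name, method signatures, and device names are hypothetical.

```python
class SensorChaincode:
    """In-memory stand-in for a chaincode with init, invoke, and query."""

    def __init__(self):
        self.ledger = {}

    def init(self, initial_readings):
        """Register the devices and store their first readings."""
        self.ledger = dict(initial_readings)
        return sorted(self.ledger)

    def invoke(self, device, value):
        """Record a new reading of a known device and return the result
        that would be sent back to the application."""
        if device not in self.ledger:
            raise KeyError(f"unknown device: {device}")
        self.ledger[device] = value
        return {"device": device, "value": value}

    def query(self, device):
        """Return the current value of a device."""
        return self.ledger[device]


cc = SensorChaincode()
cc.init({"thermostat": 21.5, "meter": 1.2})
cc.invoke("thermostat", 22.0)
print(cc.query("thermostat"))
```

In the real network the update made by invoke is only a proposal until the block passes both consensus steps; the sketch collapses that into a direct write for brevity.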
The detector is also implemented in Golang. It has init and invoke functions. The init function initializes the peers with the number of devices and their names and initializes the primary data model in AIBC. The primary data model is obtained by applying the outlier detection algorithm to a synthesized dataset. We use a matrix where each column is new data obtained from an IoT network with devices. The invoke function is responsible for updating the data model through the outlier detection algorithm, detecting outlier data within a block, and discarding it. After a block is sent to the peers by the orderer, the invoke function of the detector is executed on each peer to carry out step 1 of the consensus protocol. It discards the outlier data from that block and sends the result to the peer to initiate step 2 of the consensus protocol.
IV-B Network Latency
The invoke function of the designed detector incurs several delays: the outlier detection algorithm delay, the model update delay, the dataset update delay, and the devices state update delay. The outlier detection algorithm delay is the time spent executing this algorithm to infer outlier data in the latest readings from the devices in Layer 1. The model update delay is the time it takes to update the model obtained by the outlier detection algorithm. Note that outlier detection is a machine learning algorithm in which the learned detector is updated at each time slot based on the observed data. The model is updated based on the last readings from the different devices in Layer 1. The dataset update delay is the delay associated with updating the last 100 readings from all devices in the ledger. The devices state update delay is the time spent updating the new values of the devices in the ledger.
We show the probability density function (pdf) of all the mentioned delays for our implemented IoT network withdevices in time slots in Figs. 4 and 5. A hyperledger fabric network reach consensus over a new block in milliseconds  and adding that block to the copy of ledger would take about microseconds according to Fig. 4(b). These delays exist in any hyperledger fabric implementation and are acceptable for a smart home network. Our designed detector would add some new delay components for the network. Outlier detection algorithm would take about milliseconds as shown in Fig. 4(a) which will not affect the performance of a smart home network. However, according to Fig. 5, model update delay and dataset update delay will take about and seconds respectively which is not acceptable for a smart home network. However, dataset update delay and model update delay can be easily eliminated if we use pre-learned data using a proper dataset. Therefore, our proposed architecture will incur an additional delay of about milliseconds for outlier detection which is acceptable for a smart home network since consensus is reached in milliseconds. The delays for different sections of our implementation are compared in Fig. 6.
IV-C Two-step Consensus Protocol Accuracy
The last factor to evaluate our implementation is the accuracy of our proposed algorithm in terms of probability of detection and probability of false alarm and its impact on the fault tolerance of the proposed architecture. The fault tolerance for different probabilities of detection and false alarm is illustrated in Fig. 7(a). Fault tolerance of less than is denoted as fail zone which is only possible if the detector is designed very inexpertly with a significantly high probability of false alarm or low probability of detection. In general, the performance of the network is enhanced and the fault tolerance of the AIBC network for different detectors (different probability of detection and probability of false alarm) is found to be more than , more than , or more than as shown in Fig. 7(a).
Fig. 7(b) shows the accuracy of our algorithm on a synthesized dataset with large number of faulty devices. Although there exists a large number of malicious devices (outlier data) in the dataset, the detector is learned such that the fault tolerance of the network is more than . This is inferred by choosing any operating point in Fig. 7(b) and the corresponding point in Fig. 7(a). As an example, an operating point of the detector is shown at and . The fault tolerance of the network in this point is according to Inequality (5).
V Conclusion
We proposed the AIBC network with a 2-step consensus protocol using an outlier detection algorithm and PBFT. The outlier detection algorithm acts as the first consensus step: it verifies the compatibility of new data and discards suspicious data in order to increase the fault tolerance of the network for the second consensus step (PBFT). We measured the latency, accuracy, and performance of our method. Results reveal a significant increase in the fault tolerance of Hyperledger Fabric with our detector. We employed an outlier detection scheme; however, recent advances in artificial intelligence, such as deep learning and reinforcement learning, can be exploited to design the detector in a more robust manner.
References
[1] T. M. Fernández-Caramés and P. Fraga-Lamas. "A Review on the Use of Blockchain for the Internet of Things." IEEE Access 6 (2018).
[2] J. Lin, Z. Shen, and C. Miao. "Using blockchain technology to build trust in sharing LoRaWAN IoT." In Proceedings of the 2nd International Conference on Crowd Science and Engineering, pp. 38-43. ACM, 2017.
[3] O. Novo. "Blockchain meets IoT: An architecture for scalable access management in IoT." IEEE Internet of Things Journal 5, no. 2 (2018).
[4] Y. Dai, D. Xu, S. Maharjan, Z. Chen, Q. He, and Y. Zhang. "Blockchain and deep reinforcement learning empowered intelligent 5G beyond." IEEE Network 33, no. 3 (2019): 10-17.
[5] M. Salimitari, M. Chatterjee, M. Yuksel, and E. Pasiliao. "Profit maximization for bitcoin pool mining: A prospect theoretic approach." In 2017 IEEE 3rd International Conference on Collaboration and Internet Computing (CIC), pp. 267-274. IEEE, 2017.
[6] M. Salimitari and M. Chatterjee. "A survey on consensus protocols in blockchain for IoT networks." arXiv preprint arXiv:1809.05613v4 (2018).
[7] A. Reyna, C. Martín, J. Chen, E. Soler, and M. Díaz. "On blockchain and its integration with IoT. Challenges and opportunities." Future Generation Computer Systems 88 (2018): 173-190.
[8] Z. Zheng, S. Xie, H. Dai, and H. Wang. "Blockchain challenges and opportunities: A survey." Work Pap.–2016 (2016).
[9] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang. "An overview of blockchain technology: Architecture, consensus, and future trends." In 2017 IEEE International Congress on Big Data (BigData Congress), pp. 557-564. IEEE, 2017.
[10] S. Gupta and M. Sadoghi. "Blockchain Transaction Processing." (2019): 1-11.
[11] T. N. Dinh and M. T. Thai. "AI and blockchain: A disruptive integration." Computer 51, no. 9 (2018): 48-53.
[12] S. Dey. "Securing majority-attack in blockchain using machine learning and algorithmic game theory: A proof of work." In 2018 10th Computer Science and Electronic Engineering (CEEC), pp. 7-10. IEEE, 2018.
[13] M. Udell, C. Horn, R. Zadeh, and S. Boyd. "Generalized low rank models." Foundations and Trends® in Machine Learning 9, no. 1 (2016).
[14] T. Baltrušaitis, C. Ahuja, and L.-P. Morency. "Multimodal machine learning: A survey and taxonomy." IEEE Transactions on Pattern Analysis and Machine Intelligence 41, no. 2 (2018): 423-443.
[15] M. Joneidi, P. Ahmadi, M. Sadeghi, and N. Rahnavard. "Union of low-rank subspaces detector." IET Signal Processing 10, no. 1 (2016): 55-62.
[16] A. Esmaeili and F. Marvasti. "A novel approach to quantized matrix completion using Huber loss measure." IEEE Signal Processing Letters 26, no. 2 (2019): 337-341.
[17] S. Li, M. Shao, and Y. Fu. "Multi-view low-rank analysis with applications to outlier detection." ACM Transactions on Knowledge Discovery from Data (TKDD) 12, no. 3 (2018): 32.
[18] H. Zhao, H. Liu, Z. Ding, and Y. Fu. "Consensus regularized multi-view outlier detection." IEEE Transactions on Image Processing 27, no. 1 (2017): 236-248.