I Introduction
The Internet of Things (IoT) will interconnect a plethora of devices that collect data from their environment, process such data, and transmit valuable information to end-user applications such as health monitoring systems, drone delivery systems, self driving cars, and smart grids[1, 2, 3, 4, 5]. However, an effective deployment of these diverse IoT services requires an efficient intrusion detection system (IDS) that assures secure and reliable data transmission from IoT devices (IoTDs) by detecting anomalous activities such as sending data which is different from their normal state [5].
The use of IDS solutions has been studied extensively in IoT networks[6, 7, 8, 9, 10, 11, 12, 13]. For instance, the works in [6] and [7] proposed signature-based methods in which the network behavior is compared with a set of attack signatures which is stored in a database. Moreover, the authors in [8, 9, 10] have proposed anomaly-based IDSs in which the activities of a network are compared to the normal behavior of the system and an alarm is triggered whenever the deviation from the normal state exceeds a threshold. Furthermore, new techniques are proposed in [11, 12, 13] to compare the state of the system with predefined specifications such as the maximum capacity of links, packet size, or abstract rules based on IoT traffic.
However, due to several unique characteristics of IoT systems such as their scale and privacy-preserving nature, traditional IDSs such as those proposed in [6, 7, 8, 9, 10, 11, 12, 13] may not be effective. In traditional methods, the system administrator implements the IDS on computationally capable data centers or nodes that process the data from the entire IoT system or from a subset of neighboring nodes. However, due to the large-scale nature of the IoT, relying on centralized nodes to deploy IDS solutions exposes such nodes to attacks and, thus, eventually the rest of the IoT can also be vulnerable if the centralized IDS is compromised[14]. In addition, most IDSs require the availability of large amounts of data points related to the normal state of the network and failure scenarios[6, 7, 8, 9, 10, 11, 12, 13]. However, in IoT systems, each IoTD may have a dataset that only contains a limited portion of the network’s state. Moreover, in many applications such as health monitoring systems or financial activities, the end user may not intend to share its available dataset with the administrator of the system which makes the data-centered IDS solutions ineffective [5].
Recently, a number of distributed IDS and attack detection methods have been proposed for improving IoT security in [15, 16, 17, 18, 19, 20]. The authors in [15] and [16]
proposed anomaly detection approaches in which every node monitors its neighbors and, in case of abnormal behavior of a neighbor, breaks the data link from that neighbor. The authors in
[17] presented the challenges related to distributed private datasets and cloud-centric IoT systems. The works in [18] and [19]developed distributed deep learning algorithms for cyber attack detection in fog-to-things systems. In
[20], a clustering method has been proposed that combines the measurements of IoTDs before sending a compact description of their data to other nodes so as to minimize the communication overhead. However, the proposed IDSs in [6, 7, 8, 9, 10, 11, 12, 13] are centralized and not applicable for large scale IoT systems. In addition, the distributed IDS methods in [15, 16, 17, 18, 19, 20] cannot be applied to IoTDs that seek to preserve the privacy of their data.The main contribution of this paper is, thus, to propose a distributed generative adversarial network (GAN) architecture as an IDS for IoT systems. Recently, GANs emerged as an effective unsupervised anomaly detection approach in computer vision and time series applications
[21, 22, 23, 24, 25, 26]. However, these computer vision and time-series works consider a centralized GAN that has access to all of the data points. In contrast, in an IoT, a centralized GAN may cause communication overhead and increase the vulnerability of the IoT to the attacks on central units. The proposed distributed GAN-based IDS enables the IoTDs to monitor their neighbors in order to detect intrusion with minimum dependence on a central unit. In particular, the proposed distributed IDS can detect any type of attack that manipulates the integrity of the IoTDs such as bad data injection attacks. Moreover, we analytically show the superiority of the proposed distributed GAN-based IDS in terms of accuracy and intrusion detection probability compared to a single standalone GAN IDS. Simulation results show that, for a daily activity dataset
[27], the proposed distributed GAN-based IDS has up to 20% higher accuracy, 25% higher precision, and 60% lower false positive rate compared to a standalone GAN-based IDS. To the best of our knowledge, this paper is the first to propose a distributed GAN architecture as an IDS in privacy preserving IoT systems.Ii System Model
Consider an IoT system composed of a set of IoTDs in which each IoTD owns a set of its previously transmitted data points,, that follows a distribution , where can be time series, financial records, or health monitoring datasets depending on the IoT application. We consider contains data points from the normal state of the IoTD in which there is no intrusion to the IoT. We also let , where is the total available data with a distribution . In this model, every IoTD tries to learn a generator distribution over its available dataset such that and use that distribution to detect an intrusion to the system. The intrusion in our system is any activity by an attacker that causes an IoTD to communicate data points which do not follow its data distribution . In fact, if an IoTD knows the distribution of its own normal state, it can easily discriminate a data point that is not similar to the normal state distribution. To learn the distribution at every IoTD , we define a prior input noise with distribution and a mapping
from this random variable
to data space, whereis an artificial neural network (ANN) with parameters
. An ANN, usually, consists of artificial neurons and activation functions which map an input to an output. We define another ANN called
discriminator for every IoTD that receives a data point and outputs a value between and . When the output of the discriminator is closer to , then the received data point is a normal state and when the output is closer to then the received data is an anomaly at IoTD .Every IoTD ’s generator ANN tries to generate data points close to the normal state data in order to find the best approximation of . On the other hand, every IoTD’s discriminator aims at discriminating the generated data points from its own dataset. In fact, the generated data points mimic anomalous state of the system because they are generated from a distribution that is not equal to , because are randomly chosen for an untrained ANN. Thus, the discriminator outputs 0 values for the generator [28]. Therefore, the generator and discriminator at every IoTD interact to find the the optimal and such that the generator can generate data points close to the normal state while the discriminator can discriminate between the abnormal (anomalous) and normal data points. This interaction can be modeled using a game-theoretic framework [28] in which we define a local value function at every IoTD as follows:
(1) |
The value function (II), jointly quantifies how close is the generator’s generated data points to the normal state and how good the discriminator can discriminate between the normal and abnormal data points. In (II), the first term is defined to force the discriminator to produce values equal to 1 for real data. Meanwhile, the second term penalizes any anomalous point generated by the generators. While every IoTD’s generator will seek to minimize the value function defined in (II), the discriminator tries to maximize this value. Therefore, the optimal solutions for the discriminator and generator can be derived from the following minimax problem[28]:
(2) |
However, since every IoTD has access to only its own dataset, the optimization problem for a standalone IDS in which every IoTD has access only to its own dataset:
(3) |
where
(4) |
Next, we show that a standalone IDS cannot optimally detect intrusion. To this end, first, we derive the optimal value of a standalone IDS.
Theorem 1.
The optimal value for a standalone IDS will be:
(5) |
where is the Jensen-Shanon divergence between distributions and .
Proof.
To find the optimal strategies of the generator and discriminator of a standalone IoTD we have to solve (3). From [28, Theorem 1] we know that the solution of (3) is at . Now from (II) we have:
(6) |
For any , (6) reaches its minimum at . Since we know the standalone IoTD reaches its optimal point at , then we will have:
(7) |
which simplifies to (5). ∎
From [28, Theorem 1] we know that the optimal value for a centralized IDS which has access to all of the data points is . Thus, from Theorem 1, we can derive that the standalone IDS cannot minimize the value function as good as a centralized IDS that is connected to all the data points since the optimal value of a standalone IDS is larger than that of a centralized IDS by a value of . Next, we also show that the true positive rate, , i.e. correctly detected anomaly rate, of an standalone IDS is always less than a centralized IDS that has access to all of the data points.
Proposition 1.
The of a standalone IoTD anomaly detector is always less than an IoTD that has access to all of the data points.
Proof.
From [28, Proposition 1], we know that the optimal discriminator of a centralized IDS that has access to the full dataset is . Therefore, the difference between the discriminator of the standalone IoTD,, and the discriminator of a GAN with the full data, , shows the error of the standalone IoTD. Thus, the of the standalone IoT is complement of such error as follows:
(8) |
where (8) reaches its maximum when , which contradicts with the fact that . Therefore true positive ratio of an standalone IoTD is smaller than the IoT that has access to all data points. ∎
Although the standalone GAN may be able to detect internal intrusions, i.e., the IoTD can detect if an adversary has manipulated its own data, however, an attacker can also manipulate the IDS itself to stay stealthy at the attacked IoTD. One approach to stop the attacker from staying stealthy in internal intrusions can be to implement a centralized IDS that monitors all of the IoTDs. However, in this approach the communication overhead can be extremely high due to the large-scale nature of the IoT system. Moreover, in this case, the central IDS should have access to all of the IoTDs’ datasets which may not be practical in a privacy preserving IoT system. Furthermore, a centralized IDS makes the IoT system vulnerable to attacks on the central unit. Therefore, next, we propose a novel distributed GAN architecture that provides an effective IDS by adapting a mechanism in which every IoTD monitors the neighbor IoTDs.
Iii Distributed GAN-based IDS for IoT Systems
In our proposed distributed GAN-based IDS, we build on the architecture from [29]. The goal of our distributed GAN is to find a discriminator at every IoTD without sharing their datasets with each other such that every IoTD’s discriminator can discriminate if a new data point follows the total data distribution, . The main difference of the distributed IDS with the standalone IDS is that the standalone IDS learns to compare a new data point with its own data distribution, , however, in the proposed distributed IDS, every IoTD can compare a new data point with the distribution of the total data . Therefore, since in the distributed IDS every IoTD’s discriminator knows the distribution of the total data, then every IoTD can detect intrusion on other IoTDs as well.
In our distributed GAN, during only the training phase, we use a central unit that has a generator where is the weights of the generator’s ANN. Furthermore, every IoTD only has a discriminator denoted by where is the weights of every discriminator’s ANN. In this architecture, every IoTD is connected to at least one other IoTD in a wireless network such that the connection graph of the IoTDs must construct a cycle. Moreover, at the training phase, every IoTD is connected to the central unit. We define as the period of epochs at which IoTDs communicate with the center and
as the period of epochs at which the IoTDs communicate with each other. An epoch is a training session during which all of the data points have been used to update the ANN weights. Next, we explain the training phase for the distributed GAN-based IDS architecture as shown in Fig.
1.Iii-a Training of the Discriminators
Every epochs, for a fixed , the generator generates batches of anomalous points, i.e. fake points that are not derived from the actual datasets. Moreover, the discriminator at each IoTD samples a batch of points from its available dataset . The generator sends every generated batch to each IoTD which, in turn, calculates the following loss value:
(9) |
characterizes an approximation of the value function for each IoTD’s discriminator in (II). Then using a gradient descent method such as the Adam optimizer [30], the IoTD updates its own weights, .
Iii-B Training of the Central Generator
Every epochs, every IoTD uses to calculate the following loss:
(10) |
which is an approximation of the value function for each IoTD’s generator in (II). Note that, in [28], it has been shown that for practical implementation the generator’s loss should not include the term since discriminator usually converges faster than the generator. Next, every IoTD sends this value to the center. Then, the center uses the received loss values from IoTDs to calculate its average loss:
(11) |
Then, the center applies a gradient descent method on the generator to update its weight by minimizing .
Iii-C Discriminator Weight Swapping
Every epochs, every IoTD sends its discriminator’s weights to the neighbor IoTD and receives the discriminator weights of another IoTD. Note that, as mentioned before, the connection graph should include a cycle covering all of the IoTDs. This helps IoTDs to receive the discriminator weights of all of the other IoTDs, eventually. This phase enables the the system to preserve privacy during training because the IoTDs do not share their data but, instead, they share the weights of their discriminators. Using this approach, the generator receives the loss value from all of the discriminators. Moreover, every IoTD’s discriminator will be trained on all of the datasets. Therefore, for enough training epochs, the central generator will converge to the distribution of the total dataset . In addition, the discriminators of the IoTDs will be similar to a GAN discriminator that has access to the total dataset. Thus, in the distributed GAN architecture, each IoTD’s discriminator can detect intrusion on its own data as well as neighbor IoTDs.
Iii-D Intrusion Detection Phase
After the convergence of the distributed GAN, there will be no need for the central unit as all of the discriminators at IoTDs can detect the intrusion to the system. Thus, every IoTD IoTD will run its observed real-time data through its own discriminator as well as one of it’s neighbor’s discriminator. As shown in Proposition 1, the optimal discriminator will output for a normal state data point. Therefore, to detect an intrusion to the system, the output of the discriminator can be compared to and if the output is close to , the IoTD will be in a normal state. However, if the output is closer to 0 or 1 the IoTD will be under attack. This approach helps the IoT system to detect an intrusion to the system without dependence on a central unit since every IoTD can also check its neighbor’s data. The intrusion detection phase of the proposed distributed GAN-based IDS is shown in Fig. 1.
Iv Simulation Results
For our simulations, we use a daily activity recognition dataset [27] that is collected from 30 subjects with different genders, ages, heights, and weights using a smartphone. We use this dataset since it can be a good example of health datasets that are collected by wearable IoTDs. Such health datasets are private to their owners and, thus, the owners may not intend to share them. The collected data is for 12 activities as follows: walking forward, walking left, walking right, walking upstairs, walking downstairs, running forward, jumping, sitting, standing, sleeping, elevator up, and elevator down. The dataset contains 2,365 recordings in total and every recording in this dataset has 561 frequency and time domain features. We split this dataset into training and test datasets with a 4 to 1 ratio.
We consider three IDSs: a standalone GAN that has access only to its own dataset, a centralized GAN that has access to all datasets, and the proposed distributed GAN. We use ANN architectures similar to the ones in [28]
. To train the ANNs we use the Tensorflow library, a single NVIDIA P100 GPU, and 20 Gigabits of memory. We train the ANNs on the training datasets and stop the training after 5,000 epochs. To model the intrusion, we consider an attacker who can inject false data to the features of the training dataset. We consider 10 scenarios with different attack-to-signal power ratios in which a random Gaussian noise is added to every feature of the training data points. In addition, we consider an internal attack detection in which every IoTD checks its own data and an external attack detection in which every IoTD checks its neighbor IoTDs.
Fig. 2 shows the output of the discriminator during the training phase. We can see from 1(a) that the proposed IDS reaches the optimal discriminator value which is after convergence. However, the standalone and central IDSs reach values different from the optimal points. A standalone IDS has a non optimal discriminator output because it does not have access to the complete data. In addition, the central IDS also cannot reach the optimal discriminator since having access to all data points results in overfitting the ANN. As we can see from Fig. 1(a) the central IDS’s discriminator starts increasing around epoch 1000 after getting close to the optimal discriminator value. Such behavior of the central approach is aligned with the GAN-centric results of in [29]. Also Fig. 1(b) shows that the proposed distributed IDS reaches the optimal discriminator even for the test dataset, although such dataset has not been used in the training. This shows that, the discriminators of the distributed GANs are successful in approximating the total dataset and can discriminate even the unseen data points.
Fig. 3 compares the accuracy of three IDSs against internal and external attacks. The accuracy is calculated as , where is the true positive or the correctly detected anomaly ratio, is the false positive or the falsely detected anomaly ratio, is the true negative or the correctly assigned normal ratio, and is the false negative or the falsely assigned normal ratio. Fig. 3 shows that the proposed distributed IDS outperforms both of the central and standalone IDSs by up to 15% and 20% in terms of accuracy, respectively, for both the internal and external attacks. Note that the internal and external attacks for a central IDS are the same since there is just one central unit that discriminates the IoTD data points.
Fig. 4 shows the precision of the three IDSs against internal and external attacks with different attack to signal ratios. In IDSs, precision is calculated as and a precision score of 1.0 means that every datapoint labeled as intrusion is indeed an intrusion. We can see from Fig. 4 that the proposed distributed IDS has up to 25% and 20% higher precision compared to the standalone IDS and the central IDS, respectively. This means that the proposed distributed IDS has higher confidence of detecting intrusion to the IoT.
Fig. 5 shows the false positive rate of the IDSs which is calculated as . In IDSs, a false positive rate is the ratio between the number of normal state data points wrongly categorized as intursion () and the total number of actual normal state data points (). Note that, since and are calculated from the evaluation of normal data points, the power ratio of the attack will not affect the false positive rate. We can see from Fig. 5 that the proposed distributed IDS has significantly less false positive rate than the standalone and central IDSs. In addition, both distributed and standalone IDSs have slightly lower false positive rate for internal attacks compared to external attacks.
V Conclusion
In this paper, we have proposed a distributed GAN-based IDS solution that can detect intrusion to the IoT with minimum dependence on a central unit. In this architecture, every IoTD can monitor its own data as well as neighbor IoTDs to detect internal and external attacks. In addition, the proposed distributed IDS does not require sharing the datasets between the IoTDs, thus, it is practical to implement in IoTs that preserve the privacy of user data such as health monitoring systems or financial applications. We have shown analytically that the proposed distributed IDS can outperform a standalone IDS that has access only to the dataset of a single IDS. Simulation results that use a real-world a daily activity recognition dataset show that the proposed distributed GAN-based IDS has up tp 20% higher accuracy, 25% higher precision, and 60% lower false positive rate compared to a standalone IDS.
References
- [1] M. Mozaffari, W. Saad, M. Bennis, Y. Nam, and M. Debbah, “A tutorial on uavs for wireless networks: Applications, challenges, and open problems,” IEEE Communications Surveys Tutorials, 2019.
- [2] A. Ferdowsi and W. Saad, “Deep learning-based dynamic watermarking for secure signal authentication in the internet of things,” in IEEE International Conference on Communications (ICC), Kansas City, MO, USA, May 2018, pp. 1–6.
- [3] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of things: A survey on enabling technologies, protocols, and applications,” IEEE Communications Surveys Tutorials, vol. 17, no. 4, pp. 2347–2376, June 2015.
- [4] W. Saad, M. Bennis, and M. Chen, “A vision of 6g wireless systems: Applications, trends, technologies, and open research problems,” arXiv preprint arXiv:1902.10265, 2019.
- [5] B. B. Zarpelão, R. S. Miani, C. T. Kawakani, and S. C. de Alvarenga, “A survey of intrusion detection in internet of things,” Journal of Network and Computer Applications, vol. 84, pp. 25 – 37, 2017.
- [6] H.-J. Liao, C.-H. R. Lin, Y.-C. Lin, and K.-Y. Tung, “Intrusion detection system: A comprehensive review,” Journal of Network and Computer Applications, vol. 36, no. 1, pp. 16 – 24, 2013.
- [7] C. Liu, J. Yang, R. Chen, Y. Zhang, and J. Zeng, “Research on immunity-based intrusion detection technology for the internet of things,” in Proc. of the International Conference on Natural Computation, vol. 1, Shanghai, China, July 2011.
- [8] R. Mitchell and I.-R. Chen, “A survey of intrusion detection techniques for cyber-physical systems,” ACM Comput. Surv., vol. 46, no. 4, pp. 55:1–55:29, Mar. 2014.
- [9] T.-H. Lee, C.-H. Wen, L.-H. Chang, H.-S. Chiang, and M.-C. Hsieh, “A lightweight intrusion detection scheme based on energy consumption analysis in 6lowpan,” in Advanced Technologies, Embedded and Multimedia for Human-centric Computing, Y.-M. Huang, H.-C. Chao, D.-J. Deng, and J. J. J. H. Park, Eds. Dordrecht: Springer Netherlands, 2014, pp. 1205–1213.
- [10] D. H. Summerville, K. M. Zach, and Y. Chen, “Ultra-lightweight deep packet anomaly detection for internet of things devices,” in Proc. of IEEE 34th International Performance Computing and Communications Conference (IPCCC), Nanjing, China, Dec 2015, pp. 1–8.
- [11] S. Misra, P. V. Krishna, H. Agarwal, A. Saxena, and M. S. Obaidat, “A learning automata based solution for preventing distributed denial of service in internet of things,” in Proc. of the International Conference on Internet of Things and 4th International Conference on Cyber, Physical and Social Computing, Dalian, China, Oct 2011.
- [12] A. Le, J. Loo, K. K. Chai, and M. Aiash, “A specification-based ids for detecting attacks on rpl-based network topology,” Information, vol. 7, no. 2, 2016.
- [13] J. P. Amaral, L. M. Oliveira, J. J. P. C. Rodrigues, G. Han, and L. Shu, “Policy and network-based intrusion detection system for ipv6-enabled wireless sensor networks,” in Proc. of IEEE International Conference on Communications (ICC), Sydney, NSW, Australia, June 2014, pp. 1796–1801.
- [14] A. Ferdowsi and W. Saad, “Deep learning for signal authentication and security in massive internet-of-things systems,” IEEE Transactions on Communications, vol. 67, no. 2, pp. 1371–1387, Feb 2019.
- [15] N. K. Thanigaivelan, E. Nigussie, R. K. Kanth, S. Virtanen, and J. Isoaho, “Distributed internal anomaly detection system for internet-of-things,” in Proc. of 13th IEEE Annual Consumer Communications Networking Conference (CCNC), Las Vegas, NV, USA, Jan 2016, pp. 319–320.
- [16] C. Cervantes, D. Poplade, M. Nogueira, and A. Santos, “Detection of sinkhole attacks for supporting secure routing on 6LoWPAN for Internet of Things,” in Proc. IFIP/IEEE International Symposium on Integrated Network Management (IM), Ottawa, ON, Canada, May 2015, pp. 606–611.
- [17] I. Butun, B. Kantarci, and M. Erol-Kantarci, “Anomaly detection and privacy preservation in cloud-centric internet of things,” in Proc. of IEEE International Conference on Communication Workshop (ICCW), London, UK, June 2015, pp. 2610–2615.
- [18] A. Abeshu and N. Chilamkurti, “Deep learning: The frontier for distributed attack detection in fog-to-things computing,” IEEE Communications Magazine, vol. 56, no. 2, pp. 169–175, Feb 2018.
- [19] A. A. Diro and N. Chilamkurti, “Distributed attack detection scheme using deep learning approach for Internet of Things,” Future Generation Computer Systems, vol. 82, pp. 761 – 768, 2018.
- [20] S. Rajasegarar, C. Leckie, and M. Palaniswami, “Hyperspherical cluster based distributed anomaly detection in wireless sensor networks,” Journal of Parallel and Distributed Computing, vol. 74, no. 1, pp. 1833 – 1847, 2014.
- [21] M. Ravanbakhsh, M. Nabi, E. Sangineto, L. Marcenaro, C. Regazzoni, and N. Sebe, “Abnormal event detection in videos using generative adversarial nets,” in Proc. IEEE International Conference on Image Processing (ICIP), Beijing, China, Sep. 2017, pp. 1577–1581.
- [22] M. Ravanbakhsh, E. Sangineto, M. Nabi, and N. Sebe, “Training adversarial discriminators for cross-channel abnormal event detection in crowds,” in Proc. IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, USA, Jan 2019, pp. 1896–1904.
- [23] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth, and G. Langs, “Unsupervised anomaly detection with generative adversarial networks to guide marker discovery,” in Information Processing in Medical Imaging, M. Niethammer, M. Styner, S. Aylward, H. Zhu, I. Oguz, P.-T. Yap, and D. Shen, Eds. Boone, NC, USA: Springer International Publishing, 2017, pp. 146–157.
- [24] H. Zenati, C. S. Foo, B. Lecouat, G. Manek, and V. R. Chandrasekhar, “Efficient gan-based anomaly detection,” arXiv preprint arXiv:1802.06222, 2018.
- [25] Y. Intrator, G. Katz, and A. Shabtai, “Mdgan: Boosting anomaly detection usingmulti-discriminator generative adversarial networks,” arXiv preprint arXiv:1810.05221, 2018.
- [26] D. Li, D. Chen, J. Goh, and S.-K. Ng, “Anomaly detection with generative adversarial networks for multivariate time series,” arXiv preprint arXiv:1809.04758, 2018.
- [27] “Transition-aware human activity recognition using smartphones,” Neurocomputing, vol. 171, pp. 754 – 767, 2016.
- [28] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems 27, 2014, pp. 2672–2680.
- [29] C. Hardy, E. L. Merrer, and B. Sericola, “Md-gan: Multi-discriminator generative adversarial networks for distributed datasets,” arXiv preprint arXiv:1811.03850, 2018.
- [30] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
Comments
There are no comments yet.