Security Analysis of Big Data on Internet of Things

08/28/2018 ∙ by S. Banaeian Far, et al. ∙ Islamic Azad University

The volume of data exchanged in this network is rising with the expansion of the Internet of Things (IoT) and its growing familiarity in recent years. As requests to join the network and use its services increase, the need to maintain privacy and security is felt more than ever, and meeting it has become a challenge. Keeping users secure in small networks seems simpler, and the threats more predictable. As the number of network users, most of which are sensors, grows, the volume of data grows as well. As a result, controlling and securing this volume of data has become a seriously challenging task for network routers and servers. We analyze these challenges in this paper. The paper aims to maintain the security of users and to present strategies for overcoming these problems. Some methods, such as combining encryption functions with secure protocols, are proposed. Another way to overcome big data challenges is the smart design of communication protocols. Finally, we point out some future challenges of big data on the Internet of Things network.




I Introduction

Not only does the number of individuals connected to the internet increase day by day, but so does the number of things connected to the internet to exchange data and interact with each other. In the modern era, smart devices and things can transmit and receive information as well as interact with each other. This interaction has reached a level at which a major share of data exchange is related to the Internet of Things (IoT). It is forecast that the number of smart things connected to the internet in the year is going to increase times compared to [1]. Sensors are the dominant things connected to the internet and exchanging data, and are considered one of the greatest sources of data [2]. The IoT can be considered a smart network consisting of things interacting with each other [3].
Things in the network produce and exchange data of great volume, velocity, and variety. The term big data (BD) has been coined for data with these characteristics. BD has no clear boundary and is usually classified based on its specifications [4, 5].

  • Related work: In , the importance of BD in industry was investigated, with the conclusion that complexity is the most important challenge of BD [6]. Another survey carried out in the same year considered sensors and radio frequency identification (RFID) systems to be the most important sources of BD [7]. In , Jakobik investigated some security challenges and privacy settings of BD [8]. In the same year, Ye et al. investigated and presented a classification of attacks in various sectors of BD. In , Khan presented the use of BD in the fourth industrial revolution [9]. In this scheme, the IoT, one of the generators of BD, plays an important role and is referred to as the Industrial Internet of Things (IIoT). In the same year, Sivarajah et al. expounded some challenges related to data, processing, and data management, and then investigated some types of BD [10].

  • Our contribution: To overcome BD challenges, we suggest several methods, as follows:

    • Using encryption functions

    • Network security

    • Protecting programs

    • Using secure protocol

    • Using lightweight encryption functions

    • Smart design of communication protocols

    We discuss these methods in the section on methods to overcome the challenges.

  • Paper organization: After presenting the required definitions, the preliminaries section introduces the fundamentals of the IoT and BD. The requirements section then covers privacy and security in IoT and BD communication. Afterwards, the existing challenges of BD communication in the IoT are expounded. Next, some methods to overcome these challenges are presented from the viewpoint of network security. Finally, after the conclusion, the future challenges facing BD communication in the IoT are discussed.

II Preliminaries and Definitions

Before a detailed discussion, some familiarity with the basic concepts is needed: the network under discussion, the instruments used, and the situations encountered. A brief description of the IoT and related concepts follows.

II-A Internet of Things

The IoT refers to all things that are able to transmit or receive information and can interact with each other. It can be described as an expanding network that pervades all fields. This network is capable of gathering, processing, and analyzing the data transmitted by all network elements. The IoT vision predicts that in the future the world will be controlled by physical things connected to each other through a single infrastructure [3]. In other words, the IoT is a giant network in which a huge mass of things, sensors, and smart devices interact with each other [11]. Most of this network is made up of sensors, but there are other entities, such as , smart cards, smartphones, and computers, which share the required information.

II-B Big Data

No clear and transparent definition of BD can be provided [5]. But, as is evident from the term big data, which refers to data sets of high volume, the most important point is the volume of data [13]. Many definitions have been presented, depending on the objective and research area. At the moment, billion individuals are connected to the internet and billion use cellular phones. It is expected that the number of devices connected to the internet exceeds billion devices by . It is predicted that the number of messages transmitted by this year () exceeds times the number of messages transmitted by ; a change of this magnitude is difficult to imagine. The rate of BD production is not clear, but some BD producers are YouTube, Facebook, and Twitter [1]. Table I shows a number of other BD producers.
Nowadays, wireless sensor networks can be considered BD producers as well [2, 7]. It is predicted that the IoT network will hold the largest share of data in the future.

Source      Data Type and Size
YouTube     Upload over; billion access; Download billion
Facebook    billion likes; Upload TB of data; billion of users
Twitter     billion users; billion tweets
Google      billion users
Google      million search; Process PB of data
Apple       application download
TABLE I: Some big data producers [1]

II-C Cloud Server and Cloud Computing

Information in IoT-based systems is stored and processed on servers with very high capacity and processing power. Cloud servers collect their information from sensors, s and other smart devices and store it in their memory [12]. Cloud computing grew out of industry, supported by the powerful servers of companies such as Google and Amazon [11]. The need for cloud servers in the IoT network is strongly felt, since this network involves light devices that cannot process and store BD.

II-D Big Data Application on Internet of Things

What is the role of BD in the IoT, or the role of the IoT in BD? In answer, it can be said that the two are interdependent. A clear example of BD can be observed in the data of the IoT network. Several years ago, social networks such as Facebook were considered the giant producers of BD, and social media still have a noticeable role, but predictions show that in the near future the IoT will gain the highest share of exchanged information. As mentioned above, the future outlook for the IoT focuses on large-scale smart cities [14]. For example, in , Sun presented a smart city in four parts: viability, protection, rejuvenation, and stability. In this scheme, the lowest level of the architecture for rendering services is the transmission of information through sensors. After the information is examined in the scheme, the time for client services is announced.
Another application is the transfer of industrial information. The fourth industrial revolution in the world in was the apex of IoT usage. Before that, industry relied on robots, computers, and chips. It was here that the IoT entered to assist industry. The number of sensors, which have the highest share in the IoT, started to increase [15]. Along with this increase, the volume of exchanged control information rose as well.
Another connection between the IoT and BD is the gathering of environmental information, GIS data, and astronomy data through the wireless sensors of the IoT. As the IoT grows, this information grows as well [16].
Philip Chen has expounded another type of dependence of BD on the IoT [13]. Navigation, social media, financial information, health information, astronomy, and smart transportation are among the fields that produce large amounts of information.

III Requirements

With increasing demand for social media and information exchange in these networks, and with the expansion of the IoT into all fields, the number of malicious and fraudulent individuals has increased as well. Therefore, users (individuals or things), whose security is of prime importance, have started to protect their information [8, 17, 18]. The two important factors are as follows:

III-A Privacy

A user's privacy indicates the level of access that other users have to that user's information. Without protection, people can access others' private information [19]. Some researchers have divided privacy into three sections: infrastructure security, information security, and information management [9]. Each user (individual or device) can give others access to its sensitive information. However, it can also limit that access and close off its private data.
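
The grant-and-limit behaviour described above can be illustrated with a minimal sketch. It is not taken from any of the cited schemes; the class name `PrivateRecord` and the identifiers `sensor-17` and `cloud-server` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PrivateRecord:
    """A piece of sensitive information owned by one user or device (hypothetical model)."""
    owner: str
    value: str
    allowed: set = field(default_factory=set)  # IDs the owner has granted access to

    def grant(self, requester: str) -> None:
        # The owner opens its data to one more party.
        self.allowed.add(requester)

    def revoke(self, requester: str) -> None:
        # The owner limits access again; unknown IDs are ignored.
        self.allowed.discard(requester)

    def read(self, requester: str) -> str:
        # Only the owner and explicitly granted parties may read the value.
        if requester == self.owner or requester in self.allowed:
            return self.value
        raise PermissionError(f"{requester} may not read data owned by {self.owner}")


record = PrivateRecord(owner="sensor-17", value="temperature=21.5")
record.grant("cloud-server")      # access is opened
reading = record.read("cloud-server")
record.revoke("cloud-server")     # access is closed again
```

After the call to `revoke`, a further `record.read("cloud-server")` raises `PermissionError`, which corresponds to the user closing off its private data.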

III-B Security

High volume and velocity make security difficult. This can be examined from the perspective of users, service providers, and computer networks. BD security is quite complicated and still not thoroughly understood, and the speed of technology and the volume of information make it more difficult [8].

Many worries exist about the security of information and data. As data are stored in very large storage spaces, access by fraudulent people is probable. Giant companies such as Google, Microsoft, YouTube, and Skype have tried to devise various information security methods, which shows how difficult security protection is on giant servers.

Security threats fall into several categories. Denial of service (DoS) attacks target the infrastructure, while attacks on encryption functions and access control target privacy [9]. Protecting users against these attacks is a security challenge.

IV Threats and Challenges

This is the most important part of our paper, since it deals with the challenge of protection. The challenges can be divided according to the gathering of BD, the analysis of BD, and the use of BD. These include subdivisions such as data challenges, processing challenges, and BD management challenges [21]. Please note that we point out current challenges here; the section on methods focuses on how to overcome the challenges related to security and privacy [4, 6, 8, 10, 13, 20]. We describe some features of BD below:

  • Volume: The volume of data (terabytes and more) is a great challenge in BD. The information is diverse in type: environmental information, medical information, and business information. Facebook, for example, produces terabytes of data every day. As the name BD suggests, volume counts. We do not set a boundary for BD, but it is not small-scale data.

  • Variety: The data are heterogeneous. For example, the data transmitted by sensors come in many forms (e.g. environmental information, sound, images, and even noise).

  • Velocity: When high speed and high volume are added to the diversity and complexity of the data structure, processing becomes a challenging job. Velocity is therefore another feature of BD, and processing high-velocity data is a challenge.

  • Variability: Users usually transmit different data. Google, for instance, receives diverse data from different users and different sources. In some other sources, variability of data is among the four original challenges, as depicted in Fig. 1.

    Fig. 1: The original challenges of big data (4V) [16]

  • Veracity: The structure of the data is quite complicated, and in BD confidence in the data cannot be established.

  • Visualization: Information shall be understandable and legible. Key information from every sender shall be available. For example, eBay has many clients whose information has to be received.

  • Value: The value of the data shall be kept and maintained; none of it shall be deleted. It is very important that the value of the data is not lost or changed.

  • Data acquisition: Gathering information from different sources and keeping it secure is a challenging job.

  • Data mining and cleansing: One of the greatest challenges is to cleanse the giant pool of information of superfluous data and to choose the correct data required.

  • Data aggregation and integration: This challenge is best explained by an example. Imagine a giant social medium such as Twitter, where any tweet may be answered by a retweet. Finding the reply to a given tweet in this great volume of data, and matching each answer to the right question, is the duty of the Twitter server.

  • Complexity: As is clear from the other explanations, as data volume increases, complexity increases as well. This can be divided into three parts: 1) data complexity, 2) computational complexity, and 3) system complexity.

  • Data analysis and modelling: This challenge covers subtasks such as separating information and recovering lost data.

  • Data interpretation: This step resembles building a picture of the received data, which is important for making decisions based on it.

  • Privacy: As mentioned before, privacy is the most challenging task in the digital age. We discussed the importance of this feature in the section on requirements.

  • Security: Security means keeping users’ privacy and preventing the spread of malware. Given the importance of processing BD, its security should also be considered. This means that attackers and malicious users must have no ability to access users’ sensitive information.

  • Data governance: As BD grows, companies and organizations shall try to manage the data properly. Guaranteeing data quality, improving data quality, and preserving the value of the data are among the key elements.

  • Data and information sharing: All organizations shall coordinate in sharing information.

  • Cost: Data costs increase as demand increases. Client data shall be supported throughout the world. For example, Google supports its clients through centers around the world.

  • Data ownership: In addition to security and privacy, data ownership is important as well, and it becomes apparent when data are shared.
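
The data mining and cleansing challenge in the list above can be made concrete with a small sketch. It assumes a hypothetical temperature-sensor record (`Reading`) and a plausible operating range of -40 to 85 degrees Celsius; neither the record layout nor the thresholds come from the cited works.

```python
from typing import Iterable, List, NamedTuple


class Reading(NamedTuple):
    sensor_id: str
    timestamp: int   # seconds since an arbitrary epoch (assumed unit)
    celsius: float


def cleanse(readings: Iterable[Reading],
            low: float = -40.0, high: float = 85.0) -> List[Reading]:
    """Drop implausible values and repeated samples, keeping the input order."""
    seen = set()
    kept = []
    for r in readings:
        if not (low <= r.celsius <= high):
            continue                      # outside the assumed physical range
        key = (r.sensor_id, r.timestamp)
        if key in seen:
            continue                      # same sensor and time already kept
        seen.add(key)
        kept.append(r)
    return kept


raw = [
    Reading("s1", 1, 20.0),
    Reading("s1", 1, 20.0),    # duplicate transmission
    Reading("s1", 2, 999.0),   # implausible value
    Reading("s2", 1, -5.0),
]
clean = cleanse(raw)           # keeps the first and the last reading
```

In a real deployment the rules would depend on the sensor type, and the filtering would more likely run on the cloud server than on the sensor itself.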

V Methods to Overcome the Challenges

As discussed, BD has many challenges, such as 1) security and privacy, 2) data management, and 3) processing challenges.
In this paper, we outline, from the viewpoint of security and privacy, some ways to overcome these challenges in the Internet of Things. The most basic goal in security and privacy is to provide all three features of Confidentiality, Integrity, and Authentication (CIA) [17].

Using encryption functions

In giant networks such as the IoT, the functions and protocols used must be suitable for all things. Many devices have limited processing power and energy and cannot perform heavy key-based encryption. Today, however, many encryption methods have been devised that are fast, secure, and low cost [10, 17].

Network security

Protection techniques are vital [10].

Protecting programs

Protecting programs is vital, since there are attacks that aim to steal information. There are also many IoT programs that are still nascent in , digital signature and . Each has strengths and weaknesses. Sometimes, through simple changes, a program can be made practically useful [10].

Using secure protocol

To protect users’ privacy on the network, secure (and lightweight) protocols should be used. There are many protocols for transferring information, authenticating users, producing digital signatures, and so on. Each is designed to provide particular properties and has strengths and weaknesses, which can be identified by analyzing the protocol in detail. Sometimes a slight change alters the use of a communication protocol, or a simple modification makes a protocol usable in practice.

Using lightweight encryption functions

Using cryptographic functions is one of the best ways to protect information. Since the largest share of future information is expected to belong to sensors and RFIDs, the use of lightweight cryptographic functions is very important. Because most of the elements on the IoT are sensors or smart cards, they can be provisioned with a symmetric cryptographic key at the factory.
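
As a rough illustration of how a factory-provisioned symmetric key might be used, the sketch below shows a challenge-response check between a device and a server. HMAC-SHA256 from the Python standard library stands in for a lightweight primitive; this is an assumption made so that the example is runnable, not a recommendation from the paper. The function names `provision_key`, `device_response`, and `server_verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def provision_key() -> bytes:
    """Stand-in for the factory step: generate a 128-bit symmetric key."""
    return secrets.token_bytes(16)


def device_response(device_key: bytes, challenge: bytes) -> bytes:
    """Device side: a single keyed hash over the server's challenge."""
    return hmac.new(device_key, challenge, hashlib.sha256).digest()


def server_verify(stored_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected response and compare in constant time."""
    expected = hmac.new(stored_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


key = provision_key()                    # written to the device and stored by the server
challenge = secrets.token_bytes(16)      # fresh random value chosen by the server
response = device_response(key, challenge)
accepted = server_verify(key, challenge, response)
```

The device performs one symmetric operation per authentication and needs no asymmetric cryptography, which matches the motivation for lightweight functions on sensors and smart cards.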

Smart design of communication protocols

Lightweight protocols and lightweight cryptographic functions cannot always be used for network security; sometimes it is necessary to use asymmetric encryption, digital signatures, and other cryptographic functions that greatly increase the computing load on the network. Therefore, it is necessary to design protocols in which the main computational burden falls on servers and on powerful users (objects), so that weak network elements such as sensors carry less of the computational burden.
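
One possible shape of such a design is sketched below, under stated assumptions: the sensor stores a single key and a counter and computes one message authentication code per message, while the server takes on key derivation, verification, and replay bookkeeping. The classes `Sensor` and `Server`, the function `derive_device_key`, and the message layout are hypothetical, and HMAC-SHA256 is again only a stand-in primitive.

```python
import hashlib
import hmac
import struct


def derive_device_key(master_key: bytes, device_id: str) -> bytes:
    """Server side: recompute a device's key from the master key on demand."""
    return hmac.new(master_key, device_id.encode(), hashlib.sha256).digest()


class Sensor:
    """Constrained device: one stored key, one counter, one MAC per message."""

    def __init__(self, device_id: str, key: bytes):
        self.device_id = device_id
        self._key = key
        self._counter = 0

    def send(self, payload: bytes):
        self._counter += 1
        msg = struct.pack(">Q", self._counter) + payload
        tag = hmac.new(self._key, msg, hashlib.sha256).digest()
        return self.device_id, self._counter, payload, tag


class Server:
    """Powerful party: derives keys, verifies tags, and tracks counters."""

    def __init__(self, master_key: bytes):
        self._master = master_key
        self._last_counter = {}

    def receive(self, device_id: str, counter: int, payload: bytes, tag: bytes) -> bool:
        key = derive_device_key(self._master, device_id)
        msg = struct.pack(">Q", counter) + payload
        expected = hmac.new(key, msg, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return False                  # forged or modified message
        if counter <= self._last_counter.get(device_id, 0):
            return False                  # replayed message
        self._last_counter[device_id] = counter
        return True


master = b"\x01" * 32                     # illustrative value only
server = Server(master)
sensor = Sensor("sensor-1", derive_device_key(master, "sensor-1"))
first = sensor.send(b"t=21.5")
ok_first = server.receive(*first)         # accepted
ok_replay = server.receive(*first)        # rejected as a replay
```

Because the server derives each device key from one master key, it does not need a per-device key table, and all state beyond the counter stays on the server side.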

VI Conclusion

In this paper, we reviewed and analyzed some of the work on the IoT from the BD perspective and examined the security and privacy of users. We then considered several methods, such as the use of encryption functions and program protection. It is clear that encryption functions provide confidentiality. After that, we proposed methods that could provide a safer space for low-power users. The most important of our proposed methods is the smart design of communication protocols, so that the computing and processing load is borne by service providers (cloud servers). In this method, the light things (e.g. sensors) perform light operations, while on the other side of the protocol the server uses the same or another cryptographic function. Thus, the low-power things can run the protocol and be safe.

VII Future Works

Given the ever-expanding IoT in all areas, researchers are thinking about the future of the network. We know that the most important part is the aggregation of big data, which must then be processed and acted upon. We should also expect that the data exchanged over this network will exceed that of any other group in the future. Some scholars divide the next generation of big data into 1) online network data, 2) mobile and IoT data, 3) geographic information, 4) spatio-temporal data, and 5) streaming data, among others [22]. Methods must be considered for the collection, processing, and classification of these categories.
One of the most important aspects of today's world is industrialization. The industrialization of societies is on the rise, this trend will continue, and people will be replaced by machines. The vast IoT in industry cannot be ignored, and the IoT will play a significant role there. As industry expands, so does the spread of information and the amount of data sent by the agents within it. Sensors, drives, and other components are among the elements that constantly send and receive data [15]. Securing this volume of equipment and devices that send and receive information is very important. Failure to send or receive even one of these data items may result in a serious production problem. Therefore, maintaining the security and integrity of industrial data, now and in the future, will be very important.
It is anticipated that the number of IoT objects will reach a billion by 2030, so companies in this field are conducting research. According to HP and Intel, big data management and processing should take three points into account. First, there should be many powerful, high-capacity terminals for data gathering. Second, the data produced by IoT elements are often incomplete, and it should be possible to analyze them properly. Finally, the information gathered from IoT elements is only effective when analyzed [16].
The use of electronic devices is increasing day by day, and as a result, all communities try to make the environment more intelligent and to use energy efficiently. The increasing use of electronic devices (sensors, smart devices, and even cloud servers) will require more privacy protection. In Section V, we described ways to protect information and privacy using cryptographic functions. An important point is the computational load of these encryption functions. Designers should pay attention both to the privacy of users (information sent and received by objects or individuals) and to the optimal use of energy when designing IoT systems and service servers.

Conflict of Interests
The authors declare that they have no conflict of interests.


  • [1] Khan, Nawsher, et al. “Big data: survey, technologies, opportunities, and challenges.” The Scientific World Journal 2014 (2014).
  • [2] Ding, Xuejun, Yong Tian, and Yan Yu. “A real-time big data gathering algorithm based on indoor wireless sensor networks for risk analysis of industrial operations.” IEEE Transactions on Industrial Informatics 12.3 (2016): 1232-1242.
  • [3] Rahman, Abdul Fuad Abdul, Maslina Daud, and Madihah Zulfa Mohamad. “Securing sensor to cloud ecosystem using internet of things (IoT) security framework.” Proceedings of the International Conference on Internet of Things and Cloud Computing. ACM, 2016.
  • [4] Gani, Abdullah, et al. “A survey on indexing techniques for big data: taxonomy and performance evaluation.” Knowledge and Information Systems 46.2 (2016): 241-284.
  • [5] C. Perera, R. Ranjan and L. Wang, “Big Data Privacy in Internet of Things Era,” Internet of Things (Mag), pp. 32-39, 2015.
  • [6] Jin, Xiaolong, et al. “Significance and challenges of big data research.” Big Data Research 2.2 (2015): 59-64.
  • [7] Kang, Yong-Shin, et al. “MongoDB-based repository design for IoT-generated RFID/sensor big data.” IEEE Sensors Journal 16.2 (2016): 485-497.
  • [8] A. Jakobik, “Big Data Security,” Computer Communications and Networks, vol. 12, pp. 241-261, 2016.
  • [9] Ye, Haina, et al. “A survey of security and privacy in big data.” Communications and Information Technologies (ISCIT), 2016 16th International Symposium on. IEEE, 2016.
  • [10] Sivarajah, Uthayasankar, et al. “Critical analysis of Big Data challenges and analytical methods.” Journal of Business Research 70 (2017): 263-286.
  • [11] Hashem, Ibrahim Abaker Targio, et al. “The rise of ‘big data’ on cloud computing: Review and open research issues.” Information Systems 47 (2015): 98-115.
  • [12] Cai, Hongming, et al. “IoT-based big data storage systems in cloud computing: Perspectives and challenges.” IEEE Internet of Things Journal 4.1 (2017): 75-87.
  • [13] Chen, CL Philip, and Chun-Yang Zhang. “Data-intensive applications, challenges, techniques and technologies: A survey on Big Data.” Information Sciences 275 (2014): 314-347.
  • [14] Sun, Yunchuan, et al. “Internet of things and big data analytics for smart and connected communities.” IEEE Access 4 (2016): 766-773.
  • [15] Khan, Maqbool, et al. “Big data challenges and opportunities in the hype of Industry 4.0.” Communications (ICC), 2017 IEEE International Conference on. IEEE, 2017.
  • [16] Chen, Min, Shiwen Mao, and Yunhao Liu. “Big data: A survey.” Mobile Networks and Applications 19.2 (2014): 171-209.
  • [17] Cheng, Chi, et al. “Securing the Internet of Things in a Quantum World.” IEEE Communications Magazine 55.2 (2017): 116-120.
  • [18] Dubey, Arunima, and Satyajee Srivastava. “A Major Threat to Big Data: Data Security.” Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies. ACM, 2016.
  • [19] A. Adrian TOLE, “Big Data Challenges,” Database Systems Journal, vol. 4, pp. 31-41, 2013.
  • [20] Bertino, Elisa, and Elena Ferrari. “Big data security and privacy.” A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Springer International Publishing, 2018. 425-439.
  • [21] Jung, Jason J. “Computational Collective Intelligence with Big Data: Challenges and Opportunities.” (2017): 87-88.
  • [22] Song, Houbing, et al. “Next-generation big data analytics: State of the art, challenges, and future research topics.” IEEE Transactions on Industrial Informatics (2017).
  • [23] N. Ammu and M. Irfanuddin, “Big Data Challenges,” Special Issue of ICACSE 2013 - Held on 7-8 January, 2013 in Lords Institute of Engineering and Technology, Hyderabad, vol. 2, pp. 613-615, 2013.