DeepAI
Log In Sign Up

Effective Anomaly Detection in Smart Home by Integrating Event Time Intervals

Smart home IoT systems and devices are susceptible to attacks and malfunctions. As a result, users' concerns about their security and safety issues arise along with the prevalence of smart home deployments. In a smart home, various anomalies (such as fire or flooding) could happen, due to cyber attacks, device malfunctions, or human mistakes. These concerns motivate researchers to propose various anomaly detection approaches. Existing works on smart home anomaly detection focus on checking the sequence of IoT devices' events but leave out the temporal information of events. This limitation prevents them to detect anomalies that cause delay rather than missing/injecting events. To fill this gap, in this paper, we propose a novel anomaly detection method that takes the inter-event intervals into consideration. We propose an innovative metric to quantify the temporal similarity between two event sequences. We design a mechanism to learn the temporal patterns of event sequences of common daily activities. Delay-caused anomalies are detected by comparing the sequence with the learned patterns. We collect device events from a real-world testbed for training and testing. The experiment results show that our proposed method achieves accuracies of 93 88

READ FULL TEXT VIEW PDF

page 1

page 2

page 3

page 4

06/26/2019

Visual Anomaly Detection in Event Sequence Data

Anomaly detection is a common analytical task that aims to identify rare...
02/26/2020

Wavelet-based Temporal Forecasting Models of Human Activities for Anomaly Detection

This paper presents a novel approach for temporal modelling of long-term...
08/31/2020

Multi-Scale One-Class Recurrent Neural Networks for Discrete Event Sequence Anomaly Detection

Discrete event sequences are ubiquitous, such as an ordered event series...
09/29/2021

Smart-home anomaly detection using combination of in-home situation and user behavior

Internet-of-things (IoT) devices are vulnerable to malicious operations ...
07/31/2019

MSNM-S: An Applied Network Monitoring Tool for Anomaly Detection in Complex Networks and Systems

Technology evolves quickly. Low cost and ready-to-connect devices are de...
04/05/2021

Semi-supervised Variational Temporal Convolutional Network for IoT Communication Multi-anomaly Detection

The consumer Internet of Things (IoT) have developed in recent years. Ma...
06/25/2018

SAQL: A Stream-based Query System for Real-Time Abnormal System Behavior Detection

Recently, advanced cyber attacks, which consist of a sequence of steps t...

I Introduction

In recent years, smart home IoT systems become increasing popular in the consumer market. In US, it is reported that smart home IoT devices have entered 43% households in 2021 [12]. Conveniences brought by smart home platforms, like Alexa and Google Home, also incentive common users to automate their life with smart appliances and sensors.

However, security and safety concerns also raise along with the prevalence of smart home IoT devices [5]. Due to limitations on cost and power supply, smart home IoT devices are long known to be unreliable and vulnerable to cyber-attacks, which allows attackers to impose threats to users’ safety in the physical world [10]. Besides deliberate attacks, device malfunctions [7] happen even more frequently as some devices are working in harsh environment (e.g., moisture in bathrooms). These device attacks and faults, if not dealt timely, could cause severe damages. For example, with the help of automation rules [8], an electrical heater could be turned on by low temperature readings from temperature sensors. False temperature readings that are either caused by sensor malfunctions or attacker’s malicious modification could trigger the heater staying ‘on’ for long time, which significantly increase the risk of fire.

To cope with these security issues, there has been many research works [11, 6, 4, 14] that aim to automatically detect anomalous status of smart home systems. These works model the normal patterns of users’ daily activities in the form of invariant event sequences or causal association rules and report deviant patterns as anomalies. For instance, the the event sequence “presence detected”, “lock unlocked”, “front door opened”, “hallway motion active”, “front door closed”, “hallway motion inactive” should be observed when a user goes back to home. Anomaly alarms are raised when the order of this event sequence is violated (e.g., the door gets unlocked without a prior event of presence sensor).

However, existing works only focus on the order of events but does not take the events’ temporal information into consideration. In some anomalous cases, the sequences of events remains identical to the learned one but with certain events being delayed.

In the example case of user going back to home, If intruder get the presence sensor and enter the house, the event sequence could also be “presence detected”, “lock unlocked”, “front door opened”, “hallway motion active”, “front door closed”, “hallway motion inactive”, but the time-interval between each two event is different. For instance, while user closes the door immediately after enters the hallway, intruder may leave the door open until he leaves the house. The difference in time-interval is the key to identify anomalous case.

In this paper, we fill this gap by proposing a novel anomaly detection method that takes events’ temporal information into consideration. With the additional information, we are able to more accurately profile normal behaviors of smart home IoT systems. More specifically, we enhance the existing event patterns by adding intervals between each two events in the sequence. In addition, we design an innovative scoring mechanism to compare the temporal similarity between two event sequences. Intervals that are either too small or too large will result in low or high score and trigger anomaly alarms.

To evaluate the proposed method, we collect the event sequences of user’s daily activity in a real-world testbed over 10 days. We measure 3 activities that include “Come back home”, “Go to work”, “Use toilet”. The evaluation results show that the accuracy for three activity is 93%, 88%, 89%, respectively.

We make the following contributions:

  • We propose a new learning model that can learning time-interval from event sequence. With the help of this new learning model, we can better profile user behaviors.

  • We propose a scoring method for event sequence, which gives a score based on the similarity between a test event sequence and the ground truth. The score result is used to decide whether to trigger an anomaly alarm or not.

  • We evaluate the new learning model and scoring method with a real-world testbed and three activities. The result shows that the accuracy for each activity is 93%, 88%, 89%, respectively.

The rest of the paper is organized as follows. We introduce related work in Section II. In Section III, we describe background about smart home platform. We introduce motivation and threat model in Section IV. In Section V, we introduce the design of anomaly detection system. The evaluation is presented in Section VI. Conclusion is presented in Section VII.

Ii Related work

With the rapid development of IoT devices and smart home, the security issue in smart home draws much attention and there has been many papers focusing on the anomaly detection in smart home[11, 6, 4, 3]. Most of them are based on the order of events. For example, AEGIS observes the change in device behavior based on user activities and builds a contextual model to differentiate benign and malicious behavior[11].

Reference [4]

precomputes sensor correlation and the transition probability between sensor states and finds a violation of sensor correlation and transition to detect and identify the faults.

The main difference of these existing anomaly detector and our work is that our paper take time-interval into consideration and propose a novel scoring mechanism method for anomaly detection. To the best of our knowledge, time-interval between each two event isn’t used for anomaly detection in smart home prior to our work.

Iii Background

In this section, we introduce the background knowledge of smart home automation platform and events log.

Iii-a Smart Homes Automation Platform

Fig. 1: The architecture of smartthings platform

There are many smart home platforms now, like Smartthings, Amazon Alexa. These smart Home platforms provide so many types of sensor that can be used in smart homes. These sensors have different functions. In this paper, we use smartthings, which is a popular smart platform.

The architecture of smartthings is shown in Figure 1. Sensors are connected to each other and smartthings cloud via hub. As a result, sensors can upload its data to smartthings cloud for further analysis and user can control sensor via smarthings cloud.

Automation rule is an important part of smart home platform, which is set by user. When the status of sensor meet the prerequisite of automation rule, the status of other sensor would change. For example, the automation rule could be “when the door in bathroom open, the light in bathroom should be turned on/off”. As a result, when user opens the door to walk into bathroom at the first time, the light in bathroom is turned on. Then when user opens the door to walk out bathroom, the light in bathroom is turned off.

Iii-B Sensor Activity Log

Timestamp Device Value
10/1/2021 13:00:01 motionSensor active
10/1/2021 13:00:02 light on
10/1/2021 13:01:00 light off
…… …… ……
TABLE I: Example events collected from smart home IoT system.

User’s acitivity will active multiple sensor, which would be recorded in log. The log can be downloaded from smartthings app, which can be simplified as Table I

. When sensor is activated, the log will record the timestamp, device name and value of the device at this moment. The value could be numerical values, like “56.0”, or boolean value.

Iv Threat Model

In this paper, we consider smart home anomalies that are caused by the following reasons:

  • Faulty devices. Smart home devices may lost the connection to hub at any time and has no response to user’s behavior if they are broken up. When the device is offline, there would be no record about this device in the log. For example, when motion sensor is out of power, it will be offline and lose connection with the hub. So, smart home system can’t detect any motion and the automation rule based on motion sensor will be invalid.

  • Large Delays. When sensor is activated, it should report the event immediately to hub and this event would be added to log in smartthings cloud. But, due to interference from other devices, sensor may report the event to hub with a large delay. Large delays would cause disorder in log. Also, the automation rule based on the event will be invalid because the prerequisite of it isn’t met, which may lead to risky consequence. For example, the door may be left unlocked caused by loss of presence off event when user leaves home[6].

  • Intruder. We use the event sequence in log to profile user’s daily activity. When an intruder enter the building, since its behavior is different from user, the event sequence pattern changes. For example, when user go back to home, the behavior could be “open the door, close the door, enter hallway, enter kitchen”. But, when intruder enter the building, its behavior may be “open the door, close the door, enter hallway, enter bedroom ”. Since the behavior is different, event sequence will not be the same. Also, even intruder follows the pattern of user’s daily behavior, the time-interval between each two event may be not the same. For example, user often takes a nap in bedroom while intruder just takes a look at bedroom, the time interval between “open the bedroom door” and “close the bedroom door” would be significantly different.

  • Anomalous activities. Since user’s behavior is profiled based on previous collected event sequence. Once anomalous activity happens, the event sequence will be different and can be regarded as anomaly. For example, if user takes a shower, the event sequence is “open the door, active motion sensor, close the door, active motion sensor, open the door”. But, if user have a fall and get in shock, the event sequence is ”open the door, active motion sensor, close the door”, which is different from the previous one.

Due to the reasons mentioned above, the anomalous cases of event sequence is different from normal case of event sequence in the following aspects:

  • Order of Event. Compared with the event sequence in normal case, the order of event sequence in anomalous case is different .

  • Time-interval. Time-interval between two adjacent event become larger or smaller in anomalous case.

V System Design

To cope with the aforementioned anomalies, researchers proposed a series of anomaly detection methods [9, 1] that aim to provide early warnings on anomalous situations before causing damages. Our method improves the existing ones by merging time-intervals between events into the anomaly detection model. In this section, we first introduce our formal representations of event sequence patterns and then discuss how we deal with the additional temporal information using our innovative scoring mechanism.

V-a Sequence Pattern Representation

We use to represent device events. For a specific event of a device, we represent it in the form of . For example, the event of “bedroom motion sensor detect active motion” is represented as . Users’ common daily activities usually trigger multiple devices’ events consecutively, which forms certain sequence patterns. For example, if user use toilet, he would open the door, turn on light and close the door. As a result, contact sensor in the door, light in toilet, motion sensor in toilet are activated. Aside from the sequence of events, we also take the time-interval between each two adjacent events into consideration. In the example above, we also record the interval between “Open the door, active motion sensor, turn on light, close the door”. As a result, the event sequence pattern of a user’s activity can be represented in (1).

(1)

means the event of attribute on device turns to state . While, stands for the time interval between the th and the th events.

V-B Scoring Mechanism

With the repeat of user’s daily activities, there would be many frequent sequences in the log of event sequence. For example, if user use toilet, he would like to open the door, turn on the light and close the door. So after the repeat of this activity, a frequent event sequence “open the door, turn on light, close the door” in log is observed. After getting the frequent event sequence, we can extract all event sequence which is identical the frequent event sequence from the log. The we can get the average time-interval of each two event using the extracted event sequence. The frequent event sequence and the vector of average time-interval of each two event can be used to represent the ground truth for this frequent event sequence pattern, which is:

(2)

Where is the frequent event sequence, is the average time interval between event and event .

Compared with the ground truth of event sequence, test event sequence is missing some event or the time-interval between each two event is abnormal. Test event sequence can be represented as:

(3)

Where is the test event sequence, k is the number of event and .

We propose a scoring mechanism to quantify the similarity between test event sequence and ground truth, which can be represented as follows:

(4)

Where is the score of test event sequence, is the number of event in test event sequence, is the number of event in ground truth, is a coefficient, is the angle between vector and vector , is the time-interval vector of test event sequence, is the modified time-interval vector of ground truth. Since some event is missing in test event sequence, the event doesn’t exist in test event sequence is deleted from to make sure has the same length with .

Vi Performance Evaluation

Vi-a Experiment Setup

We evaluate our proposed anomaly detection method on a real-world testbed as shown in Figure 2. All used devices’ labels, attributes and installation locations are listed in Table II. The testbed consists of two bedroom, a dining room, and a bathroom and has one resident. In four rooms, we install 14 IoT devices of 4 types. All devices are connected to a SmartThings hub and managed using the SmartThings mobile app [13].

Fig. 2: Floor plans of the testbed and device deployment layout.
Label Device Name Attributes Deployment
M
SmartThings
Motion Sensor
motion on wall
C
SmartThings
Contact Sensor
contact, acceleration on doors
L
SmartThings
Light Bulb
switch as ceiling light, lamp
V
ThreeReality
Smart Switch
switch to control fan
TABLE II: IoT devices used in the testbed, their abbreviation labels, attributes and deployment information

Vi-B Data Collection and Augmentation

On this testbed, we evaluate our method on 3 activities that include: “Come back home”, “Go to work” and “Use toilet”. For each activity as listed in Table III, we deploy some automation rules according to the participant’s requirements and collect about 50 instances for each activity during 10 days experiment. All events are collected from SmartThings mobile app.

Activity Behavior devices
Automation rule
Come
back
home
User opens and
closes the front
door, enters
kitchen
M2,
C1,
L1,
M1,
L2
Use
toilet
User enters the
bathroom, closes
the door,
takes a pee,
opens the door
and walks out
M5,
C5,
L5,
V
Go
to
work
User opens the
door, walks
into study and
sits before
the desk
C2,
M4,
L4,
M3,
L3
TABLE III: Activities used to evaluate the proposed method

To improve the effect of the collected data, we augment the collected events of activity instances. More specifically, we use synthetic minority over-sampling technique (SMOTE) [2]

to expand the collected dataset. SMOTE is mainly used in over-sampling the minority (abnormal) class and under-sampling the majority (normal) class to achieve better classifier performance.

Because the difference between nearest neighbor and the selected data would be tiny, we randomly select two instances of normal data rather than select k-nearest neighbors. The detailed process can be represented as:

(5)

Where and is the vector of the collected data, is the vector of the created data.

Since anomalies rarely happen during the experiment period, we generate data of abnormal instances by simulating anomalies based on data of normal instances. We simulate anomaly cases in two methods: 1) deleting one event from a normal event sequence; (2) extend one time interval in an event sequence by 50 times. We call these two types of anomaly cases as “Anomaly(seq)” and “Anomaly(ti)”, respectively.

Fig. 3: The influence of alpha on score. Green points are scores of normal data instances, purple points are for anomaly(seq), red point are for anomaly(ti).

Vi-C Training Process

In the training process, we use collected normal and anomaly instances to find the optimal and score section. We use 40 instances of normal data, 10 instances of anomaly(seq) data, and 10 instances of anomaly(ti) for each activity.

Equation (4) shows that we can calculate the score of test event sequence by comparing it with a given reference sequence. The score is affected by the value of . To explore the impact of the value of , we calculate the score of all data instances in the testing dataset with different values in the range of . From Figure 3, we can select the that the cluster of green point is obviously divorce from the cluster of purple point and red point. For each selected , we can find the score section in which green point dominate over purple point and red point. So this score section can be selected as the threshold to distinguish normal data instances with anomaly instances.

By setting as , we can get the score threshold boundaries of , , and for normal activities of ”Come back home”, ”Go to work”, and ”Use toilet”, respectively.

Vi-D Testing Results

In the testing process, we apply the learned score range in the training process to 100 testing data instances for each activity to check the performance of anomaly detection. Each activity contains 20 instances of anomaly(seq) data, 20 instances of anomaly(ti), and 60 instances of normal data. None of the testing data is used in the training process.

We use True Positive (TP), False Negative (FN), False Positive (FP), True Negative (TN) and accuracy (represented as the eqution shown below) to evaluate the result on three activities. The testing result for each activity is shown in Table IV, Table V and Table VI.

(6)
Amount TP+TN FN+FP Accuracy
Anomaly(seq) 20 20 0 100%
Anomaly(ti) 20 16 4 80%
Normal 60 57 3 95%
Total 100 93 7 93%
TABLE IV: Testing results of activity “Come back home”

Table IV is the testing result of activity “Come back home”. The average accuracy for all category is 93%. Accuracy of anomaly(seq), anomaly(ti), normal is 100%, 80%, 95%, respectively.

Amount TP+TN FN+FP Accuracy
Anomaly(seq) 20 9 11 45%
Anomaly(ti) 20 20 0 100%
Normal 60 59 1 95%
Total 100 88 12 88%
TABLE V: Testing results of activity “Go to work”

Table V is the testing result of activity “Go to work”. The average accuracy for all category is 88%. Accuracy of anomaly(seq), anomaly(ti), normal is 45%, 100%, 95%, respectively.

Amount TP+TN FN+FP Accuracy
Anomaly(seq) 20 20 0 100%
Anomaly(ti) 20 9 11 45%
Normal 60 60 1 100%
Total 100 89 11 89%
TABLE VI: Testing results of activity “Use toilet”

Table VI is the testing result of activity “Use toilet”. The average accuracy for all category is 89%. Accuracy of anomaly(seq), anomaly(ti), normal is 100%, 45%, 100%, respectively.

Vii Conclusion

In this paper, we proposed a new learning model that takes time interval among events into consideration. We designed a new score metric to quantify the temporal similarity between a testing event sequence and a reference sequence of ground truth. We evaluated the performance of the proposed method with 10 days experiment in a real-world testbed. The results showed our method achieves the accuracies of 93%, 88%, and 89% for three testing daily activities.

References

  • [1] U. Bakar, H. Ghayvat, S. Hasanm, and S. C. Mukhopadhyay (2016) Activity and anomaly detection in smart home: a survey. Next Generation Sensors and Systems, pp. 191–220. Cited by: §V.
  • [2] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer (2002) SMOTE: synthetic minority over-sampling technique.

    Journal of artificial intelligence research

    16, pp. 321–357.
    Cited by: §VI-B.
  • [3] H. Chi, Q. Zeng, X. Du, and L. Luo (2021) PFIREWALL: semantics-aware customizable data flow control for smart home privacy protection. arXiv preprint arXiv:2101.10522. Cited by: §II.
  • [4] J. Choi, H. Jeoung, J. Kim, Y. Ko, W. Jung, H. Kim, and J. Kim (2018) Detecting and identifying faulty iot devices in smart home with context extraction. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 610–621. Cited by: §I, §II, §II.
  • [5] E. Fernandes, J. Jung, and A. Prakash (2016) Security analysis of emerging smart home applications. In 2016 IEEE symposium on security and privacy (SP), pp. 636–654. Cited by: §I.
  • [6] C. Fu, Q. Zeng, and X. Du (2021) Hawatcher: semantics-aware anomaly detection for appified smart homes. In 30th USENIX Security Symposium (USENIX Security 21), Cited by: §I, §II, 2nd item.
  • [7] T. W. Hnat, V. Srinivasan, J. Lu, T. I. Sookoor, R. Dawson, J. Stankovic, and K. Whitehouse (2011) The hitchhiker’s guide to successful residential sensing deployments. In Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems, pp. 232–245. Cited by: §I.
  • [8] Its-too-cold.groovy. External Links: Link Cited by: §I.
  • [9] V. Jakkula and D. J. Cook (2008) Anomaly detection using temporal data mining in a smart home environment. Methods of information in medicine 47 (01), pp. 70–75. Cited by: §V.
  • [10] N. Papernot, P. McDaniel, A. Sinha, and M. P. Wellman (2018)

    Sok: security and privacy in machine learning

    .
    In 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 399–414. Cited by: §I.
  • [11] A. K. Sikder, L. Babun, H. Aksu, and A. S. Uluagac (2019) Aegis: a context-aware security framework for smart home systems. In Proceedings of the 35th Annual Computer Security Applications Conference, pp. 28–41. Cited by: §I, §II.
  • [12] Smart home device household penetration in the united states in 2019 and 2021. External Links: Link Cited by: §I.
  • [13] Smartthings. External Links: Link Cited by: §VI-A.
  • [14] M. Yamauchi, Y. Ohsita, M. Murata, K. Ueda, and Y. Kato (2020) Anomaly detection in smart home operation from user behaviors and home conditions. IEEE Transactions on Consumer Electronics 66 (2), pp. 183–192. Cited by: §I.