In contrast to the extensive research on client-side smart home security, the security of the cloud servers that control these devices has received less attention. This gap is concerning: recent work shows that an attacker who controls enough high-wattage IoT devices can cause power grid failures [15]. Mounting attacks at scale by compromising individual smart home devices is not straightforward because of the sheer number of devices that must be hacked. Compromising cloud servers can therefore be a more practical approach. In fact, 2018 alone saw a large number of cloud server compromises, including servers operated by well-known companies (e.g., Facebook [9], Marriott [2], and Sony [23]).
In addition to large-scale attacks on physical infrastructures, there are other concerns with trusting the cloud: (1) Privacy: Smart home devices collect a great deal of information about users . Private data can be leaked if a cloud server is compromised. (2) Traffic Analysis: It may be possible to learn information by observing traffic patterns . For example, such an analysis could reveal that when the thermostat’s mode transitions (e.g., from Home to Away), it always sends a packet of a specific length . By watching traffic, attackers could discover whether people are home.
Our Approach: Fidelius
Motivated by these security and privacy concerns, this paper presents Fidelius. Fidelius provides consistency and security to smart home devices even in the presence of compromised servers, in realistic smart home environments with intermittent network access. Figure 1 presents an overview of the Fidelius system. A Fidelius deployment consists of (1) an untrusted cloud-based server that provides connectivity between clients, (2) any number of clients that are smart home devices, and (3) any number of clients that are smartphone apps. Our server implementation is architected as a FastCGI server that communicates with the Apache Web Server. We focus our work on providing a secure key-value storage system targeted at IoT applications. (Footnote 1: The Nest thermostat API communicates data in JSON as a tree of key-value pairs that can be easily stored using a key-value abstraction. Similarly, Apple HomeKit associates a set of properties and corresponding values with each device. While existing systems use ad hoc protocols for communicating data, the standard key-value abstraction is powerful enough to subsume a wide range of ad hoc protocols.) We do not impose special requirements on server hardware; instead, the Fidelius enforcement runs on each client device. Clients communicate with the server; in the absence of Internet connectivity, they can also communicate with each other locally to maintain functionality. (Footnote 2: This is already supported in many smart home systems today, such as LiFX and Hue light bulbs, WeMO outlets, and TP-Link outlets.) This design is well matched to the smart home environment, where devices are mutually connected in a local home network.
This paper makes the following contributions:
Secure Transactional Key-Value Store: It presents a key-value store that provides strong security and privacy guarantees even if the server is malicious.
Local Control: It presents an algorithm supporting local control of smart home devices when connectivity is lost.
Transactional Programming Model: It presents developers with a transactional model to abstract consistency and availability tradeoffs that arise from network partitions.
Evaluation: Compared to Particle.io, Fidelius reduces data communication time by more than 50% and increases battery lifetime by 2×. Compared to PyORAM, Fidelius has 4-7× faster access times and transfers 25-43× less data.
2 Threat Model
Fidelius focuses on preventing attackers from using compromised servers to mount large-scale attacks on smart home devices. The problem of client security for smart home devices is thus outside the scope of this paper. (Footnote 3: Fidelius can be easily used together with any existing client-side security enforcement technique to provide a comprehensive solution to the problem of smart home IoT security.) We assume that the adversary has leveraged other vulnerabilities to obtain complete control over the cloud server that the devices use to communicate. We assume that the adversary has knowledge of Fidelius but does not know the secret key, to which only the clients have access.
Fidelius provides the following guarantees, which are formalized in the technical report [20]:
Fidelius provides oblivious privacy. Fidelius does not leak any information from data access patterns. However, Fidelius may leak information from timing.
Fidelius guarantees the integrity of the message passing layer. It ensures that any violation of integrity is detected by some client and that, even in the case of a detected integrity violation, the consequences for a client are limited to the same set of behaviors that are allowed by the non-determinism of the network, i.e., behaviors that could have arisen with an honest server receiving messages from clients in a different order.
A malicious server can implement a denial-of-service (DoS) attack by not forwarding messages. A malicious server can also partition clients into disjoint groups that it does not permit to communicate. This attack is impossible for a client to distinguish from a network failure that prevents the server from communicating with a subset of clients, and thus Fidelius provides fork consistency [10] by default.
3 The Fidelius System
A key challenge in the design of Fidelius is handling the wide range of possible malicious server behaviors while also supporting local control during Internet outages and the intermittent availability of power-constrained devices. To separate the concerns of handling malicious behaviors and implementing the key-value store functionality, we architected Fidelius as two layers: (1) a message passing layer that guarantees consistent message delivery in the presence of malicious servers; this layer handles all issues related to malicious behaviors and does not implement any key-value store functionality; and (2) a transactional key-value store built on top of the message passing layer. The app code runs on top of these two layers in each client, and app developers can use the Fidelius API library in their code. On smart home devices, this code is the device controller code. On smartphones, it is the actual app code.
3.1 Message Passing Layer
Conceptually, the functionality of the message passing algorithm is simple: the server algorithm maintains a queue of all messages that are currently in use; the client algorithm supports (1) sending messages to the server queue and (2) requesting a copy of the new messages in the current queue. In addition to forwarding messages between clients, the server stores enough state so that new or long-absent clients can completely reconstruct the key-value store using the state in the server’s queue.
The Fidelius message passing layer uses a message chain to totally order communications between clients that are sent via the cloud server. Figure 2 presents the structure of the message chain. Each message in the chain has a fixed size and contains (1) a globally unique sequence number, which is used to reference and order messages; (2) a keyed-hash message authentication code (HMAC) of the previous message in the chain, which is used to ensure the integrity of the chain; and (3) a set of records, which can contain either data from the key-value store layer or control information that is needed to thwart attacks. We insert as much data as possible until the entire message reaches a fixed size specified through the Fidelius API. Finally, each message contains an HMAC of itself, which ensures that the message cannot be modified. The HMACs are generated using a key known only to the clients to ensure that the server cannot forge messages. When a message is transported between clients, it is always encrypted using the transport key. Together, these measures ensure that the server can neither learn the contents of messages nor modify them. (Footnote 4: Although the entire process might seem heavy for IoT devices that have limited resources, our evaluation of energy consumption suggests that Fidelius can still perform efficiently; see Section 4.)
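As a rough illustration of the message layout described above, the following Python sketch builds and checks fixed-size chained messages. It is a minimal sketch under stated assumptions, not the Fidelius implementation: the field widths, the use of HMAC-SHA256, the zero-byte padding, linking to the previous message through its HMAC tag, and the names `MESSAGE_SIZE`, `MAC_KEY`, `build_message`, and `verify_message` are all hypothetical, and encryption with the transport key is omitted.

```python
import hashlib
import hmac
import os

MESSAGE_SIZE = 256        # assumed size of the padded record section, in bytes
MAC_KEY = os.urandom(32)  # hypothetical client-only secret; the server never sees it
TAG_LEN = 32              # length of an HMAC-SHA256 tag


def mac(data: bytes) -> bytes:
    """Keyed hash over data using the shared client key."""
    return hmac.new(MAC_KEY, data, hashlib.sha256).digest()


def build_message(seq: int, prev_tag: bytes, records: bytes) -> bytes:
    """Layout: sequence number | tag of previous message | padded records | own tag."""
    if len(records) > MESSAGE_SIZE:
        raise ValueError("records do not fit in one fixed-size message")
    # Zero padding is a simplification; a real format would also encode the record length.
    body = seq.to_bytes(8, "big") + prev_tag + records.ljust(MESSAGE_SIZE, b"\x00")
    return body + mac(body)


def verify_message(msg: bytes, expected_seq: int, expected_prev_tag: bytes) -> bool:
    """Check the message's own tag, its sequence number, and its link to the chain."""
    body, tag = msg[:-TAG_LEN], msg[-TAG_LEN:]
    if not hmac.compare_digest(mac(body), tag):
        return False
    seq = int.from_bytes(body[:8], "big")
    prev_tag = body[8:8 + TAG_LEN]
    return seq == expected_seq and hmac.compare_digest(prev_tag, expected_prev_tag)
```

Because every message has the same length regardless of how many records it carries, an observer of the (encrypted) traffic cannot distinguish a small update from a large one by size alone.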
The server maintains a queue of the most recent messages it has received. Each message consists of a fixed-size encrypted block, for which the server does not have the decryption key, and a plaintext sequence number. The server algorithm supports two requests: (1) putmsg: add a new message (Footnote 5: The server could first authenticate participating clients to avoid amplification attacks. However, this is out of the scope of this paper.) and (2) getmsg: request messages that are currently in the queue. Before adding a new message, the server checks that its sequence number is one greater than the previous sequence number. If not, it sends a rejection message back to the client that contains all messages in the queue with a sequence number greater than or equal to that of the rejected message. Note that the clients must verify that the server correctly assigns these sequence numbers, as we do not trust the server. If a client detects that the key-value store state does not fit in the current queue, it can request that the server change the size of the queue. If a partial message is received, e.g., due to a network failure, the server ignores it. If the queue is full when a new message arrives, the server drops the oldest message. The client algorithm is responsible for ensuring that the information in the oldest message has been moved to a newer message before that message is dropped. A request for messages in the queue contains a sequence number that is one larger than that of the most recent message the client has received. When the server receives this request, it sends all messages in its queue with sequence numbers equal to or larger than the requested one, so that the client has an up-to-date copy of the message chain.
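The server-side behavior described above can be sketched in a few lines of Python. This is an illustrative model only: the class name `ServerQueue`, the tuple representation of a message, and the choice to accept any sequence number when the queue is empty are assumptions made for the example, and queue resizing and partial-message handling are left out.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[int, bytes]  # (plaintext sequence number, opaque encrypted block)


class ServerQueue:
    """Bounded queue of the most recent messages; the server cannot read the blocks."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently evicts the oldest entry when a new one is appended.
        self.queue: Deque[Message] = deque(maxlen=capacity)

    def putmsg(self, seq: int, block: bytes) -> Tuple[bool, List[Message]]:
        """Accept the message only if seq is one greater than the newest stored seq."""
        last_seq = self.queue[-1][0] if self.queue else seq - 1  # empty queue: accept
        if seq != last_seq + 1:
            # Rejection: return every stored message with a sequence number >= seq.
            return False, [m for m in self.queue if m[0] >= seq]
        self.queue.append((seq, block))
        return True, []

    def getmsg(self, seq: int) -> List[Message]:
        """Return all stored messages with a sequence number >= seq."""
        return [m for m in self.queue if m[0] >= seq]
```

For example, with a capacity of 3, storing messages 1 through 4 leaves messages 2, 3, and 4 in the queue, and a second attempt to store sequence number 2 is rejected with the newer messages attached.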
Clients communicate by sending and receiving messages through the server. Our implementation uses the CTR mode of AES to encrypt messages. A unique counter value is sent to the server along with the message. Clients share two secrets that allow them to authenticate and encrypt messages: a message authentication key and a message encryption key. These are not known to the server.
When the message passing layer receives new messages from the server (or they are created locally), the client decrypts and validates the messages. Records in messages at the message passing layer are used both to communicate information for the key-value store layer and to thwart attacks that a malicious server may otherwise perform. The proof in the technical report [20] shows that these checks suffice to thwart all server attacks:
(1) Message Chain Integrity: To ensure the integrity of messages, after decrypting a message the client compares the computed HMAC against the stored HMAC, the sequence number from the server against the sequence number in the message, and the HMAC of the previous message stored in the current message against the HMAC of the message immediately preceding it, if present, in the queue. These checks ensure that the server has not manipulated the message chain.
(2) Detecting Dropped Messages: A malicious server could potentially acknowledge a message from a client and then silently drop the message while the client is offline, replacing it with a message with the same sequence number from another client. To ensure that clients always detect such dropped messages, the client algorithm uses last message records to track the last message sent by itself and by other clients. A last message record consists of a machine ID and the sequence number of the last message sent by that machine. Last message records are only inserted if the most recent message from a given machine ID is about to be evicted from the queue.
(3) Detecting Reused Rejected Messages: The client algorithm uses rejected message records to detect whether the server attempts to use a rejected message. A rejected message record consists of the machine ID of the client that sent the rejected messages, the lowest and highest sequence numbers of the range of rejected messages, and the sequence number of the first message that contained the rejected message record. To prevent this attack, each client verifies that every message in the queue whose sequence number falls within the range given in a rejected message record has a machine ID that is not equal to the machine ID in that record. The algorithm keeps a rejected message record live until all clients have seen it (as implicitly acknowledged by sending a newer message). (Footnote 6: Since Fidelius does not keep the entire message chain, the server could temporarily fork the message chain and then move clients from one fork to another if the client is offline for a sufficiently long time such that all the messages the client has seen have been evicted from the message chain. In the other fork, the server could send dropped or rejected messages.)
(4) Detecting Missing Messages: A malicious server could also attempt to send fewer messages than the queue currently holds by omitting the oldest messages (note that the message chain integrity checks prevent dropping messages in the middle). This would cause clients to compute the key-value store state using an incomplete set of messages. To prevent this, Fidelius uses a queue size record to store the current maximum capacity of the message queue, which enables clients to compute how many messages they should have received and to validate that the correct number of messages was received.
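Two of the client-side checks above, the rejected message check and the missing message check, reduce to simple predicates. The Python sketch below is a hedged illustration: the `RejectedRecord` type, the `(sequence number, machine ID)` representation of the queue, and in particular the formula in `expected_message_count` are assumptions made for this example rather than the definitions used in the technical report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RejectedRecord:
    machine_id: int  # client whose messages were rejected
    low_seq: int     # lowest rejected sequence number
    high_seq: int    # highest rejected sequence number


def violates_rejected_record(queue: List[Tuple[int, int]], rec: RejectedRecord) -> bool:
    """queue holds (sequence number, sender machine ID) pairs.

    A violation is a message inside the rejected range that was sent by the
    machine whose messages were rejected, i.e. the server reused a rejected message.
    """
    return any(
        rec.low_seq <= seq <= rec.high_seq and sender == rec.machine_id
        for seq, sender in queue
    )


def expected_message_count(newest_seq: int, requested_seq: int, queue_size: int) -> int:
    """Assumed rule: the client should receive everything from requested_seq up to
    newest_seq, capped at the queue capacity announced in the queue size record."""
    return min(queue_size, newest_seq - requested_seq + 1)
```

A client that receives fewer messages than `expected_message_count` predicts would treat the response as evidence that the server withheld the oldest messages.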
To send a message, the message passing layer performs the following bookkeeping tasks:
(1) Determine Which Records Must be Refreshed: If a record is live (the system is still using the information it contains), then it must be refreshed by reinserting the record into the queue before the message with that record leaves the queue. A record is dead iff one of the following conditions is true: (i) a refreshed version of the record is present in the queue, (Footnote 7: A resize check is performed first to check whether the number of messages with live records in the queue exceeds a randomized resize threshold. If so, a new queue size is calculated and a new queue size record with the new size is inserted into the message.) (ii) the record is a queue size record and there is a newer queue size record, (iii) the record is a rejected message record and all clients have seen it, as proven by each having inserted a message into the queue with a sequence number that is greater than the sequence number of the first message that contained the record, or (iv) the record is a last message record and a client with the same machine ID has inserted a newer message into the queue.
(2) Construct the Record Section of the Message: Constructing the record section of a message involves (i) checking whether the queue needs resizing, (ii) generating and inserting a rejected message record if needed, (iii) refreshing older live records before they are evicted, (iv) inserting the record from the key-value store layer, and (v) filling unused space in the record section with old live records in need of a refresh.
(3) Send the Message: The client next sends the message to the server. If the server rejects the message because it has an old sequence number, then the server returns the newer messages to prove that the rejected message was in fact old. The client then processes all of the new messages that were returned by the server and generates a rejected message record to communicate the fact that it sent a message that was rejected. If the server accepts the message, then the client’s local state is updated using the contents of the newly created message. If sending the message fails due to network issues, then the message passing layer has to defer its determination of whether other clients received the message, and the transactional key-value store layer is informed that a network failure occurred.
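The four conditions under which a record is dead can be written as a single predicate. The following Python sketch is only an illustration of that rule: the `Record` fields and the `latest_seq_by_client` map (the newest sequence number each client has inserted into the queue) are hypothetical names introduced here, and records that belong to the key-value store layer are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    kind: str                         # "queue_size", "rejected", or "last_message"
    refreshed: bool = False           # (i) a refreshed copy is already in the queue
    superseded: bool = False          # (ii) a newer queue size record exists
    first_seen_seq: int = 0           # rejected: seq of the first message carrying it
    machine_id: Optional[int] = None  # last_message: machine the record describes
    last_seq: int = 0                 # last_message: seq recorded for that machine


def is_dead(rec: Record, latest_seq_by_client: Dict[int, int]) -> bool:
    """Return True if the record no longer needs to be refreshed."""
    if rec.refreshed:
        return True
    if rec.kind == "queue_size":
        return rec.superseded
    if rec.kind == "rejected":
        # (iii) every known client has sent a message newer than the record's first carrier.
        return bool(latest_seq_by_client) and all(
            seq > rec.first_seen_seq for seq in latest_seq_by_client.values()
        )
    if rec.kind == "last_message":
        # (iv) the same machine has since inserted a newer message.
        return latest_seq_by_client.get(rec.machine_id, 0) > rec.last_seq
    return False
```

Any record for which `is_dead` returns False is live and must be copied into a newer message before the message that carries it is evicted.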
A reader may note that the pattern of putmsg and getmsg requests can leak access pattern information. This issue can be addressed in several different ways. We modified the update procedure to always follow a getmsg call with a putmsg call (simply sending a message that refreshes entries if there is no request to send). Nevertheless, Fidelius does not prevent leaks via timing channels. For some settings, it is feasible to configure Fidelius to not leak information by sending data on a fixed schedule. (Footnote 8: We have observed that many smart home devices frequently send messages to the cloud even when idle: the Nest thermostat sends a packet every 5 seconds, the LiFX light bulb every 7 seconds, and the D-Link smart plug every 1.5 seconds.)
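The effect of always pairing getmsg with putmsg can be seen in a toy model. In the Python sketch below, `RecordingServer` and `update` are hypothetical stand-ins that only log which request types the server observes; the real update procedure also validates messages and builds fixed-size payloads.

```python
from typing import List, Optional


class RecordingServer:
    """Stand-in for the cloud server that only logs the request types it sees."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def getmsg(self) -> None:
        self.log.append("getmsg")

    def putmsg(self, payload: str) -> None:
        self.log.append("putmsg")


def update(server: RecordingServer, pending: Optional[str]) -> None:
    """Always issue getmsg followed by putmsg, whether or not there is data to send."""
    server.getmsg()
    # With nothing to send, a message that only refreshes live records is sent instead.
    server.putmsg(pending if pending is not None else "refresh-live-records")
```

Calling `update` once with a pending write and once without produces the same request sequence at the server, so the request pattern alone does not reveal whether the client read or wrote.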
Strengthening Consistency Guarantees
Although we believe that fork consistency is sufficient for smart home systems, Fidelius supports extensions to counter forking attacks and achieve strong consistency [20].
3.2 Transactional Key-Value Store
Fidelius’s transactional key-value store is built on top of the message passing layer. This layer (1) arbitrates/commits transactions, and (2) updates/reads key-value pairs.
Intermittent home Internet connectivity complicates committing transactions. There is the potential for concurrent local updates and remote updates to conflict. In the case of an Internet outage, the cloud server would not even be aware of local updates. For example, a remote smartphone attempts to change the thermostat mode from HEAT to OFF with a transaction. At the same time, a local smartphone tries to change the mode from HEAT to COOL with a transaction. These two transactions conflict and only one transaction can commit and thus we need an arbitrator to decide which one commits. To allow local control during an Internet outage, the arbitrator of this transaction must be local to the thermostat and the obvious choice is the thermostat itself.
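The thermostat example can be made concrete with a small arbitrator model. The Python sketch below assumes, purely for illustration, that a transaction carries the value it expects to overwrite and commits only if that value is still current; the `Arbitrator` class and `try_commit` method are hypothetical names, and the paper's actual commit protocol may differ.

```python
from typing import Dict


class Arbitrator:
    """Device-local arbitrator that serializes conflicting transactions on its keys."""

    def __init__(self, state: Dict[str, str]) -> None:
        self.state = dict(state)

    def try_commit(self, key: str, expected: str, new: str) -> bool:
        """Commit only if the value the transaction read is still the current value."""
        if self.state.get(key) != expected:
            return False  # another transaction committed first, so this one aborts
        self.state[key] = new
        return True


# The local phone's HEAT -> COOL transaction reaches the thermostat first and commits;
# the remote phone's HEAT -> OFF transaction then finds the mode changed and aborts.
thermostat = Arbitrator({"mode": "HEAT"})
local_ok = thermostat.try_commit("mode", "HEAT", "COOL")
remote_ok = thermostat.try_commit("mode", "HEAT", "OFF")
```

Because the arbitrator runs on the thermostat itself, the local transaction can be decided even when the home has no Internet connectivity.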
Updating/Reading Key-Value Pairs
Fidelius updates key-value pairs through a Put function. These updates are stored locally until the transaction is committed. Fidelius supports a relaxed transactional model for reads. In many cases it is not important that a transaction reads the absolutely latest value from a sensor.
We developed both C++ and Java implementations of the Fidelius client and a C++ implementation of the Fidelius server. (Footnote 9: Most existing smart home systems are closed source, and it is not clear what guarantees they provide when there are conflicts. Thus, we cannot implement our approach to securing smart home devices against malicious cloud servers directly on these devices. Instead, we implemented a complete system (with an API library) that uses transactions to support local communication with clear consistency properties.) The server for all experiments was a 3.5GHz Intel Xeon E3-1246 v3 with 32GB of RAM. We have evaluated Fidelius by (1) developing a test bed system, (2) comparing with the commercial Particle cloud in terms of energy usage and privacy, and (3) comparing with a Path ORAM implementation [19].
Our test bed system simulates a medium-scale smart home deployment with 16 devices. We have two classes of smart home devices in our test bed: (1) 15 low-power nodes that use hardware similar to smart home devices that run on batteries for many months (8 Particle Photons with temperature and humidity sensors, 4 Particle Photons with magnetic door sensors, and 3 Particle Photons with IR-based motion sensors) and (2) one mid-range node that uses hardware more typical of smart home devices that run on wall power (a Raspberry Pi 1 that controls 2 LiFX smart light bulbs). (Footnote 10: The Particle Photon is a low-end IoT hardware development kit with a 120 MHz ARM Cortex M3 processor, 1MB of flash, and 128KB of RAM that supports 802.11b/g/n WiFi. These specifications are similar to the hardware that appears in commercial devices.)
We quantified the CPU overhead of Fidelius for typical smart home device activity (controlling LiFX light bulbs) on a smart home class CPU (a Raspberry Pi 1) to verify that Fidelius incurs acceptable overheads. For this experiment, the LiFX arbitrator had a 2.3% CPU utilization and the Android app had below 1% CPU utilization.
To validate that Fidelius detects attacks, we used a malicious server that implements three different attacks. In the first attack, the server assigns the same sequence number to multiple messages. In the second attack, the server inserts a rejected message into the queue. In the third attack, the server swaps the sequence numbers of accepted messages. The clients detected all three attacks.
Particle Cloud—Energy and Privacy
We evaluated Fidelius against the Particle cloud framework because it has an easily available SDK that has been used to develop many commercial IoT products. We configured a Particle Photon to read a humidity/temperature sensor and powered it with 3 AA batteries. When using the Particle cloud, the device takes at least 5.55 seconds to wake up, connect, publish a value, and go back to sleep. The total energy consumed during this 5.55-second wake-up period is 0.79mWh. Assuming that the device wakes up every hour to take one measurement and then goes back to sleep, the total energy consumed is 1.17mWh each hour for deep sleep plus reporting a measurement. Three Energizer AA Lithium batteries provide approximately 13.5Wh and could thus support measurements for 1 year and 4 months. When using the Fidelius server, the same measurement cycle takes only 2.46 seconds. Hence, the hourly energy consumption is 0.73mWh, and the batteries could support measurements for up to 2 years and 2 months.
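The battery-life figures above can be approximately reproduced with a short calculation. The Python sketch below uses only the numbers reported in this section, plus two assumptions that the text does not state: the device draws the same average power while awake under both systems, and the slightly longer sleep period under Fidelius is negligible. The variable names are illustrative.

```python
BATTERY_MWH = 13_500.0       # three AA lithium cells, approximately 13.5 Wh
PARTICLE_ACTIVE_S = 5.55     # wake, connect, publish, sleep (Particle cloud)
PARTICLE_ACTIVE_MWH = 0.79   # energy for that 5.55 s active period
PARTICLE_HOURLY_MWH = 1.17   # deep sleep plus one report per hour
FIDELIUS_ACTIVE_S = 2.46     # same measurement cycle with the Fidelius server

# Energy spent in deep sleep each hour (assumed identical for both systems).
sleep_mwh = PARTICLE_HOURLY_MWH - PARTICLE_ACTIVE_MWH

# Assumption: active energy scales linearly with the time spent awake.
fidelius_active_mwh = PARTICLE_ACTIVE_MWH * FIDELIUS_ACTIVE_S / PARTICLE_ACTIVE_S
fidelius_hourly_mwh = sleep_mwh + fidelius_active_mwh


def lifetime_days(hourly_mwh: float) -> float:
    """Days of operation on one set of batteries at the given hourly energy use."""
    return BATTERY_MWH / hourly_mwh / 24.0


particle_days = lifetime_days(PARTICLE_HOURLY_MWH)
fidelius_days = lifetime_days(fidelius_hourly_mwh)
print(f"Particle: {particle_days:.0f} days, Fidelius: {fidelius_days:.0f} days")
```

Under these assumptions the Fidelius hourly energy comes out at about 0.73mWh, matching the reported value, and the lifetimes are roughly 481 days for the Particle cloud and roughly 770 days for Fidelius.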
The network trace shows that the Particle cloud leaks information about the argument lengths. The publish() method, which allows a device to publish results on the Particle cloud in a key-value fashion, sends different packet lengths for different argument lengths, e.g., a 16-character argument results in a 34-byte payload, a 32-character argument results in a 50-byte payload, etc. The function() method, which allows devices to publish C code functions on the Particle cloud, behaves in a similar way. For Fidelius, the network trace only shows a getmsg is always followed by a putmsg request with uniformly-sized data blocks. We randomized the wake-up time of our sensors that need to give periodic updates—i.e., temperature/humidity sensor. Since every Fidelius request had the exact same traffic pattern, we were not able to determine the type of Fidelius request from the traffic.
We also compared Fidelius against PyORAM, a Python implementation of the Path ORAM algorithm [19]. (Footnote 11: Path ORAM is one of the few cloud-based oblivious storage system implementations that are publicly available [8]. Path ORAM assumes the weaker honest-but-curious adversarial model.) We ran PyORAM’s Path ORAM with the Raspberry Pi as a client. The results show that the time per access for PyORAM increases as the tree height and number of blocks grow, while for Fidelius the time per access remains steady. For 255 blocks, Fidelius was already 7× faster than PyORAM; this difference is caused by the bandwidth-limited nature of the PyORAM benchmark. As the tree height and number of blocks grow, the total amount of data transferred by PyORAM also increases faster than for Fidelius: for 511 blocks, PyORAM transfers more than 43× the amount of data that Fidelius transfers.
5 Related Work
Oblivious RAM techniques, first proposed by Goldreich and Ostrovsky [7], seek to provide RAM [21, 13, 19, 16, 11] or cloud storage [17, 18, 6] in which the data access patterns are independent of the computation, so that no information can be extracted from the locations accessed. Many oblivious storage algorithms [7, 19] assume that only one device can access the storage and that the adversary is merely honest but curious. While some work does support multiple clients [12] or stronger adversarial models [14], these approaches assume that the clients have limited local storage and thus incur bandwidth and/or latency overheads for each access.
Much previous work on oblivious storage is unsuitable for our scenario because (1) it assumes an honest-but-curious adversary, and (2) it does not support multiple clients accessing shared state. Some recent work has removed the single-client limitation [12] and supports stronger adversarial models [14]. However, most oblivious RAM techniques incur extra overheads (e.g., latency or bandwidth) for accesses to storage, a fundamental limitation that follows from the assumption that clients have only a small amount of storage. Moreover, existing systems are not designed to support updates to persistent state when connectivity has been lost and thus cannot support local device control.
We presented Fidelius, a new secure key-value store, that is designed to meet the requirements of smart home systems. We prove that Fidelius is secure, even in the presence of a malicious cloud server. Our experiments show that Fidelius has good performance, low power consumption, is resilient to attacks, and efficiently runs on smart home class hardware.
Fidelius is a privacy-preserving infrastructure with a specific context (i.e., key-value store). We are seeking feedback about the viability of Fidelius in the real-world setting with respect to the current trends for cloud architecture. We are also looking into future directions to extend Fidelius for more complex applications beyond key-value store model (e.g., cameras and video streaming devices).
First, Fidelius distributes a key among devices. However, this could be a poor choice because the entire system is compromised if a single client is compromised. Second, Fidelius relies on encryption for the communication between IoT devices, which might be too heavy for IoT devices that typically have limited resources (e.g., processing power and memory). While many devices implement their communication protocols on top of secure channels, e.g., TLS, there are still devices that communicate in clear text for simplicity. Third, Fidelius protects the privacy of clients by concealing the data from the cloud server. However, the current trend is that cloud servers typically access clients’ data for analytics purposes.
We hope that this paper could generate discussions with the following questions (with respect to the controversial points and the feedback we look for): (1) How viable is Fidelius to provide privacy and security for IoT devices with respect to current trends and practices? What could be the better scheme to distribute keys if we use encryption (with the assumption that the cloud server could be malicious)? Could we really trust the clients? (2) How could we develop Fidelius better in the direction that follows the current trend for cloud analytics? Fidelius may not be suitable for all IoT applications. However, it can be effective for certain scenarios, in which the user may not want the server to comprehend the data, e.g., when a company uses a public cloud service to run its applications, it needs to protect its users’ sensitive information. We realize that when using Fidelius, data analytics can still be performed using another machine (as a Fidelius client).
Fidelius provides security through encryption and same-length packets. However, there is still a privacy leak through timing channels. Although an obvious solution is traffic injection [3, 4], it could consume a lot of bandwidth and make communication inefficient. We therefore wonder whether there is a more efficient way to secure timing channels for IoT devices.
Circumstances for Failure
IoT devices typically have limited processing power and storage, especially the devices that are most likely to use the key-value store model. Thus, if too many devices participate as Fidelius clients, the message chain can become too long to fit into a device’s local storage. As a result, the entire Fidelius system could fail.
- [1] (2017) A smart home is no castle: privacy vulnerabilities of encrypted IoT traffic. CoRR abs/1705.06805.
- [2] (2018-12) Revealed: Marriott’s 500 million hack came after a string of security breaches. https://www.forbes.com/sites/thomasbrewster/2018/12/03/revealed-marriotts-500-million-hack-came-after-a-string-of-security-breaches/
- [3] (2014) CS-BuFLO: a congestion sensitive website fingerprinting defense. In Proceedings of the 13th Workshop on Privacy in the Electronic Society, pp. 121–130.
- [4] (2014) A systematic approach to developing and evaluating website fingerprinting defenses. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 227–238.
- [5] (2016) Is anybody home? Inferring activity from smart home network traffic. In Security and Privacy Workshops (SPW), 2016 IEEE, pp. 245–251.
- [6] (2015) Oblivious network RAM. IACR Cryptology ePrint Archive 2015, pp. 73.
- [7] (1996-05) Software protection and simulation on oblivious RAMs. Journal of the ACM 43 (3), pp. 431–473.
- [8] (2017-02) PyORAM. https://github.com/ghackebeil/PyORAM
- [9] (2018-09) Facebook security breach exposes accounts of 50 million users. https://www.nytimes.com/2018/09/28/technology/facebook-hack-data-breach.html
- [10] (2004) Secure untrusted data repository (SUNDR). In Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation.
- [11] (2015) GhostRider: a hardware-software system for memory trace oblivious computation. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems.
- [12] (2015) Multi-client oblivious RAM secure against malicious servers. Cryptology ePrint Archive, Report 2015/121. http://eprint.iacr.org/2015/121
- [13] (2015) Constants count: practical improvements to oblivious RAM. In 24th USENIX Security Symposium, USENIX Security 15, Washington, D.C., USA, August 12-14, 2015, pp. 415–430.
- [14] (2013) Integrity verification for path oblivious-RAM. In IEEE High Performance Extreme Computing.
- [15] (2018) BlackIoT: IoT botnet of high wattage devices can disrupt the power grid. In 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, pp. 15–32.
- [16] (2012) Towards practical oblivious RAM. In 19th Annual Network and Distributed System Security Symposium, NDSS 2012, San Diego, California, USA, February 5-8, 2012.
- [17] (2013) Multi-cloud oblivious storage. In 2013 ACM SIGSAC Conference on Computer and Communications Security.
- [18] (2013) ObliviStore: high performance oblivious cloud storage. In 2013 IEEE Symposium on Security and Privacy, SP 2013, Berkeley, CA, USA, May 19-22, 2013, pp. 253–267.
- [19] (2013) Path ORAM: an extremely simple oblivious RAM protocol. In 2013 ACM SIGSAC Conference on Computer and Communications Security.
- [20] (2019) Securing smart home edge devices against compromised cloud servers. https://bit.ly/fidelius-technical-report
- [21] (2015) Circuit ORAM: on tightness of the Goldreich-Ostrovsky lower bound. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pp. 850–861.
- [22] (2018) Smart home tech makers don’t want to say if the feds come for your data. https://techcrunch.com/2018/10/19/smart-home-devices-hoard-data-government-demands/
- [23] Sony Pictures hack. https://en.wikipedia.org/wiki/Sony_Pictures_hack