Pyronia: Intra-Process Access Control for IoT Applications

03/05/2019
by   Marcela S. Melara, et al.

Third-party code plays a critical role in IoT applications, which generate and analyze highly privacy-sensitive data. Unlike traditional desktop and server settings, IoT devices mostly run a dedicated, single application. As a result, vulnerabilities in third-party libraries within a process pose a much bigger threat than on traditional platforms. We present Pyronia, a fine-grained access control system for IoT applications written in high-level languages. Pyronia exploits developers' coarse-grained expectations about how imported third-party code operates to restrict access to files, devices, and specific network destinations, at the granularity of individual functions. To efficiently protect such sensitive OS resources, Pyronia combines three techniques: system call interposition, stack inspection, and memory domains. This design avoids the need for application refactoring, or unintuitive data flow analysis, while enforcing the developer's access policy at run time. Our Pyronia prototype for Python runs on a custom Linux kernel, and incurs moderate performance overhead on unmodified Python applications.

1 Introduction

Concerns about data security in the Internet of Things (IoT) have been mounting in the wake of private data leaks in safety-critical technologies (e.g., [26, 27, 68]) and large-scale malware attacks exploiting vulnerable devices [56, 22, 12]. These concerns have driven application developers to deploy measures that secure data in-transit to cloud platforms [18, 6, 42], or that detect unauthorized code on devices [49, 32].

However, these safeguards cannot prevent vulnerable or malicious third-party libraries within IoT applications from leaking sensitive information. Once a developer imports a vulnerable library, it runs with the application’s privileges and has full access to the application’s resources (e.g., files, devices, and network interfaces). For example, a facial recognition library with a vulnerability in the recognize_face() function could allow an attacker to steal the application’s private authentication token by uploading the token file to an unauthorized remote server. Meanwhile, the application developer only expected recognize_face() to access the image file face.jpg.

This is an especially serious problem for IoT because most devices run a single, dedicated application that has access to all sensors on the device. In other words, third-party code does not run with least privilege [48].

Now, these threats are not IoT-specific. A large body of prior research has sought to restrict untrusted third-party code in desktop, mobile, and cloud applications (e.g., [10, 11, 59, 54, 21, 19, 67, 36, 66, 65, 24, 55, 14, 41]). However, these traditional compute settings face more complex security challenges. Mobile devices, desktops, and even IoT hubs, run multiple mutually distrusting applications; further, desktops and cloud servers run in multi-user/multi-tenant settings. All of these settings require isolation between the different applications and principals.

In this paper, we examine the following question (§2): Are traditional approaches suitable for protecting IoT device applications against untrusted third-party code? Given the rapid proliferation of IoT devices and the high sensitivity of the data they handle, it is crucial to gain an understanding of the IoT-specific security challenges that developers face, in order to guide the design of systems that effectively enforce least privilege in this setting.

We conduct, to the best of our knowledge, the first in-depth analysis of third-party code usage in IoT device applications. Specifically, we characterize the IoT library landscape, and identify key risks to IoT applications (§3). Informed by our findings, we propose Pyronia, an access control system for untrusted third-party code in IoT applications written in high-level languages. (Pyronia, being a gatekeeper for third-party libraries, is named after the genus of butterflies known as gatekeeper butterflies.)

In Pyronia, we retain the goal of controlling when an application may obtain data from files and devices, and to which remote network destinations an application may export data. Pyronia enforces a central policy that specifies rules for directly imported library functions and the specific OS resources they may access. For example, to ensure that a sensitive image face.jpg is only accessible by the recognize_face() function, the developer specifies a read rule for face.jpg in her policy. Pyronia then blocks all attempts by any other library functions to access the image file, as well as attempts by this function to access other sensitive resources. Thus, developers need not reason about third-party dependencies that may be unfamiliar to them, and application and library source code can remain unmodified.

To enforce such function-granular access control, Pyronia leverages the following three techniques.

  1. System call interposition (§5.1) guarantees that access to OS resources by all application components can be controlled, even for native libraries integrated via a high-level native interface such as the Java JNI or the Python C API. However, system call interposition has traditionally only been realized as a process-level technique, and thus cannot handle intra-process access checks.

  2. Stack inspection (§5.1) allows Pyronia to identify all library functions involved in the call chain that led to a given system call. Thus, Pyronia leverages the language runtime call stack to determine whether to grant access to a requested resource based on the full provenance of the intercepted system call.

  3. Memory domains (§5.2) are isolated compartments within a process address space, each with its own access policy. Pyronia enforces boundaries between compartments via replicated page tables, protecting the language runtime call stack against tampering by native code in the same address space.

Pyronia targets popular high-level IoT programming languages like Java or Python [18] precisely for their ability to dynamically provide fine-grained execution context information. We implement Pyronia on Linux for Python (§6), although we believe our approach can be applied to other high-level languages. Our prototype acts as a drop-in replacement for the CPython runtime, and includes a custom Linux kernel with support for memory domains. Our function-granular MAC component is a kernel module built on top of AppArmor [8].

We evaluated Pyronia’s security and performance with three open-source IoT applications. Our security evaluation (§7) shows that Pyronia mitigates reported and hypothetical OS resource-based vulnerabilities. We also find that Pyronia incurs moderate performance slowdowns, with a maximum operating overhead of 3x, and modest memory overheads with an average of 38.6% additional memory usage for the entire application.

2 Prior Work

Prior research on restricting untrusted third-party code and enforcing least privilege in traditional compute settings has used two main approaches.

Process isolation partitions a monolithic application into multiple processes and controls their permissions individually (e.g., [10, 11, 59, 54, 21, 64, 54]). However, process isolation imposes significant development and run-time overheads. Developers may have difficulty cleanly separating components into processes. In addition, inter-process communication is much more expensive than calling library functions within the same address space.

These issues indicate that the process abstraction is too coarse-grained for the efficient isolation of intra-process components in IoT applications. More recent work in this area introduces OS-level abstractions to create private memory compartments within a single process address space [39, 30, 13, 61]. However, these proposals lack built-in access control for OS resources, and still require major developer effort for deployment in high-level applications.

Thus, these approaches are insufficient to limit third-party libraries’ access to unauthorized IoT device resources. Additionally, given the rapid deployment cycles of IoT applications, developers are unlikely to prioritize security [9, 3, 4], and spend the time and effort necessary to refactor their applications. Thus, one key design goal of Pyronia is to support unmodified applications while enforcing least privilege within a single process.

Information Flow Control (IFC) attaches policy labels to individual sensitive data objects and tracks how they flow through the system. IFC systems can be divided into two broad categories. OS-level IFC (e.g., [19, 67, 36, 66]) tracks primitives like files or network sockets, while language-level IFC systems (e.g., [65, 24, 55, 14, 41]) are capable of controlling access to sensitive data at granularities as fine as individual bytes.

Yet, much like process isolation, IFC systems introduce a considerable amount of cost and complexity: Developers must manually refactor their source code to specify policy labels around specific sensitive data sources and sinks. Additionally, propagating new labels at run time incurs significant memory and performance overheads.

Since third-party code may contain unexpected vulnerabilities, or a long list of dependencies, we cannot expect IoT developers to be able to perform extensive data flow analysis a priori to declare a data access policy that minimizes data leaks. Thus, access control for IoT should not require unintuitive policy specification.

IoT-specific Access Control. A number of recent works in the IoT space [33, 50, 29, 45, 58] propose access control systems to enable developers and end-users to define and enforce more suitable data access policies based on external factors such as usage context or risk. Pyronia, in contrast, focuses on allowing developers to restrict third-party code, decoupling end-user application usage policy enforcement from a developer’s implementation policy.

The FACT system [37] aims to prevent overprivileged applications from accessing sensitive device functionalities and resources by executing different IoT device functions in separate Linux containers. However, FACT does not protect these resources against untrusted third-party code running as part of the isolated device functionalities, as Pyronia would. FlowFence [21] shares the same goals as Pyronia, but still relies on process isolation for in-application privilege separation, and does not support unmodified applications.

3 IoT Application Development Today

To design Pyronia, we conducted an in-depth study of 85 open-source IoT applications written in Python and their libraries, as well as a brief analysis of reported vulnerable Python libraries. Our analyses focus on Python as it is a popular IoT development language [18].

3.1 Application Analysis

To better understand how third-party libraries influence IoT application design today, we analyzed 85 open-source IoT applications written in Python. We obtained these applications primarily from popular DIY web platforms such as instructables.com and hackster.io; our search focused on three broad categories of applications—visual, audio, and environment sensing—which we believe are representative of today’s most privacy-sensitive IoT use cases. The four key takeaways of our analysis are:

| | min | median | mean | max |
|---|---|---|---|---|
| # direct imports | 1 | 8 | 13 | 253 |
| # direct third-party imports | 0 | 3 | 6 | 186 |
| # dependency levels | 1 | 25 | 22 | 37 |
| # library dependency levels | 0 | 27 | 24 | 34 |

Table 1: Analysis of direct imports and number of dependency levels in a set of 85 Python IoT applications and in the top 50 third-party libraries imported by these applications. Unless noted otherwise, results of per-application analyses are shown.

1. The vast majority of IoT applications import third-party libraries. All but one of our sampled applications (98.8%) import at least one third-party library, with a mean of about 6 direct third-party imports per application. The maximum number of direct third-party imports we found in a single application was 186 (see Table 1). This demonstrates that attacks from imported libraries within an application’s process are much more realistic due to the single-purpose nature of most IoT devices, as opposed to threats across process boundaries as seen in traditional desktop and server settings.

2. The third-party library landscape is very diverse. Overall we found 331 distinct third-party Python libraries among the 418 total imports in our sampled set of applications, despite heavily sampling applications targeting the Raspberry Pi single-board computer (a very popular development platform for IoT). Thus, providing intra-process access control for IoT device applications requires an application- and library-agnostic approach.

| Library feature | % of top 50 libs |
|---|---|
| Written in Python | 12.0% |
| Have native dependencies | 82.0% |
| Run external binaries | 40.0% |
| Use ctypes | 40.0% |

Table 2: Characteristics of the top 50 IoT Python libraries, including libs exhibiting multiple characteristics.

3. Libraries rely predominantly on native code. We find that 82% of the top 50 libraries in our analysis include a component written in C/C++, among which we identified 68 distinct native dependencies. Additionally, a large portion of libraries load a native shared library via the ctypes Python foreign function interface, or execute external native binaries (including a Python subprocess). These results reveal how exposed applications and the Python runtime are to threats from vulnerable or malicious native code, and underscore the importance of intra-process memory isolation to protect security-critical data against tampering or leaks by native code.

4. Dependencies are nested dozens of levels in IoT applications. Finally, we analyzed the longest chain of nested libraries for each application and top-50 library in our sample, and find that across the 85 sampled applications, the median number of nesting levels is 25, while the median library has 27 levels of nested dependencies (see Table 1). These numbers indicate the complexity of single applications and libraries, and highlight why developers need more intuitive fine-grained access control that does not require separating each library into its own process, or identifying the sensitive data flows within a single application.

| Attack class | # Reports | # Libs | Lib/framework (# reports) |
|---|---|---|---|
| Arbitrary code execution | 28 | 24 | python-gnupg (4) |
| MITM | 19 | 14 | urllib* (3) |
| Web attack | 18 | 12 | urllib* (4) |
| Denial of service | 17 | 12 | Django (3) |
| Direct data leak | 12 | 10 | requests (3) |
| Weak crypto | 11 | 10 | PyCrypto (2) |
| Authentication bypass | 9 | 6 | python-keystoneclient (3) |
| Symlink attack | 6 | 4 | Pillow (2) |
| Replay/data spoofing | 3 | 3 | python-oauth2 |

Table 3: Reported Python library vulnerabilities and number of unique libraries by attack class, from 2012-2019.

3.2 Library Vulnerabilities

Our survey of reported Python library vulnerabilities covers 123 reports created between January 2012 and March 2019 in the Common Vulnerabilities and Exposures (CVE) List [57]. We identify 78 distinct vulnerable Python libraries, and 9 main attack classes (see Table 3; for a full list of the CVE reports included in our analysis, see Appendix A).

We include shell injections under arbitrary code execution, and the majority of web attacks comprise cross-site scripting and CR/LF injection attacks. We classify vulnerabilities arising from accidental data exposure as direct data leaks. Authentication bypass vulnerabilities arise from system misconfiguration or credential verification bugs. A symlink attack allows an adversary to gain access to a resource via a specially crafted symbolic link.

While direct data leaks account for only about 10% of the reported vulnerabilities, we emphasize that most other attack classes, most notably arbitrary code execution, man-in-the-middle (MITM), and authentication bypass, lead to information leakage as well. Our analysis also demonstrates the diversity of vulnerable libraries, with a small number of libraries having a handful of reports in each attack class. The two Python packages with the most overall CVE reports are the widely used Django web framework and urllib* HTTP library family, each with eight reports. These findings underscore the degree to which IoT application developers are exposed to potential threats by importing third-party code.

In addition, we find that exploiting dynamic language features is fairly straightforward. In our lab setting, we built a Python module and a native library that use introspection or monkey patching [53] to replace function pointers at run time with malicious functions, to leak data at import time, and to perform various modifications to the contents of Python runtime’s stack frames.
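As a benign illustration of how little effort monkey patching takes, the following minimal sketch swaps a function on a stand-in module object at run time so that every call is silently observed. The `camera` namespace, the `capture` function, and the `observed` list are hypothetical names chosen for this example; they are not taken from the experiments described above.

```python
import types

# Hypothetical stand-in for an imported third-party module.
camera = types.SimpleNamespace()

def capture(path):
    """Pretend to read an image; returns the length of the path string."""
    return len(path)

camera.capture = capture

observed = []  # what the patched function quietly records

def install_patch(module):
    """Replace module.capture with a wrapper that logs its arguments."""
    original = module.capture

    def patched(path):
        observed.append(path)   # side effect the caller never asked for
        return original(path)   # otherwise behave exactly like the original

    module.capture = patched

install_patch(camera)
result = camera.capture("face.jpg")
```

The caller still receives the expected return value, which is why this kind of substitution is hard to notice without inspecting the library's code.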

While we have not identified such attacks in the wild, our experiments demonstrate that these dynamic features and open APIs place too much trust in third-party library developers, and can be misused for nefarious purposes. Thus, both dynamic language features and the capabilities of native libraries pose threats to the integrity of the application itself and the privacy of user data.

4 Threat Model and Security Goals

We aim to provide intra-process access control, which allows developers to prevent third-party code included in their IoT applications from leaking data. In particular, Pyronia protects sensitive OS resources, and restricts access to remote network destinations.

4.1 Threat Model

We assume that IoT device vendors, who usually also develop the device software, are trusted. As such, we also trust the underlying device hardware, the operating system, and the language runtime executing the IoT application. Yet, imported third-party code poses a risk to developers: library code is rarely inspected or readily accessible, so bugs or dynamic language features that leak sensitive data may go unnoticed. We do, however, assume that application developers do not intentionally include such vulnerable or malicious third-party code.

While data leak vulnerabilities take many forms, Pyronia targets third-party code that aims to access arbitrary sensitive files or devices, or exfiltrate sensitive data to arbitrary remote network destinations. Pyronia does not seek to prevent any control flow (e.g., ROP [47]) or side channel attacks (e.g., Spectre/Meltdown [35, 38], or physical vectors [23]). ROP defenses (e.g., [16, 63, 31, 15]) may be used in a complementary fashion. Pyronia also does not prevent network-based attacks such as man-in-the-middle or denial-of-service attacks.

4.2 Security Properties

Pyronia provides three main security properties.

P1: Least privilege. A third-party library function may only access those OS resources (i.e., files, devices, network) that are necessary to provide the expected functionality. Attempts by a third-party function to access resources that are not relevant to its functionality must be blocked. Pyronia conservatively enforces a default-deny policy, requiring developers to explicitly grant specific library functions access to OS resources.

P2: No confused deputies. All access control decisions are made based on the full provenance of the access request. This prevents confused deputy attacks [28] in which an underprivileged library function attempts to gain access to a protected resource via another function that does have sufficient privileges. To detect such attempts to bypass access control checks, Pyronia checks all functions involved in a request for an OS resource.

P3: Verified network destinations. A third-party library function may only transmit data to remote network destinations (e.g., cloud servers or other IoT devices) whitelisted by the developer. Thus, a third-party library cannot leak legitimately collected data to an untrusted remote server or device. Pyronia prevents such data exfiltration by intercepting all outgoing network connections.

Non-goals. While Pyronia automates access control at the level of in-application components, our design does not seek to provide automated execution isolation of these components (e.g., [59, 11, 10]). We also do not guarantee the correctness of the sensitive data they output. Automated code compartmentalization is complementary to our approach, and could be added to Pyronia to allow developers to prevent certain cross-library function calls. Ensuring the correctness of sensitive outputs, on the other hand, could provide additional data leak protection. However, formally verifying the functionality of untrusted code is beyond the scope of Pyronia, and could be performed separately prior to application deployment.

5 System Design

Pyronia enforces intra-process least privilege without partitioning an application into multiple processes, or propagating data flow labels.

Developers understand the purpose of imported libraries and can provide high-level descriptions of their expected data access behaviors. Pyronia thus relies on developers to specify all access rules in a single, central policy file. At run time, Pyronia loads this file into an application-specific access control list (ACL) that contains an entry for each developer-specified library function and its associated data access privileges. Pyronia imposes default-deny access control semantics, meaning that a third-party library function may only access those files, devices, and remote network destinations that the policy explicitly grants to it.

To enforce this policy securely, Pyronia requires support both in the language runtime and from the OS. Figure 1 provides an overview of the Pyronia system architecture, and shows the main steps involved in a resource access request (see §5.1).

Figure 1: Overview of Pyronia, which enforces function-granular resource access policies via runtime and kernel modifications (striped boxes). New features are represented by gray boxes. The arrows show the components involved in an access request to a file such as a certificate.

5.1 Function-granular MAC

At first glance, performing access control at library function granularity in the language runtime may seem sufficient. The runtime can directly inspect its function call stack when a specific third-party library function uses the language’s high-level interface to access a file or device. If a function with insufficient permissions attempts to access a resource, the runtime can block the request and throw an error to notify the application.
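To make runtime-level inspection concrete, here is a minimal sketch of how a Python program can walk its own call stack and recover `module.function` names. It relies on `sys._getframe`, which is a CPython implementation detail; the helper name `current_call_chain` and the two example functions are hypothetical and are not part of the Pyronia prototype.

```python
import sys

def current_call_chain():
    """Return 'module.function' names for the active stack, innermost first."""
    chain = []
    frame = sys._getframe(1)  # start at the caller, skipping this helper
    while frame is not None:
        module = frame.f_globals.get("__name__", "?")
        chain.append(f"{module}.{frame.f_code.co_name}")
        frame = frame.f_back  # move one level outwards
    return chain

def recognize_face():
    # A library function that wants to know who is on the stack.
    return current_call_chain()

def app_main():
    return recognize_face()

chain = app_main()
```

A check built only on this information lives entirely inside the interpreter, which is exactly the limitation the next paragraph raises for native code.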

However, language runtimes also provide an interface to native code, such as Java’s JNI or Python’s C API; indeed, our analysis in §3.1 shows that use of this interface in Python is very commonplace. This ability to include native code in otherwise memory-safe languages exposes applications to vulnerabilities in native code: as this code runs outside the purview of the language runtime, it could bypass any runtime-level access control via direct system calls.

Pyronia addresses these issues via system call interposition (e.g., [25, 44, 34]) enhanced with function call provenance. Many mandatory access control (MAC) systems in deployment use system call interposition (e.g., SELinux [51], AppArmor [8], Windows MIC [40]). Their limitation, however, is that the security policy is only enforced at the process level. Pyronia achieves intra-process policy enforcement by incorporating runtime call stack inspection (e.g., [52, 62, 17, 60]) to obtain the full provenance of the system call.

With these two techniques, Pyronia provides function-granular MAC to enforce least privilege (P1). As we show in Fig. 1, when the application attempts to access a sensitive OS resource (e.g., an SSL certificate), the Pyronia Access Control module in the kernel intercepts the associated system call (step 1). This kernel module then sends a request to the language runtime via a trusted stack inspector thread, which pauses the runtime. After collecting the interpreter’s function call stack in its current state, the stack inspector returns the stack to the kernel (step 2).

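procedure InspectStack(callStack, reqAccess)
    frame ← top of callStack
    if frame = nil then return false
    granted ← true
    while granted = true and frame ≠ nil do
        rule ← ACL entry for the function in frame
        frame ← next frame in callStack
        if rule = nil then
            continue
        granted ← rule permits reqAccess
    return granted
Algorithm 1 Call stack inspection (operand names were lost in extraction and are restored here from the description in this section)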

The Access Control module maintains the ACL for the developer-supplied policy. To determine whether to grant access to the requested OS resource, the module inspects the call stack to verify the provenance of the system call. Only if all developer-specified functions identified in the call stack have sufficient privileges may the application obtain data from the requested resource (step 3). That is, to determine the application’s access permissions to the requested resource, Pyronia dynamically computes the intersection of the privileges of each function in the call stack using Algorithm 1. This algorithm prevents confused deputies (P2), much as in [52, 17, 20].
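The intersection check described above can be sketched in a few lines of Python. This is an illustrative model only, not the kernel module's code: the permission bits, the `ACL` dictionary layout, the function and path names, and the choice to deny when no function on the stack appears in the policy are all assumptions made for the example.

```python
READ, WRITE = 1, 2  # hypothetical permission bits

# Hypothetical ACL: function name -> {resource path: permitted access bits}
ACL = {
    "facelib.recognize_face": {"/home/app/face.jpg": READ},
    "netlib.upload": {"/home/app/token": READ},
}

def inspect_stack(call_stack, resource, requested):
    """Grant access only if every policy-listed function on the stack
    holds the requested permission for the resource."""
    if not call_stack:
        return False  # no provenance information: deny
    effective = READ | WRITE
    seen_rule = False
    for func in call_stack:
        rules = ACL.get(func)
        if rules is None:
            continue  # function not named in the policy: skip it
        seen_rule = True
        effective &= rules.get(resource, 0)  # intersect privileges
        if effective & requested != requested:
            return False  # some function on the stack lacks the privilege
    return seen_rule  # assumption: deny if no listed function was found

ok = inspect_stack(["facelib.recognize_face", "app.main"],
                   "/home/app/face.jpg", READ)
deputy = inspect_stack(["netlib.upload", "facelib.recognize_face"],
                       "/home/app/token", READ)
```

In the second call, `netlib.upload` may read the token but its caller may not, so the intersection is empty and the request is denied, which is the confused-deputy case that property P2 targets.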

5.2 Runtime Call Stack Protection

Untrusted native libraries reside in the runtime’s address space, giving them unfettered access to the call stack’s memory region. A malicious native library may tamper with the runtime call stack in an attempt to bypass Pyronia’s function-level MAC.

This challenge is not unique to Pyronia; indeed, prior work in the mobile space [52, 62] recognized the need to protect the Dalvik call stack against native third-party code in the trusted host app’s address space. To address this issue, these proposals either rely on special hardware support [52] to separate the runtime address space from the native library address space, or they forgo memory protection altogether [62].

Pyronia, in contrast, aims to provide a more generally applicable solution to this issue, since IoT software runs on a very diverse range of hardware platforms. We overcome this challenge with page table replication, a technique that enables us to create strongly isolated memory regions, or memory domains, within a process’ address space. Prior work in this space (e.g., [61, 30, 39]) has introduced new primitives for intra-process execution compartments. Our design for Pyronia’s memory domains, on the other hand, focuses on data isolation.

To protect the language runtime call stack for a broad range of applications, memory domains in Pyronia meet two requirements: (1) the size of a domain must be flexible, and (2) the access privileges must be dynamic. The first requirement is important for ensuring that Pyronia can support applications that make an arbitrary number of nested function calls. The second requirement allows Pyronia to restrict an application’s access to a memory domain at run time based on the currently executing code (e.g., interpreter versus third-party library), while still enabling data sharing between application components.

The Pyronia Memory Domain Manager in the kernel (see Fig. 1) maintains a per-process table of domain-protected memory pages, and replicates the corresponding page table entries (PTEs); each replicated entry is associated with a distinct domain access policy. Policies are enforced at the granularity of individual native threads, and specify the access permissions for any thread launched under a given policy. Mapping thread contexts to a replicated PTE thus allows Pyronia to transparently change policy contexts during a context switch.

Upon a memory access, the Pyronia kernel performs all regular memory access checks. If the requested address is domain-protected, the Domain Manager additionally verifies that the loaded thread context has sufficient permissions to access the requested memory domain based on the thread’s policy. Any attempt by an application to access unauthorized domain-protected memory results in a memory fault.

To ensure the integrity of the runtime call stack, the Pyronia runtime allocates all call stack data into a memory domain called the stack domain. Pyronia currently defines two access policies to this domain: The runtime’s policy, which only allows access to the call stack during stack frame creation and deletion, and the stack inspector’s policy, which allows access to the call stack while responding to an upcall from the Access Control module. Pyronia loads the appropriate page table during context switches between the main runtime thread and the stack inspector thread automatically.

However, to enforce the runtime’s policy, Pyronia requires elevated access privileges to the stack domain during stack frame creation and deletion. Thus, the runtime invokes the Domain Manager to temporarily adjust the main runtime thread’s policy and corresponding PTE access bits during these operations. Otherwise, the language runtime still provides read access to the stack domain to allow code to make use of shared functions, but prohibits write access to protect this security-critical metadata.
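The following toy model illustrates the bookkeeping behind per-thread domain policies and the temporary elevation used during stack frame creation and deletion. It is a user-space simulation under stated assumptions, not the kernel mechanism: real enforcement happens through replicated page table entries, whereas here a dictionary stands in for them. The class `DomainManager`, its methods, the thread identifier, and the addresses are all hypothetical.

```python
from contextlib import contextmanager

PAGE_SIZE = 4096

class DomainManager:
    """Toy model: tracks protected pages and per-thread permissions."""

    def __init__(self):
        self.page_domain = {}   # page number -> domain name
        self.thread_perms = {}  # (thread id, domain) -> set of "r"/"w"

    def protect(self, start, length, domain):
        """Mark every page overlapping [start, start+length) as protected."""
        first = start // PAGE_SIZE
        last = (start + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.page_domain[page] = domain

    def set_perms(self, tid, domain, perms):
        self.thread_perms[(tid, domain)] = set(perms)

    def check(self, tid, addr, op):
        """Return True if thread tid may perform op ('r' or 'w') at addr."""
        domain = self.page_domain.get(addr // PAGE_SIZE)
        if domain is None:
            return True  # not domain-protected: only regular checks apply
        return op in self.thread_perms.get((tid, domain), set())

    @contextmanager
    def elevated(self, tid, domain, perms):
        """Temporarily replace a thread's permissions, then restore them."""
        saved = self.thread_perms.get((tid, domain), set())
        self.set_perms(tid, domain, perms)
        try:
            yield
        finally:
            self.set_perms(tid, domain, saved)

RUNTIME_TID = 1
dm = DomainManager()
dm.protect(0x10000, 2 * PAGE_SIZE, "stack")
dm.set_perms(RUNTIME_TID, "stack", "r")  # read-only outside frame updates

write_before = dm.check(RUNTIME_TID, 0x10008, "w")
with dm.elevated(RUNTIME_TID, "stack", "rw"):  # frame creation/deletion
    write_during = dm.check(RUNTIME_TID, 0x10008, "w")
write_after = dm.check(RUNTIME_TID, 0x10008, "w")
```

Writes succeed only inside the elevated window, while reads remain permitted throughout, mirroring the read-only default described for the stack domain.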

We note that while our memory domain mechanism can support multi-threading beyond Pyronia’s threads, Pyronia currently only targets single-threaded applications. As we describe in §9, we leave this enhancement as future research due to the small fraction of IoT applications (10% per our analysis in §3.1) that spawn multiple threads.

5.3 Network Destination Verification

Since all IoT applications communicate with remote services, Pyronia must ensure that authorized third-party library functions may transmit data only to whitelisted destinations. That is, when an application attempts to export data via the network, Pyronia’s Access Control module intercepts all outward-facing socket system calls (e.g., bind() and connect()) for all socket types, i.e., TCP, UDP, and raw. As with other OS resource accesses, Pyronia then requests and inspects the runtime call stack.

However, network access privileges alone do not immediately allow a third-party function to transmit data. Pyronia also verifies the remote endpoint for the requested socket. Thus, only if the address of the requested destination is whitelisted for the given third-party function, does Pyronia grant access to the requested socket (P3).
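A destination check of this kind can be sketched with Python's standard `ipaddress` module. The rule table `NET_RULES`, the function names, and the documentation-range addresses below are hypothetical; the sketch also assumes that a connection is denied when no function on the stack has a network rule.

```python
import ipaddress

# Hypothetical per-function whitelist of destination prefixes.
NET_RULES = {
    "cloudlib.upload": ["203.0.113.0/24"],
}

def destination_allowed(call_stack, dest_ip):
    """Allow a connection only if every policy-listed function on the
    stack whitelists a prefix that contains dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    listed = [f for f in call_stack if f in NET_RULES]
    if not listed:
        return False  # assumption: default deny without a matching rule
    return all(
        any(addr in ipaddress.ip_network(prefix) for prefix in NET_RULES[f])
        for f in listed
    )

allowed = destination_allowed(["cloudlib.upload", "app.main"], "203.0.113.7")
blocked = destination_allowed(["cloudlib.upload", "app.main"], "198.51.100.9")
```

Matching on prefixes rather than single addresses reflects the case where a cloud service's exact endpoint address is not known in advance.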

5.4 Child Process Protection

We find in §3.1 that a large fraction of applications (46%) and libraries (40%) run external binaries (including new language runtime instances) in subprocesses. This application characteristic poses a challenge to Pyronia’s function-granular MAC since child processes run in an independent context, losing important provenance information for system calls. Pyronia addresses this issue by ensuring the continued protection of child processes. Thus, upon a fork() system call, the Pyronia kernel propagates the parent’s process- and function-level ACLs, as well as the replicated page tables, to the child process.

However, the Access Control module does not request the current call stack from the language runtime at a fork() for three reasons. First, the parent process may be a native binary that is Pyronia-unaware. Second, since application developers may be unaware of subprocesses spawned by third-party code, using the call stack to block external binaries may break the functionality of the overall application. Third, using the parent runtime’s call stack for access control decisions in its children may also unduly restrict the functionality of the whole application. Thus, Pyronia requires language runtime sub-instances to register themselves with the Pyronia kernel to transparently enforce the developer’s function-level access policy, and to continue managing the stack domain in the child’s address space. In the case of a native child process, Pyronia does not provide intra-process access control, but still enforces the developer’s policy at process granularity.

6 Implementation

To demonstrate how Pyronia interacts with existing open-source IoT applications, we implemented the Pyronia kernel based on Linux Kernel version 4.8.0+, and the Pyronia runtime as a modified version of Python 2.7.14. We have released all components of our prototype on GitHub (https://github.com/pyroniasys).

6.1 Policy Specification

Rule type | Format
FS resource | <module>.<function name> <path to resource> <access privs>
Network destination | <module>.<function name> network <IP addr/prefix>
Table 4: Pyronia data access rule specification formats. Supported access privileges are read-only and read-write.

Application developers in Pyronia specify all function-level data access rules for file system resources, and network destinations in a single policy file. Table 4 details Pyronia’s policy rule specification format. In the case of network access rules specifically, our Pyronia prototype allows developers to specify IP address prefixes, as the specific address of a remote destination in a cloud service may not be known a priori.
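To make the rule layout of Table 4 concrete, the following Python 3 sketch parses rules of both types. It is not Pyronia's policy parser: the privilege tokens `r` and `rw`, the record types, and the example function names are assumptions made for illustration, and paths containing whitespace are not handled.

```python
from collections import namedtuple

FsRule = namedtuple("FsRule", "function path privs")
NetRule = namedtuple("NetRule", "function prefix")

# Assumed spellings for read-only and read-write; Pyronia's actual tokens may differ.
VALID_PRIVS = {"r", "rw"}


def parse_rule(line):
    """Parse one whitespace-separated rule in the Table 4 layout."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError("expected three fields: %r" % line)
    function, second, third = fields
    if second == "network":
        # Network destination rule: <module>.<function name> network <IP addr/prefix>
        return NetRule(function, third)
    if third not in VALID_PRIVS:
        raise ValueError("unknown access privilege: %r" % third)
    # FS resource rule: <module>.<function name> <path to resource> <access privs>
    return FsRule(function, second, third)


def parse_policy(text):
    """Parse the body of a policy file, skipping blank lines."""
    return [parse_rule(line) for line in text.splitlines() if line.strip()]
```

For example, `parse_policy("camlib.capture /dev/video0 r\ncamlib.upload network 192.0.2.0/24")` yields one file system rule and one network destination rule.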

Nevertheless, a challenge that arises from having limited knowledge about a library’s implementation is that it may legitimately require access to other resources that are unexpected. For example, a library function may need access to system fonts, or may write an intermediate result to the file system. Similarly, developers are unlikely to have a good sense of the system libraries or other file system locations a language runtime requires to operate properly. Thus, Pyronia’s default-deny access control semantics alone are too restrictive and may lead to a number of false negatives.

To maintain the functionality of the application and the language runtime, and reduce the number of false negatives, our prototype supports a special default access rule declaration, which grants application-wide access to a specified resource. While default access rules bypass Pyronia’s function-level access control, they still provide a baseline level of security as Pyronia also applies default-deny access control semantics at the process-level.

6.2 Pyronia Kernel

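1: procedure CheckAccess(resource, syscall)
2:     granted ← true
3:     isDefault ← resource is covered by the defaults ACL
4:     if isDefault = true then return granted
5:     hash ← call stack hash attached to syscall, if any
6:     log ← call stack log for resource
7:     if hash ≠ null AND
8:        log contains hash = true then
9:         return granted
10:     granted ← result of full stack inspection (Alg. 1)
11:     return granted
Algorithm 2 Pyronia in-kernel access control check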

Access Control module. Our Access Control Module extends the AppArmor [8] kernel module version 2.11.0. AppArmor interposes on all system calls enforcing a process-level access policy. Thus, as in vanilla AppArmor, Pyronia denies access to a requested resource if the process does not have sufficient privileges. To add support for Pyronia’s stack inspection, we extend the process-level AppArmor policy data structure with a function-level ACL. This ACL is populated at application initialization (see §6.3), and contains an entry for each developer-specified library function. In addition, the Access Control module registers child processes of the main Pyronia process, propagating the application’s function-level ACL to all child processes.

In an early version of our prototype, Pyronia would inspect the runtime call stack on every intercepted system call. However, given that IoT applications are built to continuously gather and transmit data in an infinite loop, we improve the performance of Pyronia’s function-level MAC (by roughly 3x) by avoiding expensive kernel-userspace context switches for already-verified call stacks. Thus, we implement a stack logging mechanism, which stores and verifies the SHA256 hash for up to 16 functions authorized to access a given resource, as part of the function-level access control checks, outlined in Alg. 2.

The Access Control module first checks the defaults ACL for the application (line 3). If the requested resource is covered by a default rule, Pyronia grants access without inspecting the call stack. Otherwise, the Access Control module checks whether the Python runtime has included a call stack hash along with the system call (line 5, more details in §6.3). If the kernel received a call stack hash, and the received hash matches any of the hashes in the call stack log for the requested resource (line 8), the module grants access.

Otherwise, our prototype resorts to the full stack inspection mechanism (lines 9 and 10, see Alg. 1). Once the call stack has been inspected, and if the application has sufficient privileges to access the requested resource, the Access Control module logs the SHA256 hash of the call stack in the resource’s ACL. To support this mechanism, we made minor, backwards-compatible modifications to the open() and connect() system call code in order to parse the received call stack hash, if any, and store it for later verification during the MAC checks.
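The control flow of Alg. 2 can be modeled in a few lines of Python 3. This is a userspace sketch under stated assumptions, not the kernel implementation: the class and method names are hypothetical, the full stack inspection of Alg. 1 is abstracted as a caller-supplied callback, and the call stack is serialized as newline-joined function names before hashing.

```python
import hashlib

MAX_LOGGED_HASHES = 16  # per-resource log capacity stated in the text


def hash_stack(call_stack):
    """SHA256 over a list of function names (the serialization is an assumption)."""
    return hashlib.sha256("\n".join(call_stack).encode()).hexdigest()


class AccessControl:
    def __init__(self, defaults, inspect_stack):
        self.defaults = set(defaults)       # resources covered by default rules
        # Callback standing in for Alg. 1: resource -> (granted, call_stack)
        self.inspect_stack = inspect_stack
        self.stack_log = {}                 # resource -> list of verified stack hashes

    def check_access(self, resource, stack_hash=None):
        # Default rules bypass function-level checks.
        if resource in self.defaults:
            return True
        log = self.stack_log.setdefault(resource, [])
        # Fast path: the runtime supplied a hash that was verified earlier.
        if stack_hash is not None and stack_hash in log:
            return True
        # Slow path: full stack inspection, then log the verified stack.
        granted, call_stack = self.inspect_stack(resource)
        if granted:
            digest = hash_stack(call_stack)
            if digest not in log and len(log) < MAX_LOGGED_HASHES:
                log.append(digest)
        return granted
```

The fast path is what avoids the kernel-userspace round trip: once a stack has been verified for a resource, later requests carrying the same hash never reach the callback.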

Memory Domain manager. For Pyronia’s page table replication, we modify the SMV [30] kernel module, a memory isolation proposal for enforcing per-page access control policies via thread-local page tables. Our Memory Domain manager leverages the SMV kernel API for maintaining a list of protected 4-kB domain pages and their corresponding access policies for the main runtime and stack inspector threads. Upon a fork(), the kernel copies the replicated page tables in the child process.

Netlink sockets. To enable communication between the kernel and the runtime in userspace, Pyronia uses a generic Netlink socket in the Domain manager, and one in the Access Control module. Netlink sockets offer two advantages: (1) they allow bi-directional communication between kernelspace and userspace obviating the need to implement additional ioctls() or system calls, and (2) userspace applications can use the POSIX socket API for Netlink communication.

6.3 Pyronia Python Runtime

To allow developers to run completely unmodified applications, the Pyronia runtime acts as a drop-in replacement for Python. We integrate our Pyronia library, which provides an API for loading the developer’s access policy, and for managing the stack memory domain.

Policy initialization. The runtime core uses our policy parser API to read the developer’s policy file during interpreter initialization. All parsed OS resource and network access rules are sent to the Access Control module in the kernel. Loading the policy before the runtime has loaded any third-party code has the advantage of preventing an adversary from “front-running” the interpreter by initializing the application’s in-kernel Pyronia ACL before the legitimate developer-supplied policy can be loaded. For this reason, the Pyronia runtime also spawns the stack inspector thread and registers it with the kernel during the initialization phase.

Stack domain allocation. As the userspace Pyronia memory domain management API acts as a drop-in replacement for malloc, we instrumented the Python stack frame manager to allocate new runtime call stack frames in the stack domain. Because write access to the stack domain is disabled by default, the runtime temporarily obtains write access to this domain during frame creation and deletion operations.

Child processes. Our Pyronia runtime provides continuous protection for child processes spawned via standard Python APIs (e.g., os.system()). As forking preserves the parent process’ memory in the child, Pyronia subprocesses automatically inherit the parent’s memory domain layout as well as the Pyronia interpreter metadata, including currently writable memory domains. Thus, Pyronia initialization in child processes only requires spawning the child’s stack inspector thread, and resetting the application’s access permissions to the stack domain, which disables write access to this domain. This reset is necessary to ensure that the child cannot access any runtime stack frames in its own address space that the parent process had marked as writable at the time of forking.

Stack logging. As described in §6.2, we implement a stack logging mechanism to reduce the overhead of kernel upcalls for runtime call stack requests. When the language runtime requests a resource for the first time, Pyronia performs the full stack inspection mechanism. If the kernel grants access to the requested resource, the Pyronia runtime logs the resource as authorized. Then, in subsequent system calls, our prototype first checks this log; if the requested resource has been logged, the runtime preemptively collects its current call stack before making the system call, computes the SHA256 hash, and embeds this hash into the input to the upcoming syscall.

To enable this optimization, we created system call-specific wrappers (via LD_PRELOAD) for various variants of open() and connect(). These wrappers perform the preemptive call stack collection, hash serialization, and authorized OS resource logging. These wrapper functions are backwards-compatible, and do not affect function-level policy specification.
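The preemptive call stack collection and hashing step can be sketched in pure Python 3. This is only an approximation of what the wrappers do: the prototype works inside the interpreter and libc wrappers, whereas this sketch uses the CPython-specific `sys._getframe`, and the helper names and the serialization of the stack are assumptions.

```python
import hashlib
import sys


def collect_call_stack(skip=1):
    """Return 'module.function' names on the stack, outermost first.

    skip=1 starts at the caller of this function.
    """
    names = []
    frame = sys._getframe(skip)
    while frame is not None:
        module = frame.f_globals.get("__name__", "?")
        names.append("%s.%s" % (module, frame.f_code.co_name))
        frame = frame.f_back
    names.reverse()
    return names


def call_stack_hash():
    """SHA256 hex digest of the call stack of whoever called this function."""
    stack = collect_call_stack(skip=2)  # skip this helper and start at its caller
    return hashlib.sha256("\n".join(stack).encode()).hexdigest()


# Two hypothetical library functions that would each precede a system call.
def lib_read():
    return call_stack_hash()


def other_read():
    return call_stack_hash()
```

Because only function names are hashed, repeated calls along the same path produce the same digest, which is what lets a loop-driven application hit the log on every iteration after the first.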

Garbage collection. Object reference counting used for Python’s garbage collection poses a challenge to protecting the stack domain without breaking the functionality of the application. Specifically, we found that Python increments or decrements several objects’ reference counts, including those of stack frames, for practically every Python instruction and internal operation.

To address this issue, we temporarily elevate the interpreter’s permissions to the stack domain around those blocks of the Python runtime code that operate on domain-protected data. Yet, simply granting write access to the entire stack domain is inefficient, since it may cover hundreds of pages, each of whose page table entries would need to be modified (requiring TLB flushes at high rates).

We optimize these frequent domain access adjustments by tracking the addresses of stack domain pages in the Pyronia runtime, and only modifying the access privileges to a specific page when needed. One exception to this is for new stack frame allocations: since the runtime is creating a new buffer with an undetermined memory address, our prototype enables write access to all domain pages with free memory chunks.
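The benefit of page-granular adjustments can be seen in a small Python 3 model that counts simulated page table entry updates. The `StackDomain` class and its counter are hypothetical, and `grant_write_all` simplifies the allocation case by covering every domain page rather than only those with free chunks.

```python
PAGE_SIZE = 4096  # 4-kB domain pages, as in the text


def page_of(addr):
    """Base address of the page containing addr."""
    return addr & ~(PAGE_SIZE - 1)


class StackDomain:
    def __init__(self, page_bases):
        # page base address -> is the page currently writable?
        self.writable = {base: False for base in page_bases}
        self.pte_updates = 0  # simulated page table entry modifications

    def _set(self, base, value):
        if self.writable[base] != value:
            self.writable[base] = value
            self.pte_updates += 1

    def grant_write(self, addr):
        """Enable writes only on the page holding addr."""
        self._set(page_of(addr), True)

    def revoke_write(self, addr):
        """Disable writes only on the page holding addr."""
        self._set(page_of(addr), False)

    def grant_write_all(self):
        """Naive alternative: enable writes on every domain page."""
        for base in self.writable:
            self._set(base, True)
```

With 100 domain pages, a grant/revoke pair around one stack frame costs two updates in this model, whereas opening the whole domain costs one update per page.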

7 Pyronia Protection in Real Applications

To examine the effectiveness of Pyronia’s policies and protections in real applications, we conduct three in-depth case studies of Python applications that specifically capture a range of common IoT use cases, and import a variety of Python libraries. Our goal is to answer the following questions about the usability and security of Pyronia:
§7.2 How difficult is it to write function-level access policies for Pyronia?
§7.3 What reported Python library vulnerability classes can Pyronia mitigate?
§7.4 What effect do dynamic language features of Python have on Pyronia’s protections?

7.1 Case Studies

We evaluate three open-source Python IoT applications that represent the main categories of applications we studied: visual, audio, and environmental sensing. Each of these applications communicates with a cloud service for data processing or storage, which required that we register an account to obtain authentication credentials. The imported libraries in our study implement a broad range of common IoT functionalities (see Table 5). Specifically, our goal is to study how Pyronia operates for common IoT authentication mechanisms, data processing techniques, and communication protocols.

Through manual inspection of the source code of each case study, we found four distinct direct third-party imports, and three standard libraries. While small in number, these direct imports are among the top 50 third-party imports in our analysis in §3.1.

App | Imports | Highlights
twitterPhoto | tweepy | integrated OAuth
alexa | json | data marshalling
alexa | memcache | raw sockets
alexa | re | regex parsing
alexa | requests | HTTP API
plant_watering | paho-mqtt | MQTT, binary exec
plant_watering | ssl | crypto, native deps
Table 5: Summary of case study applications.

twitterPhoto takes an image from a connected camera every 15 minutes, and sends the picture along with a short message to a specified Twitter account. Before sending the tweet, the application authenticates itself to Twitter via OAuth. The tweepy library is used both to authenticate the app and upload the message.

alexa provides an open-source Python implementation of an Amazon Echo smart speaker. This application records audio via a microphone while a button is pressed, and sends the recorded audio (along with authentication credentials) to the Alexa Voice Service (AVS) [7] for processing. The AVS sends an audio response, if the recorded data is one of the commands recognized by the service, for the alexa application to play back. Otherwise, the AVS responds with an empty ACK message. The python-memcache library is used to cache the app’s AVS access token, and the requests library for communicating with the AVS via HTTP. This app also uses the json library to format all messages exchanged with AVS, and the re library to parse out any audio file contained in the AVS response. To facilitate our security and performance tests, we removed the button press for audio recording, and instead open a pre-recorded audio file.

plant_watering records moisture sensor readings once a minute, and sends them to the Amazon AWS IoT [5] service via MQTT [2], a widely used IoT communication protocol. MQTT also handles client authentication with the AWS IoT service via TLS. We replaced sensor readings with a randomly generated value to facilitate testing, and replaced the original MQTT library with one that supports single-threaded network communications.

7.2 Specifying Function-Level Policies

To understand whether Pyronia’s policy specification places an undue burden on developers, we analyzed the policy specification process for our three case studies.

As we describe in §6, one challenge to specifying comprehensive rules for a MAC system that enforces default-deny semantics is reducing the number of false negatives. In running our case studies, we found that the vast majority of false negatives would arise primarily due to Python’s internal library loading and DNS resolution processes.

To examine this challenge, we ran each case study application in Pyronia with AppArmor in complain mode. We identified a total of 42 common files that need to be read-accessible, plus 4 network protocols, for the Python interpreter to run unimpeded, and for the applications to connect to remote network destinations.

In addition, each application required explicit access to its parent directory as well as all imported Python modules contained within, which corresponds to up to 6 additional access rules for the alexa application. Since we cannot expect developers to manually specify around 50 access rules needed by their application by default, we developed a policy generation tool that creates an access policy template pre-populated with rules for the 46 common files and network protocols. To further ease policy specification, our policy generation tool lists all files in the application’s parent directory and adds rules for all identified files. Developers may then manually inspect the template and modify any rules that are function-specific.

The majority of the manually added access rules for all three applications are multiple network destination rules for the same library function. Pyronia does not currently support domain name-based access rules as this feature requires that all domain names be resolved a priori for Pyronia’s IP-address based network destination verification. Nonetheless, our case studies require no more than 11 function-level network destination rules in the plant_watering case.

7.3 Vulnerability Analysis

To understand how effective Pyronia’s protections are against security vulnerabilities and exploits, we study Pyronia’s ability to mitigate specific instances of reported Python library vulnerabilities (recall §3.2). We emphasize that all of our analyzed vulnerabilities could affect any IoT use case, so our choice for testing a particular vulnerability in a specific application does not reflect prevalence of a vulnerability class in that IoT use case.

Since most of the reported vulnerabilities do not affect the libraries in our case studies, we replicate all analyzed vulnerabilities in a specially crafted adversarial library targeting our case studies. We then call individual functions in our library from our case study applications.

We place the 9 reported attack classes into three broad categories: (1) successfully mitigated vulnerabilities, (2) case-dependent for vulnerabilities that Pyronia may mitigate in some instances, and (3) beyond scope for attacks that fall outside Pyronia’s threat model.

Direct data leaks. We analyze three distinct instances of OS resource-based data leaks, in which code with insufficient privileges attempts to gain access to a sensitive OS resource. To this end, we craft two adversarial functions, which upload an SSH private key instead of the authorized photo file using the legitimate tweepy library call, and which upload the authorized photo to an unauthorized remote server, respectively. In addition, we also replicate the data leak bug reported in CVE-2019-9948 [1], in which the Python urllib HTTP library exposes an unsafe API that allows arbitrary local file opens.

We test these vulnerabilities using the twitterPhoto app, and find that Pyronia can successfully mitigate these problems. However, we classify direct data leak mitigation as case-dependent because several reported data leak vulnerabilities arise due to in-memory data exposures. In contrast, Pyronia currently only ensures the protection of the runtime call stack memory, but does not isolate sensitive application-level in-memory data objects, or protect against control-flow attacks.

Symlink attacks. A small number of reported vulnerabilities in Python libraries comprises symlink attacks, in which an adversary attempts to access an unauthorized file via a specially crafted symbolic link. We analyze this attack by crafting an adversarial function that attempts to open plant_watering’s private key through a symlink in the /tmp directory. Since our prototype follows all accessed symbolic links to their targets, Pyronia detects the attempt to access an unauthorized file, successfully mitigating this type of attack.
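The idea of resolving links before checking the policy can be sketched in Python 3 with `os.path.realpath`. This is a userspace analogy on a POSIX system, not the kernel's path resolution; `is_path_allowed`, `demo`, and the file names are hypothetical.

```python
import os
import tempfile


def is_path_allowed(requested_path, allowed_paths):
    """Resolve symlinks first, then compare against the whitelist."""
    real = os.path.realpath(requested_path)
    return real in {os.path.realpath(p) for p in allowed_paths}


def demo():
    """Build a throwaway directory with an authorized file, a secret, and a link."""
    with tempfile.TemporaryDirectory() as tmp:
        photo = os.path.join(tmp, "photo.jpg")
        secret = os.path.join(tmp, "private.key")
        link = os.path.join(tmp, "innocent.jpg")
        for path in (photo, secret):
            with open(path, "w") as handle:
                handle.write("x")
        os.symlink(secret, link)  # harmless-looking name, sensitive target
        allowed = [photo]
        return is_path_allowed(photo, allowed), is_path_allowed(link, allowed)
```

A check on the link's own name would see nothing suspicious about `innocent.jpg`; comparing the resolved target is what exposes the access to the private key.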

Arbitrary code execution. A number of reported Python library vulnerabilities pertain to shell injection made possible due to unsanitized input. To examine this attack class, we crafted an adversarial library function that attempts to directly exec a shell command as part of the plant_watering moisture sensor reading (i.e., random number generator) call.

Pyronia successfully mitigates the exec as the implicit file open is blocked. Thus, we believe that our analysis demonstrates that Pyronia could be effective in mitigating all instances of such shell injection attacks as they all ultimately require access to an unauthorized executable binary file. On the other hand, Pyronia does not mitigate buffer overflow-based arbitrary code execution attacks. Further research is necessary to determine if Pyronia’s memory domains could mitigate these attacks.

Beyond scope vulnerability classes. A large portion of reported Python library vulnerabilities are beyond the scope of Pyronia’s protections. The majority of MITM vulnerabilities in our CVE reports analysis stem from a failure to verify TLS certificates. The weak crypto vulnerabilities in our survey primarily arise because a known-weak algorithm or random number generator was used, or because input was not properly validated. Similarly, Pyronia cannot prevent replay or data spoofing attacks as these stem from improper input (i.e., nonce or filename) validation as well. Most of the authentication bypass bugs occur in larger Python-based frameworks that fail to properly implement authentication procedures.

Interestingly, the majority of the reported DoS attacks in Python libraries stem from improper input handling or memory management that can cause the application to crash. While Pyronia does not verify the correctness of application or library code, we see an avenue for using Pyronia to prevent network-based DoS attacks.

7.4 Dynamic Language Features

Prior proposals have recognized the potential security threat posed by dynamic language features such as reflection and native code execution [52, 59]. As we describe in §3.2, Python’s dynamic features enabled us to replace function pointers (aka monkey patching), leak a sensitive file at import time, and modify contents of Python stack frames including the value of function arguments. To understand how these dynamic language features affect Pyronia’s protections, we analyzed these three cases.

Much as in the direct data leak attack analysis, we found that Pyronia readily prevented unauthorized file accesses at import time. However, we met several challenges to Pyronia’s ability to prevent all forms of stack frame tampering and monkey patching. We found that Pyronia’s stack memory domain prevents native code from directly accessing arbitrary stack frame memory. Yet, because Python stores the local variables for each stack frame in a separate dictionary data structure, pointed to by the stack frame, our implemented stack frame isolation is insufficient to prevent tampering with function arguments by native code or monkey patching. As part of ongoing research, we are exploring more robust countermeasures to these dynamic language features.
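For readers unfamiliar with monkey patching, the following benign Python 3 sketch shows the language feature in question. The `sensor` module, its function, and `patch_on_import` are hypothetical stand-ins and do not reproduce the adversarial library used in the analysis.

```python
import types

# Stand-in for a trusted library module (hypothetical).
sensor = types.ModuleType("sensor")


def read_moisture():
    return 42


sensor.read_moisture = read_moisture


def app_iteration():
    # The application looks up the function by name at call time.
    return sensor.read_moisture()


def patch_on_import():
    """Models what another library could do as a side effect of being imported."""
    original = sensor.read_moisture

    def patched():
        # Callers still believe they are invoking sensor.read_moisture.
        return original() + 1000

    sensor.read_moisture = patched
```

After `patch_on_import()` runs, `app_iteration()` returns the altered value even though the application code is unchanged. The rebinding only touches ordinary heap objects (a module attribute), not stack frame memory.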

8 Evaluation

To evaluate the performance of Pyronia, we ran our three case studies (§7.1) in vanilla Python and Pyronia Python, measuring the execution time and memory overheads. We also took microbenchmarks of the main Pyronia operations, as well as common system calls used in IoT applications to analyze the impact of Pyronia’s call stack inspection-based access control.

Our testing system is an Ubuntu 18.04 LTS virtual machine running our instrumented Pyronia Linux Kernel on a single Intel Core i7-3770 CPU, with 1.95 GB of RAM. Though not a dedicated IoT platform, our test VM’s configuration is comparable to recent single-board computers targeting IoT, such as the Raspberry Pi 4 [46] or the NVIDIA Jetson Nano [43].

To ensure that the results of our evaluation are consistent, we make minor modifications to our case study applications replacing their real-time data collection (e.g., reading an image from a camera) with a static data source (e.g., an image file), and run the applications for a finite number of iterations. We emphasize that none of these modifications affected policy specification, or were needed to add support for Pyronia’s security mechanisms.

8.1 Execution Time Overhead

To analyze the impact of Pyronia on application execution time, we measured 25 runs of the end-to-end execution time for a single iteration, as well as the per-iteration execution time over 100 iterations. The measurement for a single iteration of the application represents the worst-case scenario as it includes any overhead due to Pyronia initialization (and teardown), and the kernel’s call stack log is empty. Measuring the per-iteration execution time, on the other hand, gives an estimate of the long-term operating time of the application, and the overhead due to Pyronia’s run-time security checks, i.e., call stack inspection and stack domain-related operations.

While Pyronia’s mean end-to-end execution time overhead of 2-5x is significant, the mean long-term overhead per iteration is reduced to 2-3x. Nonetheless, in absolute terms, the worst-case execution time for a single app iteration under Pyronia is 1.5 seconds for the twitterPhoto app, which we expect would remain largely imperceptible to end users in real-world deployments.

Figure 2: Mean per-iteration execution time in seconds for each application with and without Pyronia enabled.
Figure 3: Mean execution time for open, fopen and socket connect system calls across all tested applications.

Figure 2 plots the mean per-iteration execution time over 5 runs of 100 iterations of our tested applications. Despite the overall execution time overhead of Pyronia, we observe that stack logging lowers the long-term overhead of the plant_watering and twitterPhoto apps, as the execution time mostly levels off after about 5 iterations.

Nonetheless, stack logging seems to play little role in reducing the execution time overhead for the alexa app. For an application that requires active user involvement, Pyronia’s long-term overhead of 2x for the alexa app is likely unacceptable in a real-world deployment, even with an absolute per-iteration execution time of under one-eighth of a second (113.3 ms).

Pyronia operation microbenchmarks. Measurements of nine key Pyronia operations show that stack domain dynamic permissions adjustments greatly dominate the overall Pyronia overhead, with millions of domain access grant/revoke calls in a single iteration in all applications. By comparison, the median number of all stack-related operations, i.e., stack collection and hashing, is only in the teens. Thus, we attribute the main source of Pyronia’s run-time overheads to stack domain access grant calls recorded in our experiments.

Access control overhead. To further characterize the performance costs due to Pyronia’s access control checks in the kernel, we ran microbenchmarks of the libc open(), fopen() and connect() (and their 64-bit variants), for which we have implemented our stack logging optimization. Figure 3 shows the mean execution time over 25 runs of a single iteration of our tested apps. Our results show that Pyronia’s system call interposition imposes at most a 2x overhead for the open() system call.

Summary. Pyronia’s execution time overhead is not trivial, despite our performance optimizations. While some additional time is spent during each system call, the main slowdown occurs due to dynamic memory domain page access adjustments. Nonetheless, because the majority of IoT applications run on devices only passively collecting and transmitting data, we expect these overheads would go largely unnoticed by end users. We plan to investigate further performance optimizations of memory domain access adjustments, especially for interactive IoT applications, as part of future work.

8.2 Memory Overhead

Pyronia imposes memory overhead due to the creation and management of the stack memory domain. To evaluate the impact of this domain on userspace memory consumption, we first measure the userspace per-domain page metadata allocations, i.e., the memory required for the Pyronia runtime to maintain each domain page and the associated memory management data structures for the stack domain. (We do not evaluate the actual data allocation overhead per domain page as Pyronia does not change the amount of data the runtime allocates, only where in the runtime’s address space this data is placed.)

Our analysis shows that the mean per-domain page metadata memory usage for all tested applications is between 0.31 and 0.38 KB; the fact that the page metadata allocations vary this little across all tested applications demonstrates that the majority of domain pages contain a similar number of allocated blocks (i.e., runtime stack frames), regardless of the total number of allocated domain pages. Table 6 further shows that the median memory usage of the whole Pyronia subsystem in the Python runtime remains under 200 KB, even for applications with over 100 stack domain pages. These results are a strong indication that Pyronia’s domain memory consumption scales linearly with the number of allocated domain pages.

App | # dom pages | Pyronia total
twitterPhoto | 176 | 151.2 KB
alexa | 141 | 147.9 KB
plant_watering | 54 | 68.4 KB
Table 6: Mean Pyronia domain metadata memory usage.

App | Peak usage (in MB) | Overhead
twitterPhoto | 40.9 | 12.9%
alexa | 33.9 | 33.0%
plant_watering | 26.4 | 70.0%
Table 7: Peak memory overhead under Pyronia.

Furthermore, Pyronia’s memory domains have a small impact on the overall memory consumption of our tested applications. Table 7 shows the median peak memory usage and overhead over five 100-iteration runs. For the twitterPhoto application with a peak memory usage of about 40 MB, Pyronia’s memory overhead is only 12.9%, even with the largest number of stack domain pages.

Summary. Pyronia incurs low memory overhead, even for IoT applications that allocate over 100 domain pages. For instance, domain metadata only consumes a total of 151 KB for the twitterPhoto application with 176 allocated domain pages. For applications with a greater number of domain pages, our results indicate that the metadata memory overhead would likely grow linearly. While the increase to application-wide memory usage is the highest for the plant_watering application at 70%, a peak memory consumption of under 30 MB is still rather modest. Therefore, Pyronia’s memory overhead would not place an excessive burden on IoT devices with more constrained resources than our testing system.

9 Discussion

Multi-threading. While Pyronia currently targets single-threaded IoT applications, we discovered in §3.1 as well as during our experiments that a small number of IoT applications and libraries (about 10%) spawn pthreads. This programming pattern introduces one key security challenge: since threads execute independently of the main thread, a child thread’s call stack does not include the “parent” call stack that spawned it, so the Access Control module lacks the full provenance of the thread’s system calls. In other words, a vulnerability could still cause a confused deputy attack (violating P2).

To address this issue, upon pthread_create() or thread_start() calls, Pyronia could automatically save the state of the “parent” call stack, so as to provide the Access Control module with the full provenance when the child thread makes a system call. Nonetheless, accurately mapping “parent” stacks to child stacks, especially in the scenario of nested multi-threading, would be an additional design challenge.
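One way to picture this proposal is a Python 3 thread subclass that snapshots the creator's call stack and prepends it to the child's own stack. This is a speculative sketch of the idea rather than part of the Pyronia prototype: `ProvenanceThread`, `current_stack`, and the example functions are hypothetical, and the CPython-specific `sys._getframe` stands in for the runtime's stack collection.

```python
import sys
import threading


def current_stack(skip=1):
    """Function names on the current thread's stack, outermost first."""
    names = []
    frame = sys._getframe(skip)
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    names.reverse()
    return names


class ProvenanceThread(threading.Thread):
    """Thread that remembers the call stack of the code that created it."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        parent = threading.current_thread()
        # Nested threads inherit the provenance recorded for their parent thread.
        inherited = getattr(parent, "parent_stack", [])
        self.parent_stack = inherited + current_stack(skip=2)

    def full_provenance(self):
        """Saved parent stack plus the caller's stack; call from inside the thread."""
        return self.parent_stack + current_stack(skip=2)


RESULT = {}


def worker():
    RESULT["prov"] = threading.current_thread().full_provenance()


def library_entry():
    thread = ProvenanceThread(target=worker)
    thread.start()
    thread.join()
```

After `library_entry()` returns, `RESULT["prov"]` contains `library_entry` ahead of `worker`, so a policy check made on behalf of the child thread could also consider the function that spawned it. The nested case is handled here by simple concatenation, which sidesteps rather than solves the mapping problem noted above.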

Improving policy specification. As we discuss in §7.2, Pyronia aims to reduce the burden of defining fine-grained access policies, and lower false negatives. However, due to our reliance on AppArmor, Pyronia currently expects path-based access rules, which are often difficult to determine for resources such as sensors.

Designing a more developer-friendly and rigorous policy specification model is beyond the scope of Pyronia. One interesting approach may be to support simple mobile-style resource access capabilities (e.g., READ_CAM), which Pyronia could then automatically map to the corresponding low-level system resources.

More short-term improvements include adding support for domain-based network whitelisting, and maintaining a list of the most critical default rules within the kernel, allowing developers to remain agnostic to the runtime defaults. The number of required rules could be further reduced by adding support for rule grouping. That is, for resources accessible by multiple functions with the same privileges, Pyronia could allow developers to express these policies as a single rule.

While we believe that the risks of completely automated policy generation (e.g., as in [44, 11]) outweigh the benefits, we see an opportunity for the library developer community to ease the policy specification process further. For instance, library developers could contribute resource “manifests”, i.e., a list of required files and network destinations, and package these manifests along with their source code or binaries. With support from Pyronia, application developers could then automatically load these manifests as part of their application-specific access policy, allowing application developers to focus on their high-level policy.

10 Conclusion

We have presented Pyronia, an intra-process access control system for IoT device applications written in high-level languages. Pyronia enforces function-granular MAC of third-party code via a three-pronged approach: system call interposition, stack inspection, and memory domains. Unlike prior approaches, Pyronia runs unmodified applications, and does not require unintuitive policy specification. We implement a Pyronia kernel and Pyronia Python runtime. Our evaluation of three open-source Python IoT applications demonstrates that Pyronia mitigates OS resource-based data leak vulnerabilities, and shows that Pyronia’s performance overheads are acceptable for the most common types of IoT applications.

References

  • [1] CVE-2019-9948. Retrieved Aug. 2, 2019, from https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-9948.
  • [2] MQTT.org. Retrieved Feb. 9, 2019, from https://mqtt.org.
  • [3] Y. Acar, M. Backes, S. Fahl, D. Kim, M. L. Mazurek, and C. Stransky. You Get Where You’re Looking for: The Impact of Information Sources on Code Security. In Proc. IEEE Symposium on Security and Privacy, May 2016.
  • [4] Y. Acar, S. Fahl, and M. L. Mazurek. You are Not Your Developer, Either: A Research Agenda for Usable Security and Privacy Research Beyond End Users. In Proc. IEEE Cybersecurity Development, pages 3–8, Nov. 2016.
  • [5] Amazon Web Services, Inc. AWS IoT. Retrieved Feb. 9, 2019, from https://aws.amazon.com/iot/.
  • [6] Amazon Web Services, Inc. Security and identity for AWS IoT. AWS IoT Developer Guide.
  • [7] Amazon.com, Inc. Alexa Voice Service. Retrieved Feb. 9, 2019, from https://developer.amazon.com/alexa-voice-service.
  • [8] AppArmor maintainers. AppArmor security project wiki. Accessed Dec. 30, 2018, https://gitlab.com/apparmor/apparmor/wikis/home/.
  • [9] R. Balebako and L. Cranor. Improving App Privacy: Nudging App Developers to Protect User Privacy. In Proc. IEEE Symposium on Security and Privacy, July 2014.
  • [10] A. Bittau, P. Marchenko, M. Handley, and B. Karp. Wedge: Splitting Applications into Reduced-Privilege Compartments. In NSDI, 2008.
  • [11] A. Blankstein and M. J. Freedman. Automating Isolation and Least Privilege in Web Services. In IEEE Symposium on Security and Privacy, 2014.
  • [12] J. Bort. For The First Time, Hackers Have Used A Refrigerator To Attack Businesses. Business Insider, Jan. 2014.
  • [13] Y. Chen, S. Reymondjohnson, Z. Sun, and L. Lu. Shreds: Fine-Grained Execution Units with Private Memory. In IEEE Symposium on Security and Privacy (S&P), 2016.
  • [14] W. Cheng, D. R. K. Ports, D. Schultz, V. Popic, A. Blankstein, J. Cowling, D. Curtis, L. Shrira, and B. Liskov. Abstractions for usable information flow control in Aeolus. In ATC, 2012.
  • [15] Y. Cheng, Z. Zhou, M. Yu, X. Ding, and R. Deng. ROPecker: A Generic and Practical Approach For Defending Against ROP Attacks. In NDSS, 2014.
  • [16] L. Davi, A.-R. Sadeghi, and M. Winandy. ROPdefender: A Detection Tool to Defend Against Return-oriented Programming Attacks. In Proc. ACM Symposium on Information, Computer and Communications Security, 2011.
  • [17] M. Dietz, S. Shekhar, Y. Pisetsky, A. Shu, and D. S. Wallach. Quire: Lightweight Provenance for Smart Phone Operating Systems. In USENIX Security Symposium, 2011.
  • [18] Eclipse Foundation. IoT Developer Surveys. https://iot.eclipse.org/iot-developer-surveys/, 2019. Accessed 3 Nov 2019.
  • [19] P. Efstathopoulos, M. Krohn, S. VanDeBogart, C. Frey, D. Ziegler, E. Kohler, D. Mazières, F. Kaashoek, and R. Morris. Labels and event processes in the Asbestos operating system. In SOSP, 2005.
  • [20] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and E. Chin. Permission re-delegation: Attacks and defenses. In USENIX Security Symposium, 2011.
  • [21] E. Fernandes, J. Paupore, A. Rahmati, D. Simionato, M. Conti, and A. Prakash. FlowFence: Practical Data Protection for Emerging IoT Application Frameworks. In USENIX Security Symposium. USENIX Association, 2016.
  • [22] S. Gallagher. How one rent-a-botnet army of cameras, DVRs caused Internet chaos. Ars Technica, Oct. 2016.
  • [23] D. Genkin, L. Pachmanov, I. Pipman, E. Tromer, and Y. Yarom. ECDSA Key Extraction from Mobile Devices via Nonintrusive Physical Side Channels. In Proc. ACM SIGSAC Conference on Computer and Communications Security, 2016.
  • [24] D. B. Giffin, A. Levy, D. Stefan, A. Russo, D. Terei, D. Mazières, and J. C. Mitchell. Hails: Protecting data privacy in untrusted web applications. In OSDI, 2012.
  • [25] I. Goldberg, D. Wagner, R. Thomas, and E. Brewer. A secure environment for untrusted helper applications (confining the wily hacker). In USENIX Security Symposium, 1996.
  • [26] D. Goodin. 9 baby monitors wide open to hacks that expose users’ most private moments. Ars Technica, Sep. 2015.
  • [27] A. Greenberg. Hackers Remotely Kill a Jeep on the Highway – With Me in It. Wired, Jul. 2015.
  • [28] N. Hardy. The confused deputy: (or why capabilities might have been invented). ACM Operating Systems Review, 22(4), 1988.
  • [29] W. He, M. Golla, R. Padhi, J. Ofek, M. Dürmuth, E. Fernandes, and B. Ur. Rethinking access control and authentication for the home Internet of Things (IoT). In Proc. USENIX Security Symposium, 2018.
  • [30] T. C.-H. Hsu, K. Hoffman, P. Eugster, and M. Payer. Enforcing Least Privilege Memory Views for Multithreaded Applications. In CCS, 2016.
  • [31] H. Hu, C. Qian, C. Yagemann, S. P. H. Chung, W. R. Harris, T. Kim, and W. Lee. Enforcing Unique Code Target Property for Control-Flow Integrity. In Proc. CCS, 2018.
  • [32] Intel Corporation. Intel IoT platform reference architecture white paper.
  • [33] Y. J. Jia, Q. A. Chen, S. Wang, A. Rahmati, E. Fernandes, Z. M. Mao, and A. Prakash. ContexIoT: Towards providing contextual integrity to appified IoT platforms. In NDSS. Internet Society, 2017.
  • [34] M. B. Jones. Interposition agents: transparently interposing user code at the system interface. In SOSP, 1993.
  • [35] P. Kocher, J. Horn, A. Fogh, D. Genkin, D. Gruss, W. Haas, M. Hamburg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and Y. Yarom. Spectre attacks: Exploiting speculative execution. In 40th IEEE Symposium on Security and Privacy (S&P’19), 2019.
  • [36] M. Krohn, A. Yip, M. Brodsky, N. Cliffer, M. F. Kaashoek, E. Kohler, and R. Morris. Information flow control for standard OS abstractions. In SOSP, 2007.
  • [37] S. Lee, J. Choi, J. Kim, B. Cho, S. Lee, H. Kim, and J. Kim. FACT: Functionality-centric Access Control System for IoT Programming Frameworks. In Proc. ACM Symposium on Access Control Models and Technologies, 2017.
  • [38] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, A. Fogh, J. Horn, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg. Meltdown: Reading kernel memory from user space. In 27th USENIX Security Symposium (USENIX Security 18), 2018.
  • [39] J. Litton, A. Vahldiek-Oberwagner, E. Elnikety, D. Garg, B. Bhattacharjee, and P. Druschel. Light-Weight Contexts: An OS Abstraction for Safety and Performance. In OSDI, 2016.
  • [40] Microsoft Windows Dev Center. Mandatory Integrity Control. Accessed Feb. 11, 2019, https://docs.microsoft.com/en-us/windows/desktop/SecAuthZ/mandatory-integrity-control.
  • [41] A. C. Myers and B. Liskov. Protecting privacy using the decentralized label model. ACM Transactions on Software Engineering and Methodology, 9(4), 2000.
  • [42] Nest Labs. Keeping data safe at Nest.
  • [43] Nvidia Corporation. Jetson Nano. Retrieved Jul. 24, 2019, from https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-nano/.
  • [44] N. Provos. Improving host security with system call policies. In USENIX Security Symposium, 2003.
  • [45] A. Rahmati, E. Fernandes, K. Eykholt, and A. Prakash. Tyche: A Risk-Based Permission Model for Smart Homes. In IEEE Cybersecurity Development (SecDev), 2018.
  • [46] Raspberry Pi Foundation. Raspberry Pi 4 Tech Specs. Retrieved Jul. 24, 2019, from https://www.raspberrypi.org/products/raspberry-pi-4-model-b/specifications/.
  • [47] R. Roemer, E. Buchanan, H. Shacham, and S. Savage. Return-oriented programming: Systems, languages, and applications. ACM Trans. Inf. Syst. Secur., 15(1), Mar. 2012.
  • [48] J. H. Saltzer and M. D. Schroeder. The protection of information in computer systems. Proceedings of the IEEE, 63(9), 1975.
  • [49] SAMSUNG. Device Protection and Trusted Code Execution.
  • [50] R. Schuster, V. Shmatikov, and E. Tromer. Situational access control in the internet of things. In Proc. ACM SIGSAC Conference on Computer and Communications Security, 2018.
  • [51] SELinux maintainers. SELinux Project wiki. Accessed Feb. 11, 2019, https://selinuxproject.org/page/Main_Page.
  • [52] J. Seo, D. Kim, D. Cho, T. Kim, I. Shin, and X. Jiang. FlexDroid: Enforcing In-App Privilege Separation in Android. In NDSS, Feb. 2016.
  • [53] S. Shankar. Monkey Patching in Python: Explained with Examples. Retrieved 11 Nov, 2019, from https://thecodebits.com/monkey-patching-in-python-explained-with-examples/.
  • [54] S. Shekhar, M. Dietz, and D. S. Wallach. AdSplit: Separating smartphone advertising from applications. In USENIX Security Symposium. USENIX Association, 2012.
  • [55] D. Stefan, E. Z. Yang, P. Marchenko, A. Russo, D. Herman, B. Karp, and D. Mazières. Protecting users by confining JavaScript with COWL. In OSDI, 2014.
  • [56] Symantec Security Response. IoT devices being increasingly used for DDoS attacks. Symantec Official Blog, Sep. 2016.
  • [57] The MITRE Corporation. Common Vulnerabilities and Exposures (CVE) List. https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=python. Accessed 10 Aug 2019.
  • [58] Y. Tian, N. Zhang, Y.-H. Lin, X. Wang, B. Ur, X. Guo, and P. Tague. SmartAuth: User-centered Authorization for the Internet of Things. In Proc. USENIX Security Symposium, 2017.
  • [59] N. Vasilakis, B. Karel, N. Roessler, N. Dautenhahn, A. Dehon, and J. Smith. BreakApp: Automated, Flexible Application Compartmentalization. In NDSS, 2018.
  • [60] D. S. Wallach and E. W. Felten. Understanding Java Stack Inspection. In IEEE S&P, 1998.
  • [61] J. Wang, X. Xiong, and P. Liu. Between Mutual Trust and Mutual Distrust: Practical Fine-grained Privilege Separation in Multithreaded Applications. In USENIX ATC, 2015.
  • [62] Y. Wang, S. Hariharan, C. Zhao, J. Liu, and W. Du. Compac: Enforce component-level access control in Android. In Conference on Data and Application Security and Privacy (CODASPY), 2014.
  • [63] R. Wartell, V. Mohan, K. W. Hamlen, and Z. Lin. Binary Stirring: Self-randomizing Instruction Addresses of Legacy x86 Binary Code. In Proc. CCS, 2012.
  • [64] Y. Wu, S. Sathyanarayan, R. H. Yap, and Z. Liang. Codejail: Application-transparent Isolation of Libraries with Tight Program Interactions. In ESORICS, 2012.
  • [65] A. Yip, X. Wang, N. Zeldovich, and M. F. Kaashoek. Improving application security with data flow assertions. In SOSP, 2009.
  • [66] A. R. Yumerefendi, B. Mickle, and L. P. Cox. TightLip: Keeping applications from spilling the beans. In NSDI, 2007.
  • [67] N. Zeldovich, S. Boyd-Wickizer, E. Kohler, and D. Mazières. Making information flow explicit in HiStar. In OSDI, 2006.
  • [68] K. Zetter. It’s Insanely Easy to Hack Hospital Equipment. Wired, Apr. 2014.

Appendix A CVE Reports for Python Libraries

Our analysis of reported Python library vulnerabilities in §3.2 covers Common Vulnerabilities and Exposures (CVE) reports made between January 2012 and March 2019. We found 123 reports for a total of 78 different Python libraries and frameworks in this seven-year time frame, and we identified nine main attack classes. Table 8 shows the attack class and affected Python library or framework for each analyzed CVE report.

CVE Report Vulnerability Class Affected Library/Framework
CVE-2019-9948 Direct data leak urllib*
CVE-2019-9947 Web attack urllib*
CVE-2019-9740 Web attack urllib*
CVE-2019-7537 Arbitrary code execution donfig
CVE-2019-6690 Direct data leak python-gnupg
CVE-2019-5729 MITM splunk-sdk
CVE-2019-3575 Arbitrary code execution sqla-yaml-fixtures
CVE-2019-3558 DoS Facebook Thrift
CVE-2019-2435 Direct data leak Oracle MySQL Connectors
CVE-2019-13611 Web attack python-engineio
CVE-2019-12761 Arbitrary code execution pyXDG
CVE-2019-11324 MITM urllib*
CVE-2019-11236 Web attack urllib*
CVE-2018-5773 Web attack python-markdown2
CVE-2018-20325 Arbitrary code execution definitions
CVE-2018-18074 Direct data leak requests
CVE-2018-17175 Direct data leak marshmallow
CVE-2018-10903 Direct data leak python-cryptography
CVE-2018-1000808 DoS pyopenssl
CVE-2018-1000807 DoS pyopenssl
CVE-2017-9807 Arbitrary code execution OpenWebif
CVE-2017-7235 Arbitrary code execution cloudflare-scrape
CVE-2017-3590 Authentication bypass MySQL
CVE-2017-2809 Arbitrary code execution ansible-vault
CVE-2017-2810 Arbitrary code execution tablib
CVE-2017-2592 Direct data leak python-oslo-middleware
CVE-2017-16764 Arbitrary code execution Django
CVE-2017-16763 Arbitrary code execution Confire
CVE-2017-16618 Arbitrary code execution OwlMixin
CVE-2017-16616 Arbitrary code execution PyAnyAPI
CVE-2017-16615 Arbitrary code execution MLAlchemy
CVE-2017-1002150 Web attack python-fedora
CVE-2017-1000433 Authentication bypass pysaml
CVE-2017-1000246 Weak crypto pysaml
CVE-2017-0906 Web attack recurly
CVE-2016-9910 Web attack html5lib
CVE-2016-9909 Web attack html5lib
CVE-2016-9015 MITM urllib*
CVE-2016-7036 Weak crypto python-jose
CVE-2016-5851 Web attack python-docx
CVE-2016-5699 Web attack urllib*
CVE-2016-5598 Direct data leak MySQL
CVE-2016-4972 Arbitrary code execution python-muranoclient
CVE-2016-2533 DoS PIL
CVE-2016-2166 Weak crypto Apache QPid Proton
CVE-2016-1494 Weak crypto python-rsa
CVE-2016-0772 Weak crypto smtplib
CVE-2015-7546 Authentication bypass python-keystoneclient
CVE-2015-5306 Arbitrary code execution ironic-inspector
CVE-2015-5242 Arbitrary code execution swiftonfile
CVE-2015-5159 DoS python-kdcproxy
CVE-2015-3220 DoS tlslite
CVE-2015-3206 DoS python-kerberos
CVE-2015-2674 MITM restkit
CVE-2015-2316 DoS Django
CVE-2015-1852 MITM python-keystoneclient
CVE-2015-1326 Arbitrary code execution python-dbusmock
CVE-2014-9365 MITM urllib*
CVE-2014-8165 Arbitrary code execution powerpc-utils-python
CVE-2014-7144 MITM python-keystoneclient
CVE-2014-4616 Direct data leak simplejson
CVE-2014-3995 Web attack Django
CVE-2014-3994 Web attack Django
CVE-2014-3598 DoS PIL
CVE-2014-3589 DoS PIL
CVE-2014-3539 Arbitrary code execution rope
CVE-2014-3146 Web attack lxml
CVE-2014-3137 Authentication bypass Bottle
CVE-2014-3007 Arbitrary code execution PIL
CVE-2014-1934 Symlink attack eyeD3
CVE-2014-1933 Symlink attack PIL
CVE-2014-1932 Symlink attack PIL
CVE-2014-1929 Arbitrary code execution python-gnupg
CVE-2014-1928 Arbitrary code execution python-gnupg
CVE-2014-1927 Arbitrary code execution python-gnupg
CVE-2014-1839 Symlink attack logilab-common
CVE-2014-1838 Symlink attack logilab-common
CVE-2014-1830 Direct data leak requests
CVE-2014-1829 Direct data leak requests
CVE-2014-1624 Symlink attack python-xdg
CVE-2014-1604 Data spoofing python-rply
CVE-2014-0472 Arbitrary code execution Django
CVE-2014-0105 Authentication bypass python-keystoneclient
CVE-2013-7459 Arbitrary code execution PyCrypto
CVE-2013-7440 MITM ssl
CVE-2013-7323 Arbitrary code execution python-gnupg
CVE-2013-6491 MITM python-qpid
CVE-2013-6444 MITM pyWBEM
CVE-2013-6418 MITM pyWBEM
CVE-2013-6396 MITM python-swiftclient
CVE-2013-4482 Authentication bypass python-paste-script
CVE-2013-4347 Weak crypto python-oauth2
CVE-2013-4346 Auth token replay attack python-oauth2
CVE-2013-4238 MITM ssl
CVE-2013-4111 MITM python-glanceclient
CVE-2013-2191 MITM python-bugzilla
CVE-2013-2132 DoS pymongo
CVE-2013-2131 DoS python-rrdtool
CVE-2013-2104 Auth token replay attack python-keystoneclient
CVE-2013-2013 Direct data leak python-keystoneclient
CVE-2013-1909 MITM Apache QPid Proton
CVE-2013-1665 Web attack xml
CVE-2013-1664 DoS xml
CVE-2013-1445 Weak crypto PyCrypto
CVE-2013-1068 Authentication bypass python-nova, python-cinder
CVE-2012-5825 MITM Tweepy
CVE-2012-5822 MITM zamboni
CVE-2012-5563 Authentication bypass python-keystoneclient
CVE-2012-4571 Weak crypto python-keyring
CVE-2012-4520 Web attack Django
CVE-2012-4406 Arbitrary code execution python-swiftclient
CVE-2012-3533 MITM ovirt-engine-python-sdk
CVE-2012-3458 Weak crypto Beaker
CVE-2012-3444 DoS Django
CVE-2012-3443 DoS Django
CVE-2012-2921 DoS python-feedparser
CVE-2012-2417 Weak crypto PyCrypto
CVE-2012-2374 Web attack tornado
CVE-2012-2146 Weak crypto elixir
CVE-2012-1575 Web attack Cumin
CVE-2012-1502 Arbitrary code execution PyPAM
CVE-2012-1176 DoS PyFriBidi
CVE-2012-0878 Authentication bypass python-paste-script
Table 8: Reported Python library vulnerabilities between Feb 2012 and June 2019.