ferify: A Virtual Machine File Protection System against Zero-Day Attacks

04/19/2020 ∙ by Alexis Peppas, et al. ∙ Naval Postgraduate School, Microsoft

Most existing solutions for protecting VMs assume known attack patterns or signatures and focus on detecting malicious manipulations of system files and kernel level memory structures. In this research we develop a system called ferify, which leverages VM introspection (VMI) to protect user files hosted on a VM against unauthorized access even after an attacker has managed to obtain root privileges on the VM. ferify maintains in the hypervisor domain a shadow file access control list (SACL) that is totally transparent to the VM. It uses the SACL to perform independent access control on all system calls that may operate on the target files. Further, ferify prevents kernel modification, ensures the integrity of process ownership, and supports hypervisor based user authentication. We have developed a ferify prototype for Linux and through a set of controlled experiments we show that the system is able to mitigate a range of zero-day attacks that otherwise may evade signature-based solutions. In addition, we analyze the root cause of the observed high processing overhead from trapping of system calls, and propose a general solution that can potentially cut that overhead by half.

I Introduction

A successful attack on a computer system typically results in the attacker obtaining root privileges on the system. If the computer system is a virtual machine (VM), this means that the attacker has total access to all files hosted on the VM. How to detect and contain this type of rootkit attack remains an important security problem.

While virtualization brings about new security challenges specific to VM operation, it also offers new solution approaches. The research community has long recognized the unique vantage point provided by the hypervisor for VM monitoring and malware detection. In particular, the VM introspection (VMI) capabilities [16, 11] have shown great promise. We observe two main advantages of deploying security solutions on the hypervisor. First, the hypervisor code base is relatively small due to its narrow focus, so software bugs or the presence of malware are relatively easy to catch. Second, being part of the critical path for accessing physical resources, the hypervisor is able to exert independent, process/thread level control over program executions on a guest VM. This control can also be dynamic, e.g., revoking a user’s permission to access certain resources without rebooting the VM.

Paladin [1] is among the first systems that leverage VMI to detect malicious manipulations of system files and/or run-time data structures and to contain such attacks by aborting the offending processes and rolling back suspicious data modifications. However, this system requires a trusted software helper module running in the guest VM [1]. More importantly, Paladin, like most other existing solutions for protecting VMs, focuses on protecting system files and other data against known attack signatures. Thus, these solutions may have limited power against zero-day attacks.

In this paper, we present the design and evaluation of a user centric VM file protection system which we call ferify. As an overarching goal, we seek to protect selected user files hosted on a VM against unauthorized access in spite of a successful attack on the VM. ferify is implemented as a DRAKVUF [11] plugin that is completely independent from the guest VM. It maintains a separate file access control list (ACL) in the hypervisor and performs access control on all system calls that may operate on the target files. Additionally, we constrain kernel modifications and ensure the integrity of process ownership and other critical data in order to mitigate a range of zero-day attacks.

The rest of the paper is organized as follows. We review related work and provide background information in Section II. We present the design and implementation details of ferify in Sections III and IV, respectively. A detailed evaluation of the system’s file protection capabilities and its processing overhead is provided in Section V, followed by a detailed analysis of the overhead along with a solution for reducing it in Section VI. Finally, we discuss possible extensions and its current limitations in Section VII and then offer some concluding remarks in Section VIII.

II Related Work

Kernel Protection File Protection
Solution In-VM Out-VM Detection Prevention Detection Prevention
Virtuoso [6], SIM [21] - - - -
Haven [2], Overshadow [4], InkTag [9], Lares [16], SHype [19] - - - -
Crawford and Peterson [5], ReVirt [7], VMI [8], VMWatcher [10], Macko et al. [12], PoKeR [18], Strider, Ghostbuster [25] - - - -
SecVisor [20], Srinivasan et al. [22], Sentry [23], HUKO [27] - - - -
Nasab [13] - - -
Paladin [1] - -
TABLE I: Overview of existing solutions.

In this section, we first review existing security solutions that are most related to this work and then provide a short description of the DRAKVUF and LibVMI software, upon which ferify has been built.

VM monitoring and protection: Table I presents a summary of seventeen relevant solutions found in the literature. Some are integrated into the VM (i.e., “In-VM”) while others leverage the hypervisor (i.e., “Out-VM”). Most focus on ensuring kernel integrity and protecting kernel level data structures such as the process control block. Only two solutions [1, 13] provide some level of file protection; however, they require known attack signatures.

DRAKVUF / LibVMI: LibVMI [15] is a C library developed by the Sandia Labs to simplify the development of VM introspection solutions for the Xen hypervisor. DRAKVUF [11] is a malware analysis tool built upon LibVMI. It supports in-depth execution tracing of arbitrary binaries (including kernel processes) that is totally transparent to the VM being monitored. Furthermore, it is designed to be extensible by supporting plugins.

III Design

In this section, we first describe the threat model and high level requirements and then present the details of how ferify traps system calls and is able to continue file access control even after the root account of the target VM has been compromised.

III-A Threat Model and Requirements

We have designed ferify by assuming this threat model:

  • The hypervisor is considered secure; how to protect the hypervisor is outside the scope of this research.

  • The protected files are only remotely accessible and protected by a public-key authentication and encryption scheme (i.e., SSH).

  • The private keys of authorized users are secure, while the attacker may have obtained root privileges on the VM through a successful attack.

Additionally, we have set these high level design objectives:

  • The solution must be completely out-VM to avoid subversion from the VM side.

  • The protected VM must remain usable by authorized users. In particular, the extra processing overhead incurred by the solution must be bounded to a tolerable range.

III-B Basic Functionality

The high level design of ferify for Linux and Xen is illustrated in Fig. 1. In a nutshell, the system traps all relevant system calls and uses a shadow access control list (SACL) maintained inside the hypervisor to perform file access control. ferify identifies authorized user accounts by their uid and gid in the SACL; therefore, it includes additional security measures to ensure the integrity of (i) user and group identification, (ii) process ownership, and (iii) the kernel. Two further extensions that enhance security are presented in Section III-C.

[Figure labels: Vulnerable VM; Protection Zone; DRAKVUF/LibVMI; ferify plugin; Attacker; Authorized user; File or kernel operation; Trapped sys-calls; SACL check (Yes / No: STOP); HDD; Protected files; Other files]
Fig. 1: Design overview

Trapping of System Calls:  We use DRAKVUF [11] to trap a specific set of system calls that are relevant to file operations and kernel access, as listed in Table II. A trap in our case is a software breakpoint (opcode 0xCC or the so-called INT 3 instruction for an Intel x86 CPU) injected at the beginning of all trapped system calls. The behavior can be compared to that of a debugger. LibVMI allows for the assignment of a callback function that gets executed when a trap gets caught. ferify provides a callback function that checks the validity of each trapped system call.

open() openat()
name_to_handle_at()* open_by_handle_at()*
rename() renameat()
renameat2() unlink()
unlinkat() truncate()
link() linkat()
symlink() symlinkat()
execve() execveat()
exit() exit_group()
init_module()* finit_module()*
kexec_load()*
TABLE II: Trapped system calls

Specifically, the callback function retrieves the arguments of the system call from the CPU registers according to the 64-bit Linux system call convention and then performs all the necessary checks against the SACL. The format of the SACL is shown in Fig. 2. The “Permission” field is set following the Linux three-digit octal file permission convention (ugo), referring to the permission for the user owning the file, users of the same group, and all other users, respectively. The three bits of each octal digit represent the read, write, and execute permission flags, respectively. Therefore, for the example in Fig. 2, 644 means that the owner of the file can read and write to the file while all other users, including those of the same group, have only read access. 400 means that the owner of the file can read the file, while no other users have access to the file. The “User” and “Group” fields provide the uid and gid of the file owner, respectively, with a special value of 0 referring to the root user or the root user group. Finally, if the file in question does not have an entry in the SACL, ferify deems it noncritical and the check successful.

Full path file or directory name Permission User Group
/home/user/Documents/critical.txt 644 1000 1000
/home/user/Desktop/read-only.pdf 400 1000 1000
/etc/shadow 220 0 0
Fig. 2: Example SACL entries.
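The SACL check described above can be sketched in a few lines. This is only an illustrative Python model, not ferify's C implementation: the dictionary layout, the function name sacl_allows, and the READ/WRITE/EXECUTE constants are assumptions made for the example, and the entries mirror Fig. 2. Note that the root user receives no special bypass, which is the point of keeping the list outside the VM.

```python
# Illustrative model of an SACL lookup; all names here are hypothetical.
READ, WRITE, EXECUTE = 4, 2, 1  # bit values inside one octal permission digit

# full path -> (permission in octal, owner uid, owner gid), as in Fig. 2
SACL = {
    "/home/user/Documents/critical.txt": (0o644, 1000, 1000),
    "/home/user/Desktop/read-only.pdf": (0o400, 1000, 1000),
    "/etc/shadow": (0o220, 0, 0),
}


def sacl_allows(path, uid, gid, wanted):
    """Return True if (uid, gid) may perform the 'wanted' access on path."""
    entry = SACL.get(path)
    if entry is None:
        return True  # no entry: the file is deemed noncritical
    mode, owner_uid, owner_gid = entry
    if uid == owner_uid:
        bits = (mode >> 6) & 7  # "u" digit
    elif gid == owner_gid:
        bits = (mode >> 3) & 7  # "g" digit
    else:
        bits = mode & 7         # "o" digit
    return (bits & wanted) == wanted
```

For instance, under these entries a root process (uid 0, gid 0) falls into the "other" class for critical.txt and is refused write access, even though the guest OS would grant it.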

The system call is allowed to proceed only if it passes the SACL check. When the decision is to deny the system call, ferify replaces the pointer that holds the file-name with a NULL pointer; doing so allows the system call to proceed but eventually fail when it tries to de-reference a NULL pointer.

We have successfully trapped and processed all system calls listed in Table II, although, due to implementation time constraints, we currently simply deny the system calls marked with an asterisk.

User & Group Integrity:  ferify prohibits switching of user accounts through the su command, by denying all users including the root the write permission to the /etc/pam.d/su file. Similarly, it enforces no write to password files /etc/passwd and /etc/shadow by adding deny entries in the SACL. This may raise usability challenges for authorized users. However, we envision that the VM will be periodically taken offline for maintenance and in these offline periods, these account restrictions can be lifted.

Process Ownership Integrity:  For each running user process, ferify keeps track of the uid and gid of its creator. This tracking allows for monitoring of malicious attempts to change this ownership information. Specifically, ferify traps the ret_from_fork kernel symbol and stores in the hypervisor a hashtable of the owner information for all legitimate processes created by authorized users. (It also traps clone() to monitor all new processes.) This hashtable is then used, before the SACL check, to determine whether a trapped system call was indeed made by an authorized user. If there is a legitimate change in the ownership of a process (through sudo), we update the stored information; this update is important in order to retain system usability.
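The ownership tracking can be pictured as a small table keyed by process ID. The sketch below is a hypothetical Python model, not the plugin's code: the class and method names are invented for illustration, and removing an entry when the process exits is an assumption (the text only states that exit() and exit_group() are trapped).

```python
class OwnershipTable:
    """Toy model of the hypervisor-side process ownership hashtable."""

    def __init__(self):
        self._owners = {}  # pid -> (uid, gid) recorded at process creation

    def on_process_created(self, pid, uid, gid):
        # Would be driven by the ret_from_fork / clone() traps.
        self._owners[pid] = (uid, gid)

    def on_legitimate_change(self, pid, uid, gid):
        # E.g. a sanctioned sudo; keeps the stored owner in sync.
        if pid in self._owners:
            self._owners[pid] = (uid, gid)

    def on_process_exit(self, pid):
        # Assumption: entries are dropped when the process exits.
        self._owners.pop(pid, None)

    def is_consistent(self, pid, uid, gid):
        """True only if the caller's claimed owner matches the stored one."""
        return self._owners.get(pid) == (uid, gid)
```

A trapped system call whose process reports uid 0 while the table still holds uid 1000 would fail is_consistent and could be denied before the SACL is consulted.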

Kernel Integrity:  By default, a (malicious) root user can modify kernel structure data and load new kernel modules. To mitigate this attack vector, ferify traps and blocks the init_module() and finit_module() system calls. It also write-protects /etc/modules, and possibly additional files and folders depending on the Linux distribution, to prevent the loading of new kernel modules during boot time. Doing so clearly imposes some usability issues. As discussed earlier, we envision performing legitimate kernel modifications in specific maintenance periods when the VM is taken offline, after properly authenticating the new kernel modules and checking their integrity.

To protect against malicious kernel swapping, ferify blocks the system call kexec_load(), which loads a new kernel for later execution. This introduces a usability limitation, by not allowing automatic kernel updates, some of which are necessary to fix kernel bugs. Since this operation can be performed in a more controlled environment (i.e., during offline periods) at the administrator’s discretion, we expect this to be a reasonable limitation.

III-C Two Extensions

To further enhance security, we extend ferify in two aspects.

2-Step Authentication:  We have integrated into ferify a two-step authentication mechanism leveraging the hypervisor as a user authenticating agent. The authentication is based on a pre-configured shared secret between each authorized user and the hypervisor, and performed as part of the callback function for the open() system call. Specifically, when this option is turned on for an authorized user, even after the user has successfully logged into the VM, ferify still considers the user “unauthenticated” until he/she passes a second form of authentication as follows. The user must prove that he/she possesses the shared secret by presenting a challenge response pair of strings to ferify, the latter of which is the SHA512 hash of the former concatenated with the shared secret, through an open() system call. For example, the user may trigger this authentication through the touch command and encode the challenge response strings in the file-name argument along with an artificial path-name to avoid collision with a real file.

After ferify verifies that the challenge response strings are valid, it changes the user’s status to “authenticated” for a predetermined time-frame, during which the user’s file access permission will be according to the SACL; otherwise, the user remains in the “unauthenticated” status and will not be granted access to any file specified in the SACL.
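A minimal sketch of this challenge-response check follows, using Python's standard hashlib. Several details are assumptions made for illustration and are not specified in the text: the "/tokens/" prefix is taken from the example in Fig. 8, the file name is assumed to be the challenge immediately followed by the 128 hex characters of the SHA-512 digest, and the function and class names are hypothetical.

```python
import hashlib
import hmac
import time

TOKEN_PREFIX = "/tokens/"  # artificial path; assumed from the Fig. 8 example
DIGEST_HEX_LEN = 128       # SHA-512 yields 64 bytes = 128 hex characters


def make_token_path(challenge, secret):
    """User side: response = SHA-512(challenge || shared secret)."""
    response = hashlib.sha512((challenge + secret).encode()).hexdigest()
    return TOKEN_PREFIX + challenge + response


def verify_token_path(path, secret):
    """Hypervisor side: recompute the hash and compare in constant time."""
    if not path.startswith(TOKEN_PREFIX):
        return False
    payload = path[len(TOKEN_PREFIX):]
    if len(payload) <= DIGEST_HEX_LEN:
        return False
    challenge, response = payload[:-DIGEST_HEX_LEN], payload[-DIGEST_HEX_LEN:]
    expected = hashlib.sha512((challenge + secret).encode()).hexdigest()
    return hmac.compare_digest(response.encode(), expected.encode())


class AuthState:
    """Tracks which uids are 'authenticated' and until when."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._expiry = {}  # uid -> time at which authentication lapses

    def authenticate(self, uid, path, secret, now=None):
        now = time.monotonic() if now is None else now
        if verify_token_path(path, secret):
            self._expiry[uid] = now + self.window
            return True
        return False

    def is_authenticated(self, uid, now=None):
        now = time.monotonic() if now is None else now
        return now < self._expiry.get(uid, float("-inf"))
```

In this model the open() callback would call authenticate() whenever the file name starts with the token prefix, and every other SACL check would first consult is_authenticated() for the calling uid.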

Program Execution White-Listing:  We have added an option to the callback functions of the execve() and execveat() system calls for ferify to deny, instead of permit by default, execution of a file that is not listed in the SACL. In other words, when this option is turned on, a file can be executed only if it has a permit entry in the SACL. Effectively, this option makes the SACL a white-list for program execution. Again, this option trades off some usability, but we believe it is useful for certain deployment scenarios.
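The effect of the white-listing option is that the default for unlisted files flips from permit to deny on execution. The following Python fragment is a hedged illustration only; exec_allowed, the whitelist_mode flag, and the SACL tuple layout are hypothetical names chosen for the example.

```python
EXECUTE = 1  # execute bit inside one octal permission digit


def exec_allowed(path, uid, gid, sacl, whitelist_mode):
    """Decide whether an execve()/execveat() on 'path' may proceed."""
    entry = sacl.get(path)
    if entry is None:
        # Default behaviour permits unlisted files; white-listing denies them.
        return not whitelist_mode
    mode, owner_uid, owner_gid = entry
    if uid == owner_uid:
        bits = (mode >> 6) & 7
    elif gid == owner_gid:
        bits = (mode >> 3) & 7
    else:
        bits = mode & 7
    return bool(bits & EXECUTE)
```

With the option on, an exact copy of a permitted program stored under a new name has no SACL entry and is therefore refused.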

IV Implementation

We have created a prototype implementation of ferify for Linux, as a plugin for the v0.5-655884f version of DRAKVUF, which is bundled with the 0.12 release of LibVMI and the 4.8.1 version of Xen. The implementation consists of about 2,400 lines of C code. (We will make all source code publicly available once the double blind requirement for this paper is lifted.) The SACL is implemented as a hashtable to bound the search processing overhead. To assist testing with different SACL sizes, we have also created a script that retrieves information for all files currently residing in a given VM and creates an artificial full SACL with a “permit” entry for each of the files.

We have chosen a computer with an Intel i7-6700 CPU as the test platform, given that DRAKVUF [11] is designed to take advantage of hardware virtualization extensions found in Intel CPUs. The computer is equipped with 8 GB of RAM. It runs the Ubuntu 16.04 64-bit version of Linux, more specifically, the 4.10.0-30-generic kernel, as the host operating system (OS) for Dom0. The guest VM also runs Ubuntu 16.04 64-bit, but with the 4.8.0-54-generic kernel.

V Evaluation

Our evaluation of ferify consists of two parts. First, we validate its ability to protect files against unauthorized access, particularly its potential to mitigate zero-day attacks. Second, as ferify needs to trap more than a dozen system calls, we quantify expected performance degradation upon authorized users due to the extra processing overhead it introduces.

V-A Validation of File Protection

Ideally, we can enumerate the exact range of attacks that ferify is able to mitigate and perform experiments to confirm the effectiveness. However, given the unpredictable nature of zero-day attacks, and the amount of effort required to hypothesize and enact the likely large number of attacks to ensure coverage, we take an alternative, more practical, approach, whereby we show the main design features as presented in Section III achieve their objectives. More specifically, we consider two types of access to protected files hosted on the guest VM: one by an authorized user and the other by an attacker, as illustrated in Fig. 1. In line with our threat model, we assume that (i) the authorized user’s private key is secure and (ii) the attacker has gained root privileges on the guest VM through a compromise originating from a different user ID than the legitimate user.

To validate ferify’s basic functionality, we performed specific file operations to emulate the actions of the attacker. We tried these operations with different permissions defined in the SACL each time, to verify that the permissions we set are actually enforced. The SACL we used contains all files of the VM system. (The VM was a fresh installation of Linux with only a few essential packages added, such as gcc, rekall, and openssh-server. The SACL for this setup contained more than 400,000 entries.) The results of the tests show that we can successfully deny unauthorized access to protected files, even if the OS of the VM would allow it given that the attacker has gained root privileges.

User & Group Integrity:  As our tests reveal, we can limit access to files on a per user basis. This constraint applies equally to all users, including the root user. Fig. 3 shows the SACL implementation for the /etc/shadow and /etc/pam.d/su files, and Fig. 4 shows the result of the root user trying to change another user’s password, after the /etc/shadow file has been write protected in the SACL.

Full Path File Name Permissions User Group
/etc/shadow 400 0 0
/etc/pam.d/su 000 0 0
Fig. 3: SACL entries preventing root user from overwriting two system files.

root@HVM-domU:~# passwd alice
Enter new UNIX password:
Retype new UNIX password:
passwd: Authentication token manipulation error
passwd: password unchanged
root@HVM-domU:~#

Fig. 4: Confirmation of denial of password change from root account.

In addition, we validated that ferify is able to prevent escalation of user’s privilege to root through the sudo command, as illustrated by Fig. 5. For brevity we omit the actual SACL entries for implementing this policy.

user@HVM-domU:~$ sudo ls
sudo: unable to open /etc/sudoers: Bad address
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin
user@HVM-domU:~$

Fig. 5: Confirmation of failure of sudo.

From these experimental results, we extrapolate that ferify can enforce some files to be immutable, by removing the write permissions from all users, including the root (Paladin [1] can also achieve this effect). Additionally, by removing only the read permissions on system log files, we allow for normal logging operation, but an attacker cannot read the log file and select which entries to delete in order to hide malicious activity, making the “hiding of tracks” harder for the attacker.

Process Ownership Integrity:  To the best of our knowledge of the literature, the only effective way to alter a process’s ownership information appears to be through a kernel module. This is because that information is stored in the task_struct data structure, which is part of the kernel memory space. To directly access and manipulate kernel memory variables, the attacker must be able to run code as part of the kernel, i.e., using a kernel module. Therefore, the validation results for the Kernel Integrity part (which we will present next) also apply to this functionality.

Kernel Integrity:  Since the kernel protection mechanism consists simply of denying the addition of kernel modules while the VM is online, the validation was relatively straightforward. Fig. 6 shows the test result of trying to load a new kernel module.

root@HVM-domU:~# insmod my_module.ko
insmod: ERROR: could not insert module my_module.ko: Bad file descriptor
root@HVM-domU:~#

Fig. 6: Confirmation of denying addition of new kernel module.

2-step Authentication:  In this test, we turned on the option for the authorized user. The user was therefore considered “unauthenticated” even after passing the VM’s user authentication and starting an ssh session. Consequently, as Fig. 7 shows, the user could not access test1.txt even though the SACL contained a permit entry for the user regarding this particular file.

user@HVM-domU:~# echo hi > test1.txt
-bash: echo write error: Bad file descriptor
root@HVM-domU:~#

Fig. 7: Confirmation of failed file creation before second authentication.

In the next step, as Fig. 8 shows, we ran the touch command to perform the second authentication by providing a valid pair of challenge response strings based on a shared secret pre-configured in ferify. It should be noted that even though the authentication was successful, an error message was printed because of the artificial file name string.

user@HVM-domU:~# touch /tokens/1110d209df92a6f603f89
d18e2b79dda732fe88c2fcf2347024d1b4244e1d0013723107b
419e6fe7d6d0dd80b4a45d06d02271473dce873477528a67f4b
b2312267
touch: cannot touch '/tokens/1110d209df92a6f603f89d
18e2b79dda732fe88c2fcf2347024d1b4244e1d0013723107b4
19e6fe7d6d0dd80b4a45d06d02271473dce873477528a67f4bb
2312267': No such file or directory
root@HVM-domU:~#

Fig. 8: Illustration of second authentication using touch.

After the successful 2-step authentication, as Fig. 9 shows, the user was able to create test1.txt as expected.

user@HVM-domU:~# echo hi > test1.txt
root@HVM-domU:~#

Fig. 9: Confirmation of file creation after successful 2-step authentication.

Program White-Listing:  We turned on this option. We created a test program that simply prints out the string: “Can run this program.” We added a permit entry for this program in the SACL. We ran the program and then ran an exact copy of it, newfile, which was not added to the SACL. As Fig. 10 shows, the first execution was successful because test was in the white-list but the second execution failed because newfile was not white-listed.

user@HVM-domU:~# ./test
Can run this program.
user@HVM-domU:~# cp test newfile
user@HVM-domU:~# ./newfile
-bash: ./newfile: Bad address
user@HVM-domU:~#

Fig. 10: Confirmation of program white-listing.

V-B Quantification of Processing Overhead

To quantify the processing overhead incurred by ferify upon an authorized user, we have selected three of the most frequently occurring system calls it traps, namely open() (for read and write), rename() (for moving files), and unlink() (for deleting files), and benchmarked their performance in each of these three scenarios:

  1. ferify is not deployed, i.e., no system call is actually trapped.

  2. ferify is deployed, but with an empty SACL; in this case, there is no need to search the SACL for a specific file.

  3. ferify is deployed with a full SACL, i.e., with an entry for each file in the VM. The SACL contains more than 200,000 entries.

In addition, we examine whether it is possible to increase the performance of ferify by adjusting its scheduling priority. We consider three cases: (i) no adjustment; (ii) the processing priority of ferify is maximized using the nice command; and (iii) both the processing and I/O priorities are maximized using the nice and ionice commands, respectively.

The benchmarking of each system call is repeated 20 times for each combination of scenario and scheduling priority. The average processing times (in ms) are reported in Table III.

First, we observe that ferify’s overall processing time per system call remains in the millisecond range, which is usable for most applications; the overhead is nonetheless significant, as the processing times jumped by more than one order of magnitude with ferify’s deployment. Second, somewhat surprisingly, we observe that adjusting scheduling priorities had little effect on the processing times. Lastly and importantly, we see that there is little change in performance from an empty to a full SACL, which indicates that the SACL look-up and permission checking actions account for a very small portion of the overall overhead. In other words, the time spent on other actions, mainly performed by the core DRAKVUF/LibVMI code for trapping the system calls, might dominate the ferify processing overhead.

                 Without ferify     With ferify (ratio of increase)
System call      Avg time (msec)    empty SACL    full SACL
No scheduler priority set
open()           0.273              5.167         6.385
rename()         0.119              9.891         15.326
unlink()         0.110              11.461        14.732
With best nice value
open()           0.273              6.623         6.389
rename()         0.119              12.686        14.928
unlink()         0.110              14.643        14.426
With best nice and ionice values
open()           0.273              6.461         7.713
rename()         0.119              13.000        18.238
unlink()         0.110              14.926        17.384
TABLE III: ferify processing overhead measurements

To further explore this hypothesis, we have conducted additional experiments with ferify deployed using default scheduling priorities, while varying the SACL size across four values: 100, 1000, 10000, and 100000 entries. The results are plotted in Fig. 11. We observe that the overall processing times for each of the three system calls did not change much as the SACL size increased. This is not surprising given that the SACL has been implemented as a hashtable, and it confirms that SACL processing accounts for a very small portion of the overall overhead.

Fig. 11: Little changes in ave. system call processing times for 4 SACL sizes.

VI Understanding Overhead of VM Introspection

The DRAKVUF and LibVMI software hides almost all the system call trapping details from a plugin. To explain why ferify introduces the significant processing overhead presented in Table III, we had to dig deep into the DRAKVUF and LibVMI source code. After a careful analysis, we discovered that the primary contributor to the processing overhead is the series of CPU context switches between the hypervisor and the guest VM. The details of our analysis are presented in this section.

VI-A The Tale of Four Context-Switches

DRAKVUF traps a system call in the same way as a debugger like gdb does. Specifically, DRAKVUF replaces the instruction stored in the memory location indicated, with the software breakpoint signal (i.e., opcode 0xCC). Therefore, when the instruction pointer (I.P.) of the VM reaches the trapped address, the CPU will raise a software interrupt. This event, by the way in which Xen handles such interrupts, causes a context switch from the VM to the hypervisor. This context switch is commonly called “VM-exit” in the Xen literature.

Right after this context switch, Xen’s master interrupt handler checks for a registered process in Dom0 to handle the interrupt. If no such process exists the hypervisor propagates the interrupt to the VM, to be handled by the guest OS. Therefore when ferify is running and registered for the interrupt, the master handler will select it to respond to the interrupt.

As soon as ferify completes its checking of file access permissions and clears the relevant register(s) when the decision is to deny the access, DRAKVUF must, again like a debugger, replace the breakpoint signal with the original instruction to allow the system call to proceed. Furthermore, DRAKVUF must also inject the breakpoint signal back after execution of only one instruction in order to reset the trap and not miss any future instance of the system call. Currently, there is only one known way to meet the stringent timing requirement for resetting the trap; that is, to put the original process in the single-step execution mode. DRAKVUF accomplishes this by setting a CPU register (flag), just as a debugger would do, before resuming the trapped process. Note that another context switch happens as a result, commonly called “VM-entry”. Then, after execution of the first instruction, the CPU will raise an exception due to the single-stepping mode. This will be caught by the hypervisor and its interrupt handler, causing a second VM-exit, as illustrated in Fig. 12. At this point DRAKVUF knows that one instruction has been executed and it is safe to re-inject the breakpoint. After doing so, it clears the CPU flag for single-stepping and returns the trapped process to the normal execution mode, which causes a second VM-entry.

Event: description (state of the first byte of the system call code block)

  1. Setup: The first instruction of the system call is replaced by the software breakpoint. This needs to happen to trap the system call. (original opcode becomes 0xCC)

  2. VM-exit #1: The system call is just invoked. A context switch from the VM to the hypervisor occurs due to the software interrupt. (0xCC, I.P. at the entry)

  3. VM-entry #1: The breakpoint is replaced by the original instruction in order to resume VM execution after single-step is enabled. (original opcode, I.P. at the entry)

  4. VM-exit #2: The execution of the original instruction causes a new trap due to single-step mode. (original opcode, I.P. at the next instruction)

  5. VM-entry #2: The first instruction is replaced by the interrupt and single-step is disabled before resuming the VM. (0xCC, I.P. at the next instruction)

Fig. 12: Illustration of context-switch events for trapping a system call

It is well known that a context switch (VM-exit or VM-entry) incurs a significant amount of overhead due to the need for saving the complete state of the VM or hypervisor into memory prior to performing the switch. Requiring four context switches for trapping one system call, the current DRAKVUF and LibVMI implementation is unsurprisingly the primary source of ferify processing overhead. More importantly, this overhead is a more general problem, impacting all systems that use DRAKVUF and LibVMI to trap processes running on guest VMs. Therefore, we have investigated ways to reduce the number of context switches for trapping system calls in a VM, leading to one possible solution, which we will present in the next section.
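The four-switch cycle can be traced with a toy state machine. The Python sketch below does not call DRAKVUF or LibVMI; it only models one byte of guest memory and logs the transitions described above. The class name, the event strings, and the example opcode 0x55 are illustrative assumptions.

```python
BREAKPOINT = 0xCC  # INT 3 opcode on Intel x86


class TrappedEntry:
    """Toy model of the first opcode byte of one trapped system call."""

    def __init__(self, original_opcode):
        self.original = original_opcode
        self.memory = BREAKPOINT  # trap injected ahead of time
        self.single_step = False

    def invoke(self, callback):
        """Simulate one invocation; return the context switches it causes."""
        events = ["VM-exit #1 (breakpoint hit)"]
        callback()                    # plugin check runs in the hypervisor
        self.memory = self.original   # restore the real instruction
        self.single_step = True
        events.append("VM-entry #1 (single-step enabled)")
        # The guest executes exactly one instruction, then traps again.
        events.append("VM-exit #2 (single-step trap)")
        self.memory = BREAKPOINT      # re-arm the trap
        self.single_step = False
        events.append("VM-entry #2 (single-step disabled)")
        return events
```

Each call to invoke() yields four events, matching the two VM-exit/VM-entry pairs paid for every trapped system call.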

VI-B A Proposal for Reducing VMI Overhead

The question of how to minimize context switches between a guest VM and the hypervisor in order to improve system performance has been studied in other settings [24, 26]. In this section, we propose a solution that can cut the number of context switches from four to two for a hypervisor based VM introspection (VMI) system such as ferify to trap a system call from a VM.

Specifically, we observe that the second pair of VM-exit and VM-entry context switches, as illustrated in Fig. 12, is necessary only because the first instruction of the system call code needs to be replaced by the software interrupt signal immediately after each execution in order not to miss any future invocation of the system call. In other words, the current DRAKVUF/LibVMI implementation requires the first instruction to toggle between two opcodes with a timing requirement that can only be met by putting the calling process into the single-step mode. However, if the first instruction of the system call were the “do nothing” (NOP) instruction (i.e., opcode 0x90 for an Intel x86 CPU), DRAKVUF would be able to replace it with the software interrupt signal permanently to trap the system call over and over again, with only two context switches each time. To avoid an infinite loop, the I.P. of the VM’s CPU would be incremented to point to the next valid instruction before the VM resumes.

Therefore, we propose to prepend a NOP instruction to the code of each trapped system call, resulting in one pair of context switches for each trapping, as illustrated in Fig. 13. There are several ways of accomplishing the prepending. One may modify the source code from which the object code of the system calls is built. As a less intrusive approach, we suggest adding such an option to C compilers.

System Call Code Block and Event (the I.P. marker in the figure denotes the instruction pointer):

  1. 0x…: 90 is overwritten with 0xCC. The 1st instruction of the system call (i.e., NOP) is replaced by the software breakpoint. This needs to happen to trap the system call.
  2. 0x…: CC (VM-exit). The system call is just invoked. A VM-hypervisor context switch occurs due to the software interrupt.
  3. 0x…: CC (VM-entry). A hypervisor-VM context switch occurs to resume VM execution. The instruction pointer moves forward.

Fig. 13: Proposed solution incurs only two context switches
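
To make the difference in context-switch counts concrete, the following is a minimal toy model in Python, not the DRAKVUF or LibVMI implementation. The class ToyGuest, the functions trap_with_toggle, trap_with_nop and count_switches, and the choice of 0x55 as a one-byte first instruction are all hypothetical names and assumptions introduced only for illustration. The model simply counts one context switch per VM-exit and per VM-entry under the two schemes described above.

```python
from dataclasses import dataclass

INT3 = 0xCC  # x86 software breakpoint opcode
NOP = 0x90   # x86 one-byte "do nothing" opcode


@dataclass
class ToyGuest:
    """Hypothetical stand-in for a guest VM: code bytes, an instruction
    pointer, and a counter of VM-exit/VM-entry context switches."""
    code: bytearray
    ip: int = 0
    context_switches: int = 0

    def vm_exit(self):
        self.context_switches += 1

    def vm_entry(self):
        self.context_switches += 1


def trap_with_toggle(guest, entry, original):
    """Toggling scheme: restore the opcode, single-step, then re-arm."""
    guest.vm_exit()                # breakpoint fires at the system call entry
    guest.code[entry] = original   # put the real first instruction back
    guest.ip = entry
    guest.vm_entry()               # resume the guest in single-step mode
    guest.ip = entry + 1           # guest runs one (assumed one-byte) instruction
    guest.vm_exit()                # completion of the single step traps again
    guest.code[entry] = INT3       # re-arm the breakpoint for the next call
    guest.vm_entry()               # resume normal execution


def trap_with_nop(guest, entry):
    """Proposed scheme: the breakpoint permanently occupies a NOP slot."""
    guest.vm_exit()                # breakpoint fires at the system call entry
    guest.ip = entry + 1           # skip the slot; nothing to restore or re-arm
    guest.vm_entry()               # resume normal execution


def count_switches(invocations):
    """Return (toggle_total, nop_total) context switches for N invocations."""
    entry = 0
    # Assumption: 0x55 (push rbp) is the one-byte first instruction.
    toggled = ToyGuest(code=bytearray([0x55, 0xC3]))
    toggled.code[entry] = INT3     # arm the trap over the real instruction
    # Assumption: the system call has been prepended with a NOP.
    padded = ToyGuest(code=bytearray([NOP, 0x55, 0xC3]))
    padded.code[entry] = INT3      # arm the trap over the NOP, once
    for _ in range(invocations):
        trap_with_toggle(toggled, entry, original=0x55)
        trap_with_nop(padded, entry)
    return toggled.context_switches, padded.context_switches
```

Under these assumptions, count_switches(1000) returns (4000, 2000): the NOP-based scheme halves the number of context switches, although the resulting saving in wall-clock time would still need to be measured on a real system.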

VII Discussion

In this section we discuss possible extensions of ferify and its current limitations.

VII-A Potential Extensions

ferify should eventually be deployable to desktops, cloud data centers, mobile devices (LibVMI already supports ARM Cortex-A15 architectures), and IoT systems, owing to its relative ease of deployment: it requires no kernel modification to the VMs. The ability to perform independent access control of individual files makes ferify a unique building block for creating additional VM security solutions. For brevity, we describe two such extensions as follows.

First, we envision extending ferify to generalize the concept of an immutable file and “lock down” parts of a running VM to maximize the availability and integrity of certain system services (e.g., networking) as well as user applications (e.g., database and web servers) during their deployment. In other words, we are interested in elevating the abstraction of protection from files to services and applications.

Second, we are intrigued by the possibility of using ferify to obfuscate an application workflow to further enhance data protection for a VM. For example, a database operator may grant write permissions only to a randomly generated sequence of uids through ferify and modify the querying process to require forking an ephemeral thread with the correct next uid in that sequence in order to write into the database. This is possible because ferify supports sharing of secrets, in this case a uid sequence, with authorized users or processes in a manner transparent to the VM.
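
As a rough illustration of the uid-sequence idea, the sketch below models a hypervisor-side check in Python. It is not part of the ferify prototype: the class RotatingUidPolicy, the helper generate_uid_sequence, and the uid range are hypothetical, and how the sequence would be shared with the authorized application is left out entirely.

```python
import secrets


def generate_uid_sequence(length, uid_low=100_000, uid_high=200_000):
    """Draw a random uid sequence (hypothetical range, chosen for illustration)."""
    span = uid_high - uid_low
    return [uid_low + secrets.randbelow(span) for _ in range(length)]


class RotatingUidPolicy:
    """Hypothetical policy kept outside the VM: a write is allowed only
    when it comes from the next uid in the secret sequence."""

    def __init__(self, sequence):
        self._sequence = list(sequence)
        self._position = 0

    def authorize_write(self, uid):
        # Once the sequence is exhausted, every write is denied.
        if self._position >= len(self._sequence):
            return False
        # A wrong uid is denied and does not advance the sequence.
        if uid != self._sequence[self._position]:
            return False
        # The expected uid is consumed, so replaying it later fails.
        self._position += 1
        return True


# Usage with a fixed sequence so the behaviour is easy to follow.
policy = RotatingUidPolicy([1001, 1007, 1003])
first_ok = policy.authorize_write(1001)     # expected next uid
replay_ok = policy.authorize_write(1001)    # replay of a consumed uid
root_ok = policy.authorize_write(0)         # root does not know the sequence
second_ok = policy.authorize_write(1007)    # correct next uid
```

In this sketch a process running as root inside the VM gains nothing from its privilege alone, because the decision depends on a secret that never resides in the guest. Whether denied attempts should also advance or invalidate the sequence is a design choice the sketch does not settle.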

VII-B Limitations

While ferify is able to detect and prevent a range of zero-day attacks, the protection is limited to the specific files defined in the SACL. Also, the root security mechanism of ferify depends on the system calls of the kernel, and the system therefore depends on no malware having been placed in the kernel during start-up. We decided not to expand further on kernel protection at boot time; there are already solutions, like vTPM [17], which implement a more sophisticated way of securing the system’s boot procedure and verifying that the code launched by firmware is trusted.

Additionally, ferify currently has limited support for multi-threaded processes and multi-core systems. To support multi-threaded applications, more work is necessary to examine the Linux kernel’s task_struct in order to identify where the required thread information is stored and determine how to ensure its integrity. For multi-core systems, although the way we have utilized DRAKVUF’s capabilities should allow direct implementation without further modification, additional testing is required to confirm this is the case.

VIII Conclusion

In this research we developed and evaluated ferify, an out-of-guest file protection system capable of detecting and even preventing a range of zero-day attacks against a specific VM. By enforcing a totally independent file access control policy, supporting hypervisor-based user authentication, and requiring no footprint in the VM, the system has unique advantages over existing security solutions based on known attack signatures. In addition, we investigated the observed high processing overhead from trapping system calls, one of the key features enabled by current VMI software and used by ferify, which led to a general solution that could potentially cut that overhead by half. Finally, we observe that ferify only scratches the surface of what is possible when leveraging the hypervisor to achieve fine-grained user data protection on a VM, and the topic is increasingly fundamental given the seemingly inevitable technology transition towards cloud and fog computing.

References

  • [1] A. Baliga, L. Iftode, and X. Chen, “Automated containment of rootkits attacks,” Computers & Security, vol. 27, no. 7, pp. 323–334, 2008.
  • [2] A. Bauman, M. Peinado, and G. Hunt, “Shielding applications from an untrusted cloud with Haven,” ACM Transactions on Computer Systems (TOCS), vol. 33, no. 3, 2015.
  • [3] E. Bauman, A. Gbadebo, and L. Zhiqiang, “A survey on hypervisor-based monitoring: Approaches, applications, and evolutions,” ACM Computing Surveys (CSUR), vol. 48, no. 1, 2015.
  • [4] X. Chen et al. “Overshadow: a virtualization-based approach to retrofitting protection in commodity operating systems,” In ACM SIGARCH Computer Architecture News, vol. 36, pp. 2–13, 2008.
  • [5] M. Crawford, and G. Peterson, “Insider threat detection using virtual machine introspection,” In Proc. IEEE Hawaii International Conference on System Sciences, pp. 1821–1830, 2013.
  • [6] B. Dolan-Gavitt, T. Leek, M. Zhivich, J. Giffin, and W. Lee, “Virtuoso: Narrowing the semantic gap in virtual machine introspection,” In Proc. IEEE Symposium on Security and Privacy, pp. 297–312, 2011.
  • [7] G.W. Dunlap, et al. “Revirt: Enabling intrusion analysis through virtual machine logging and replay,” In ACM SIGOPS Operating Systems Review, pp. 211–224, 2002.
  • [8] T. Garfinkel, and M. Rosenblum, “A virtual machine introspection based architecture for intrusion detection,” In Proc. NDSS, pp. 191–206, 2003.
  • [9] O.S. Hofmann, et al. “Inktag: Secure applications on an untrusted operating system,” In ACM SIGARCH Computer Architecture News, pp. 265–278, 2013.
  • [10] X. Jiang, X. Wang, and D. Xu, “Stealthy malware detection through VMM-based out-of-the-box semantic view reconstruction,” In Proc. ACM CCS, pp. 128–138, 2007.
  • [11] T. K. Lengyel, et al. “Scalability, fidelity and stealth in the DRAKVUF dynamic malware analysis system,” In Proc. Annual Computer Security Applications Conference, 2014.
  • [12] P. Macko, M. Chiarini, and M. Seltzer, “Collecting provenance via the Xen hypervisor,” In Proc. USENIX Workshop on the Theory and Practice of Provenance, 2011.
  • [13] M. R. Nasab, “Security functions for virtual machines via introspection,” Master’s Thesis, Chalmers University of Technology, 2012.
  • [14] D. Ott, “Virtualization and Performance: Understanding VM Exits”. Available online: https://software.intel.com/en-us/blogs/2009/06/25/virtualization-and-performance-understanding-vm-exits [Last accessed on October 28, 2017].
  • [15] B.D. Payne, “Simplifying Virtual Machine Introspection Using LibVMI,” Sandia Labs Tech. Report, SAND2012-7818, September 2012.
  • [16] B.D. Payne, M. Carbone, M. Sharif, and W. Lee, “Lares: An architecture for secure active monitoring using virtualization,” In Proc. IEEE Symposium on Security and Privacy, pp. 233–247, 2008.
  • [17] R. Perez, R. Sailer, and L. van Doorn, “vTPM: virtualizing the trusted platform module,” In Proc. 15th Conf. on USENIX Security Symposium, pp. 305–320, 2006.
  • [18] R. Riley, X. Jiang, and D. Xu, “Multi-aspect profiling of kernel rootkit behavior,” In Proc. ACM European conference on Computer systems, pp. 47–60, 2009.
  • [19] R. Sailer, et al. “Building a MAC-based security architecture for the Xen open-source hypervisor,” In Proc. Annual Computer Security Applications Conference, 2005.
  • [20] A. Seshadri, M. Luk, N. Qu, and A. Perrig, “Secvisor: A tiny hypervisor to provide lifetime kernel code integrity for commodity OSes,” In ACM SIGOPS Operating Systems Review, vol. 41, pp. 335–350, 2007.
  • [21] M.I. Sharif, W. Lee, W. Cui, and A. Lanzi, “Secure in-VM monitoring using hardware virtualization,” In Proc. ACM CCS, pp. 477–487, 2009.
  • [22] D. Srinivasan, Z. Wang, X. Jiang, and D. Xu, “Process out-grafting: an efficient out-of-VM approach for fine-grained process execution monitoring,” In Proc. ACM CCS, pp. 363–374, 2011.
  • [23] A. Srivastava, and J. Giffin, “Efficient protection of kernel data structures via object partitioning,” In Proc. Annual Computer Security Applications Conference, pp. 429–438, 2012.
  • [24] X. Wang, et al. “Detecting and Analyzing VM-exits,” In Proc. IEEE International Conference on Computer and Information Technology, 2010.
  • [25] Y.-M. Wang, D. Beck, B. Vo, R. Roussev, and C. Verbowski, “Detecting stealth software with Strider Ghostbuster,” In Proc. IEEE International Conference on Dependable Systems and Networks, pp. 368–377, 2005.
  • [26] S. Xi, J. Wilson, C. Lu, and C. Gill, “RT-Xen: towards real-time hypervisor scheduling in Xen,” In Proc. ACM International conference on Embedded software, 2011.
  • [27] X. Xiong, D. Tian, and P. Liu, “Practical protection of kernel integrity for commodity OS from untrusted extensions,” In Proc. NDSS, 2011.