A separation kernel (SK) is a small, specialized operating system or microkernel that provides a sand-boxed or “separate” execution environment for a given set of processes (also called “partitions” or “subjects”). The subjects may communicate only via declared memory channels, and are otherwise isolated from each other. Unlike a general-purpose operating system, these kernels usually have a fixed set of subjects to run according to a specific schedule on the different CPUs of a processor-based system.
The idea of a separation kernel was proposed by Rushby as a way of breaking down the process of reasoning about the overall security of a computer system. The overall security of a system, in his view, derives from (a) the physical separation of its components, and (b) the specific security properties of its individual components or subjects. A separation kernel thus focuses on providing the separation property (a) above. Separation kernels have since been adopted extensively in the military and aerospace domains, for building security- and safety-critical applications.
Our focus in this paper is on formally verifying that a separation kernel does indeed provide the separation property, and more generally that it functions correctly (which would include, for example, that it executes subjects according to a specified schedule). One way of obtaining a high level of assurance in the correct functioning of a system is to carry out a refinement-based proof of functional correctness [19, 18], as has been done in the context of OS verification [31, 22]. Here one specifies an abstract model of the system’s behaviour, and then shows that the system implementation conforms to the abstract specification. A refinement proof typically subsumes all the standard security properties related to separation, like no-exfiltration/infiltration and temporal and spatial separation of subjects.
Our specific aim in this paper is to formally verify the correctness of the Muen separation kernel, which is an open-source representative of a class of modern separation kernels (including several commercial products like GreenHills Integrity Multivisor, LynxSecure, PikeOS, VxWorks MILS platform, and XtratuM) that use hardware virtualization support and are generative in nature. By the latter we mean that these tools take an input specification describing the subjects and the schedule of execution, and generate a tailor-made processor-based system that includes subject binaries, page tables, and a kernel that acts like a Virtual Machine Monitor (VMM).
When we took up this verification task over three years ago, a few challenges stood out. First, how does one reason about a system whose kernel makes use of virtualization features in the underlying hardware, in addition to Assembly and a high-level language like Ada? Second, how does one reason about a complex 4-level paging structure and the translation function it induces? Finally, and most importantly, how does one reason about a generative system, to show that for every possible input specification it produces a correct artifact? A possible approach for the latter would be to verify the generator code, along the lines of the CompCert project. However, with the generator code running close to 41K LOC, having little or no compositional structure, and not being designed for verification, this would be a formidable task. One could alternatively perform translation validation of an input specification of interest, but this would require manual effort from scratch each time.
We overcame the first challenge, of virtualization, by simply choosing to model the virtualization layer (in this case Intel’s VT-x layer), along with the rest of the hardware components like registers and memory, programmatically in software. Thus we modeled VT-x components like the per-CPU VMX-Timer and EPTP as 64-bit variables in Ada, and implicit structures like the VMCS as a record with appropriate fields as specified by Intel. Instructions like VMLAUNCH were then implemented as methods that access these variables. As it turned out, virtualization was more of a boon than a bane, as it simplifies the kernel’s (and hence the prover’s) job of preemption and context-switching.
We solved the third problem, of generativeness (and coincidentally the second problem, of page tables, too), by leveraging a key feature of such systems: the kernel is essentially a template which is largely fixed, independent of the input specification. The kernel accesses variables which represent input-specific details like subject details and the schedule, and these structures are generated by Muen based on the given input specification. The kernel can thus be viewed as a parametric program, much like a method that computes using its formal parameter variables. In fact, taking a step back, the whole processor system generated by Muen can be viewed as a parametric program, with parameter values like the schedule, subject details, page tables, and memory elements being filled in by the generator based on the input specification.
This view suggests a novel two-step technique for verifying generative systems that can be represented as parametric programs. We call this approach conditional parametric refinement. We first perform a general verification step (independent of the input spec) to verify that the parametric program refines a parametric abstract specification, assuming certain natural conditions on the parameter values that are to be filled in (for example, injectivity of the page tables). This first step essentially tells us that for any input specification, if the parameters generated by the system generator satisfy the assumed conditions, then the generated system is correct vis-à-vis the abstract specification. In the second step, which is input-specific, we check that for a given input specification, the assumptions actually hold for the generated parameter values. This gives us an effective technique for verifying generative systems that lies somewhere between verifying the generator and translation validation.
We carried out the first step of this proof technique for Muen, using the Spark Ada verification environment. The effort involved about 20K lines of source code and annotation. No major issues were found, modulo some subjective assumptions we discuss in Sec. 12. We have also implemented a tool that automatically and efficiently performs the Step-2 check for a given SK configuration. The tool is effective in proving the assumptions, leading to machine-checked proofs of correctness for 12 different input configurations, as well as in detecting issues like undeclared sharing of memory components in some seeded faulty configurations.
To summarize our contributions, we have proposed a novel approach based on parametric refinement for verifying generative systems. We have applied this technique to verify the Muen separation kernel, which is a complex low-level software system that makes use of hardware virtualization features. We believe that other verification efforts for similar generative systems can benefit from our approach. To the best of our knowledge our verification of Muen is the first work that models and reasons about Intel’s VT-x virtualization features in the context of separation kernels. Finally, we note that our verification of Muen is a post-facto effort, in that we verify an existing system which was not designed and developed hand-in-hand with the verification process.
2 Conditional Parametric Refinement
We begin by introducing the flavour of classical refinement that we will make use of, followed by the parametric refinement framework we employ for our verification task.
2.1 Machines and Refinement
A convenient way to reason about systems such as Muen is to view them as an Abstract Data Type, or simply a machine, to use Event-B terminology. A machine type comprises a finite set of named operations Ops, with each operation op in Ops having an associated input domain I_op and output domain O_op. Each machine type contains a designated initialization operation called init. A machine M of the type above is a structure of the form (Q, {f_op}), where Q is a set of states and, for each op in Ops, f_op : Q × I_op → O_op × Q is the implementation of operation op. If f_op(q, a) = (b, q'), then when op is invoked with input a in state q of the machine, it returns b and changes the state to q'. The init operation is assumed to be independent of the state in which it is invoked as well as the input passed to it; hence we simply write f_init = (b, q') instead of f_init(q, a) = (b, q').
The machine M induces a transition system in a natural way: its states are the states of M, and transitions from one state to another are labelled by triples of the form (op, a, b), representing that operation op was invoked with input a and returned the value b. One is interested in the language of initialized sequences of operation calls produced by this transition system, which models the behaviours of the system; we call it L(M).
We will consider different ways of representing machines, the most important of which is as a program in a high-level imperative programming language. The global variables of the program (more precisely, valuations for them) make up the state of the machine. The implementation of an operation op is given by a method definition of the same name, that takes an argument in I_op, updates the global state, and returns a value in O_op. We call such a program a machine program. Fig 1(a) shows a program in a C-like language that represents a “set” machine, with an init operation and operations to insert and query elements. The set stores a subset of the numbers 0–3 in a Boolean array of size 4. However, for certain extraneous reasons, it uses an array to permute the positions where the information for an element is stored: to indicate that an element e is present in the set, the bit at the permuted position for e is set to true. We use the notation “0..3” to denote the range of integers from 0 to 3 inclusive.
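Since Fig. 1(a) is not reproduced here, the following is a minimal C sketch of such a set machine along the lines just described; the names b, perm, and the chosen permutation are our own, not those of the figure:

```c
#include <stdbool.h>

/* Concrete "set" machine over the universe 0..3. The bit for
   element e is stored at the permuted position perm[e]. */
static bool b[4];                        /* state: membership bits */
static const int perm[4] = {2, 0, 3, 1}; /* a fixed permutation    */

void init(void)         { for (int e = 0; e < 4; e++) b[e] = false; }
void insert(int e)      { b[perm[e]] = true;  }
void remove_elem(int e) { b[perm[e]] = false; }
bool member(int e)      { return b[perm[e]];  }
```

The global variables b and perm constitute the machine state; each method implements one operation of the machine type.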
Another representation of a machine could be in the form of a processor-based system. Here the state is given by the values of the processor’s registers and the contents of the memory. The operations (like “execute the next instruction on CPU0”, or “timer event on CPU1”) are defined by either the processor hardware (as in the former operation) or by software in the form of an interrupt handler (as in the latter operation).
Refinement [19, 18, 1] is a way of saying that a “concrete” machine conforms, behaviourally, to an “abstract” one. In our setting of total and deterministic machines, refinement boils down to containment of sequential behaviours. Let C and A be two machines of the same type. We say C refines A if L(C) ⊆ L(A). One usually makes use of a “gluing” relation to exhibit refinement. A gluing relation on the states of C and A above is a relation R ⊆ Q_C × Q_A. We say R is adequate (to show that C refines A) if it satisfies the following conditions:
(init) Let f^C_init = (b_C, q_C) and f^A_init = (b_A, q_A). Then we require that b_C = b_A and (q_C, q_A) ∈ R. Thus, after the machines are initialized, their states must be R-related.
(sim) Let op ∈ Ops and (q_C, q_A) ∈ R, with a ∈ I_op. Suppose f^C_op(q_C, a) = (b_C, q'_C) and f^A_op(q_A, a) = (b_A, q'_A). Then we must have b_C = b_A and (q'_C, q'_A) ∈ R.
It is not difficult to see that the existence of an adequate gluing relation is sufficient for refinement.
When machines are presented in the form of programs, we can use Floyd-Hoare logic based code-level verification tools (like VCC for C, or GNAT Pro for Spark Ada) to phrase the refinement conditions as pre/post annotations and carry out a machine-checked proof of refinement. The basic idea is to combine both the abstract and concrete programs into a single “combined” program, with separate state variables but joint methods that carry out the abstract and concrete operation one after the other. The gluing relation is specified as a predicate on the combined state. Fig. 1(b) shows an abstract specification and a gluing relation, for the set machine program of part (a). The refinement conditions (init) and (sim) are phrased as pre/post annotations on the joint operation methods, in the expected manner.
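A combined program in this style can be sketched as follows (an illustration of ours, not the paper’s actual artifact): the abstract array a and the concrete array b are glued by the predicate a[e] == b[perm[e]], and each joint method runs the abstract and the concrete operation one after the other, with (sim) phrased as pre/post assertions:

```c
#include <stdbool.h>
#include <assert.h>

static bool a[4];                        /* abstract state */
static bool b[4];                        /* concrete state */
static const int perm[4] = {2, 0, 3, 1};

/* Gluing relation: a[e] == b[perm[e]] for all e. */
static bool glued(void) {
    for (int e = 0; e < 4; e++)
        if (a[e] != b[perm[e]]) return false;
    return true;
}

/* Joint operation: abstract step, then concrete step.
   (sim) is checked as pre/post conditions. */
void joint_insert(int e) {
    assert(glued());                 /* precondition: states glued */
    a[e] = true;                     /* abstract operation         */
    b[perm[e]] = true;               /* concrete operation         */
    assert(glued());                 /* postcondition              */
}

bool joint_member(int e) {
    assert(glued());
    bool outA = a[e];                /* abstract operation */
    bool outC = b[perm[e]];          /* concrete operation */
    assert(outA == outC && glued()); /* outputs agree, still glued */
    return outC;
}
```

A verification tool would discharge these assertions symbolically for all states; the runtime asserts here just illustrate the obligations.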
2.2 Generative Systems and Parametric Refinement
A generative system is a program G that, given an input specification s (in some space of valid inputs), generates a machine program G(s). As an example, one can think of a set machine generator that, given a number n of type unsigned int (representing the universe size), generates a program similar to the one in Fig. 1(a), which uses the constant n in place of the set size 4, and a permutation array of size n mapping each i in 0..n−1 to itself.
For every s, let us say we have an abstract machine (again similar to the one in Fig. 1(b)), say A_s, describing the intended behaviour of the machine G(s). Then the verification problem of interest to us, for the generative system G, is to show that for each input specification s, G(s) refines A_s. This is illustrated in Fig. 3(a). We propose a way to address this problem using refinement of parametric programs, which we describe next.
A parametric program is like a standard program, except that it has certain read-only variables which are left uninitialized. These uninitialized variables act like “parameters” to the program. We denote by P(X) a parametric program with an uninitialized variable X. As such, a parametric program has no useful meaning (since the uninitialized variables may contain arbitrary values). But if we initialize the variable X with a value v passed to the program, we get a standard program which we denote by P[v]. Thus P[v] is obtained from P(X) by adding the definition X = v to the declaration of X. By convention we use uppercase names for parameter variables, and lowercase names for values passed to the program. Instead of a single parameter we allow a parametric program to have a list of parameters, and extend our notation in the expected way for such programs.
Let Ops be a set of operation names. A parametric machine program P(X) of type Ops is a parametric program containing a method op for each operation op ∈ Ops. The input/output types of op may depend on, and be derived from, the parameter values. Given a parameter value v for X, we obtain the program P[v], which is a machine program. Each method op now has a concrete input/output type, which we denote by I_op[v] and O_op[v] respectively. P[v] is thus a machine program of type Ops, with a set of states that we denote by Q[v]. We define the state space of P(X) to be the union of the Q[v] over all parameter values v.
Fig. 2(a) shows an example parametric machine program, representing a parametric version of the set program in Fig. 1(a). Given the value 4 for the size parameter and a permutation list for the permutation parameter, we get a machine program which behaves like the one in Fig. 1(a). We note that the input types of the set operations depend on the value of the size parameter.
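Since C has no built-in notion of uninitialized read-only parameters, a C sketch of such a parametric machine program can model instantiation as an explicit step; the parameter names SIZE and PERM are our own:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Parametric set machine: SIZE and PERM are "parameters" --
   read-only variables left uninitialized, filled in when the
   program is instantiated. */
static unsigned SIZE;        /* parameter: universe size     */
static const unsigned *PERM; /* parameter: permutation array */
static bool *b;              /* state, allocated at init     */

/* Instantiation P[v]: fix the parameter values. */
void instantiate(unsigned size, const unsigned *perm) {
    SIZE = size;
    PERM = perm;
}

void init(void)         { b = calloc(SIZE, sizeof(bool)); }
void insert(unsigned e) { if (e < SIZE) b[PERM[e]] = true; }
bool member(unsigned e) { return e < SIZE && b[PERM[e]]; }
```

The input type of insert and member (values below SIZE) depends on the parameter value, mirroring the dependence noted above.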
Given two parametric machine programs C(X) and A(Y) of the same type, we are interested in exhibiting a refinement relation between instances of C and A. Let φ be a relation on parameter values v for X and w for Y, given by a predicate on the variables in X and Y. We say that C(X) parametrically refines A(Y) w.r.t. the condition φ if, whenever two parameter values v for X and w for Y are such that φ(v, w) holds, C[v] refines A[w].
We propose a way to exhibit such a conditional refinement, using a single “universal” gluing relation, as follows. Let C(X), A(Y), and φ be as above. Let ρ be a relation on the state spaces of C(X) and A(Y), given by a predicate on the variables of C and A. We call ρ a parametric gluing relation on C and A. We say ρ is adequate, with respect to the condition φ, if the following conditions are satisfied. In the conditions below, we use the standard Hoare triple notation {P} S {Q} for total correctness, to mean that a program S, when started in a state satisfying predicate P, always terminates in a state satisfying Q. We use the superscripts C and A to differentiate the components pertaining to the programs C and A respectively.
(type) For each op ∈ Ops: assuming φ holds of the parameter values, the input and output types of op in C and A coincide.
(sim) For each op ∈ Ops: {φ ∧ ρ} op^A(a); op^C(a) {ρ ∧ b^A = b^C}, where b^A and b^C denote the respective return values.
In the program fragments in (init) and (sim) above we assume that the variable and type declarations are prefixed to the programs.
We can now state the following theorem:
Let C(X) and A(Y) be parametric machine programs of the same type. Let φ be a predicate on X and Y, and let ρ be an adequate parametric gluing relation for C and A w.r.t. φ. Then C(X) parametrically refines A(Y) w.r.t. the condition φ.
Proof. Let C(X), A(Y), and φ be as in the statement of the theorem, and let ρ be an adequate gluing relation w.r.t. φ. Consider parameter values v and w satisfying φ. By the (type) condition we know that C[v] and A[w] are machines of the same type. Further, ρ restricted to the states of C[v] and A[w] is an adequate gluing relation for the two machines, since it can be seen to satisfy the (init) and (sim) conditions of Sec. 2.1. ∎
Consider the parametric machine program in Fig. 2(a), and the abstract parametric program in Fig. 2(b). Consider the condition φ which requires the size parameters of the two programs to be equal and the permutation array to be injective. Let ρ be the parametric gluing predicate relating the abstract set contents to the concrete bit array via the permutation. Then ρ can be seen to be adequate w.r.t. the condition φ, and thus the concrete program parametrically refines the abstract one w.r.t. φ.
Verifying Generative Systems using Parametric Refinement.
Returning to our problem of verifying a generative system, we show a way to solve it using the above framework of conditional parametric refinement. Consider a generative system G that, given an input specification s, generates a machine program G(s), and let A_s be the abstract specification for input s. Recall that our aim is to show that for each s, G(s) refines A_s. Suppose we can associate a parametric program C(X) with G, such that for each s, G can be viewed as generating a value v_s for the parameter X, so that C[v_s] is behaviourally equivalent to G(s). Similarly, suppose we have a parametric abstract specification A(Y), and a concrete value w_s for each s, such that A[w_s] is equivalent to A_s. Further, suppose we can show that C(X) parametrically refines A(Y) with respect to a condition φ on X and Y. Then, for each s such that v_s and w_s satisfy the condition φ, we can conclude that G(s) refines A_s. This is illustrated in Fig. 3. If φ is a natural enough condition that a correctly functioning generator would satisfy, then this argument covers all inputs s.
As a final illustration in our running example, to verify the correctness of the set machine generator, we use parametric programs capturing the concrete generated program and the abstract specification respectively. We then show that the concrete parametric program refines the abstract one w.r.t. the condition φ, using the gluing predicate ρ, as described above. We note that the actual values generated for the parameters in this case (the universe size and the permutation array) do indeed satisfy the conditions required by φ, namely that the size values be equal and the permutation be injective. Thus we can conclude that for each input universe size n, the generated machine program refines its abstract specification, and we are done.
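The input-specific Step-2 check amounts to validating the generated parameter values against the condition; a sketch of such a validator (with our own names, and an arbitrary bound on the universe size) is:

```c
#include <stdbool.h>

/* Step-2 style check (our sketch): given the generated parameter
   values, verify the condition assumed in the parametric proof --
   here, that the two size parameters agree and that the generated
   array perm is injective on 0..n-1. */
bool check_condition(unsigned nC, unsigned nA,
                     const unsigned *perm, unsigned n) {
    if (nC != nA) return false;        /* sizes must be equal      */
    bool seen[256] = { false };        /* assume n <= 256 here     */
    if (n > 256) return false;
    for (unsigned i = 0; i < n; i++) { /* injectivity of perm      */
        if (perm[i] >= n || seen[perm[i]]) return false;
        seen[perm[i]] = true;
    }
    return true;
}
```

A check of this kind runs in time linear in the size of the generated parameters, which is what makes the input-specific step cheap compared to re-verifying each generated program.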
3 Intel x86/64 with VMX Support
In this section we give a high-level view of the x86/64 processor platform on which the Muen SK runs. We abstract some components of the system to simplify our model of the processor system that the Muen toolchain generates. For a more complete description of the platform we refer the reader to the voluminous but excellent reference.
The lower part of Fig. 4 depicts the processor system and its components. The CPU components (with the 64-bit general purpose registers, including the instruction pointer, stack pointer, and control registers, as well as model-specific registers like the Time Stamp Counter (TSC)) and physical memory components are standard. The layer above the CPUs shows components like the VMCS pointer (VMPTR), the VMX-Timer, and extended page table pointer (EPTP), which are part of the VT-x layer of the Virtual Machine Extension (VMX) mode, that supports virtualization. The VMPTR component on each CPU points to a VMCS structure, which is used by the Virtual Machine Monitor (VMM) (here the kernel) to control the launching and exiting of guest processes or “subjects.” The CR3 register (which is technically one of the control registers in the CPU component) and the EPTP component (set by the active VMCS pointed to by the VMPTR) control the virtual-to-physical address translation for instructions that access memory. On the top-most layer we show the kernel code (abstracted as a program) that runs on each CPU. The kernel code has two components: an “Init” component that runs on system initialization, and a “Handler” component that handles VM exits due to interrupts.
We are interested in the VMX mode operation of this processor, in which the kernel essentially runs as a VMM, and subjects run as guest software on Virtual Machines (VMs) provided by the VMM. Subjects could be bare-metal application programs, or a guest operating system (called a VM-subject in Muen). A VM is specified using a VM Control Structure (VMCS), which stores information about the guest processor state including the IP, SP, and the CR3 control register values. It also stores values that control the execution of a subject, like the VMX-timer which sets the time slice for the subject to run before the timer causes it to exit the VM, and the extended page table pointer (EPTP) which translates guest physical addresses to actual physical addresses. It also contains the processor state of the host (the kernel). To launch a subject in a VM, the kernel sets the VMPTR to point to one of the VMCSs (from an array of VMCSs shown in the figure) using the VMPTRLD instruction, and then calls VMLAUNCH. The launch instruction can be thought of as setting the Timer, CR3, and EPTP components in the VT-x layer, from the VMCS fields. A subject is caused to exit its VM and return control to the kernel (called a VM exit), by certain events like VMX-timer expiry, page table exceptions, and interrupts.
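The software modeling of this VT-x state described in the introduction (the VMCS as a record, VMLAUNCH as a method over these variables) can be sketched in C roughly as follows; the field names and the reduction of VMLAUNCH to a few assignments are simplifications of ours, not Intel’s actual layout or semantics:

```c
#include <stdint.h>

/* Simplified software model of a VMCS: guest processor state,
   execution controls, and host (kernel) state. */
typedef struct {
    uint64_t guest_ip, guest_sp, guest_cr3; /* guest state          */
    uint64_t eptp;       /* extended page table pointer             */
    uint64_t vmx_timer;  /* time slice before a timer VM exit       */
    uint64_t exit_reason;/* cause of the last VM exit               */
    uint64_t host_ip, host_sp;              /* host (kernel) state  */
} vmcs_t;

/* Per-CPU VT-x layer: VMPTR selects the active VMCS; CR3, EPTP
   and the timer are loaded from it on launch. */
typedef struct {
    vmcs_t  *vmptr;
    uint64_t cr3, eptp, timer;
} vtx_cpu_t;

/* VMLAUNCH, modeled as copying control fields from the active
   VMCS into the VT-x layer components of the CPU. */
void vmlaunch(vtx_cpu_t *cpu) {
    cpu->cr3   = cpu->vmptr->guest_cr3;
    cpu->eptp  = cpu->vmptr->eptp;
    cpu->timer = cpu->vmptr->vmx_timer;
}
```

In this view, VMPTRLD is simply an assignment to the vmptr field, and a VM exit restores the host state fields from the same record.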
We would like to view such a processor system as a machine of the type described in Sec. 2.1. The state of the machine is the contents of all its components. The main operations here are as follows.
Init: The init code of the kernel is executed on each of the processors, starting with CPU0 which we consider to be the bootstrap processor (BSP).
Execute: This operation takes a CPU id and executes the next instruction pointed to by the IP on that CPU. The instruction could be one that does not access memory, like an instruction that adds the contents of one register into another, which only reads and updates the register state on that CPU. Or it could be an instruction that accesses memory, like moving the value in a register to a memory address. The address will be translated via the page tables pointed to by the CR3 and EPTP components, successively, to obtain an address in physical memory, which will then be updated with the contents of the register. Some instructions may cause an exception (like an illegal memory access), in which case we assume the exception handler runs as part of this operation.
Event: We consider three kinds of events (or interrupts). One is the timer tick event on a CPU. This causes the Time-Stamp Counter (TSC) on the CPU to increment, and also decrements the VMX-Timer associated with the active VM. If the VMX-Timer becomes 0, it causes a VM exit, which is then processed by the corresponding handler. Other events include those generated by a VMCALL instruction, and external interrupts. Both these cause a VM exit. The cause of all VM exits is stored in the subject’s VMCS, which the handler checks and takes appropriate action for.
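The two-stage address translation used by the Execute operation can be illustrated with a toy model, in which a single flat page map stands in for each of the 4-level structures; all names and sizes are our own:

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGES 16
#define PAGE_SIZE 4096u

/* One translation stage, modeled as a flat page map. */
typedef struct {
    bool     valid[PAGES];
    uint32_t frame[PAGES];
} pagemap_t;

/* Translate addr through one stage; false signals a fault. */
bool translate(const pagemap_t *pm, uint32_t addr, uint32_t *out) {
    uint32_t page = addr / PAGE_SIZE, off = addr % PAGE_SIZE;
    if (page >= PAGES || !pm->valid[page]) return false;
    *out = pm->frame[page] * PAGE_SIZE + off;
    return true;
}

/* Two-stage walk: CR3-rooted tables first (guest virtual to
   guest physical), then the EPT (guest physical to host). */
bool guest_to_host(const pagemap_t *cr3, const pagemap_t *ept,
                   uint32_t vaddr, uint32_t *haddr) {
    uint32_t gpa;
    return translate(cr3, vaddr, &gpa) && translate(ept, gpa, haddr);
}
```

A failed stage corresponds to the exception case of the Execute operation, where the exception handler is assumed to run.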
4 Policy Input Specification
The input specification to Muen is an XML file called a policy. It specifies details of the host processor, the subjects to be run, and a precise schedule of execution on each CPU of the host processor system. Details of the host processor include the number of CPUs, the frequency of the processor, and the available host physical memory regions. The policy also specifies, for each device (like a keyboard), the IO port numbers, the IRQ number, the vector number, and finally the subject to which the device's interrupts should be directed.
The policy specifies a set of named subjects, and, for each subject, the size and starting addresses of the components in its virtual memory. These components include the raw binary image of the subject, as well as possible shared memory “channels” it may have declared. A channel is a memory component that can be shared between two or more subjects. The policy specifies the size of each channel, and each subject that uses the channel specifies the location of the channel in its virtual address space, along with read/write permissions to the channel. Fig. 5 depicts a channel shared by two subjects, one with write permission and the other with read permission.
The schedule is a sequence of major frames, that are repeated in a cyclical fashion. Each major frame specifies a schedule of execution for each CPU, for a common length of time. Time is measured in terms of ticks, a unit of time specified in the policy. A major frame specifies for each CPU a sequence of minor frames. Each minor frame specifies a length in ticks, and the subject to run in this frame. A subject can be assigned to only one CPU in the schedule. An example scheduling policy in XML is shown in Fig. 6(a), while Fig. 6(b) shows the same schedule viewed as a clock. Each CPU is associated with one track in the clock. The shaded portion depicts the passage of time (the tick count) on each CPU.
Ticks are assumed to occur independently on each CPU, and can result in a drift between the times on different CPUs. The scheduling policy requires that before beginning a new frame, all CPUs must complete the previous major frame. The end of a major frame is thus a synchronization point for the CPUs.
5 Muen Kernel Generator
Given a policy, the Muen system generator generates the components of a processor system that is meant to run according to the specified schedule. This is depicted in Fig. 4, where the Muen toolchain generates the shaded components of the processor system, like the initial memory contents, page tables, and kernel code. We describe these components in more detail below.
The Muen toolchain first generates a layout of the memory components of all the subjects, in physical memory. The contents of these components (like the binary file for the binary component, and the zeroed-out contents of the memory channels) are filled into an image that will be loaded into physical memory before execution begins. Based on this layout, the toolchain then generates the page tables for each subject so that when the subject is running and its page table is loaded in the CR3/EPTP registers, the virtual addresses will be correctly mapped to the physical locations.
The Muen toolchain then generates a kernel for each CPU, to orchestrate the execution of the subjects according to the specified schedule on that CPU. The kernel is actually a template of code written in Spark Ada, and the Muen toolchain generates the constants for this template based on the given policy. The kernel uses generated data structures to store details like the page table address and VMCS address for each subject, and to store the state of the general-purpose registers of each subject. To implement scheduling, the kernel uses a multidimensional array representing the schedule for each CPU. Another generated structure represents the table which maps an interrupt vector to the corresponding destination subject and the destination vector to be sent to that subject. The kernel also uses a per-subject data structure to save pending interrupts when the destination subject is not active. These structures and variables are shown in Fig. 7.
The kernel knows the number of ticks elapsed on each CPU from the TSC register. It uses a shared variable called CMSC (“Current Major Start Cycle”) to keep track of the start of the current major frame. CMSC is initialized to the value of the TSC on the BSP, at the time the schedule begins. Thereafter it is advanced in a fixed periodic manner, based on the specified length of major frames, whenever a major frame is completed. This is depicted in Fig. 6(c).
We now explain the specific Init and Handler parts of the kernel. In the initialization phase the kernel sets up the VMCS for each subject. For this, the Init routine reads the structures generated by Muen to fill in fields like the page table addresses and the IP and SP register values in each subject’s VMCS. The kernel on each CPU finally sets the VMX-Timer value in the VMCS of the subject that begins the schedule, loads that VMCS’s address into the VMPTR, and does a VMLAUNCH.
The handler part of the kernel is invoked whenever there is a VM exit. If the exit is due to a VMX-Timer expiry, the handler checks whether it is at the end of the major frame by consulting the generated schedule. If it is not at the end of a major frame, it just increments the current minor frame. If it is at the end of a major frame, and not all CPUs have reached the end of the current major frame, the current CPU waits in a sense-reversal barrier. When all other CPUs reach the end of the current major frame, they cross the barrier and the frame pointers are updated by the BSP. The VMX-Timer for the subject to be scheduled next is set to the time remaining in the current minor frame, calculated using the fact that (TSC − CMSC) time has already elapsed. The kernel then does a VMLAUNCH for the subject.
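The timer computation described above can be sketched as follows (the names are ours): the ticks elapsed since the start of the major frame are TSC − CMSC, and the minor frame’s end is given as an offset from the major frame start.

```c
#include <stdint.h>

/* Time remaining in the current minor frame: the frame's end
   offset within the major frame, minus the ticks already elapsed
   in the major frame (TSC - CMSC). Clamped at 0 if the deadline
   has already passed. */
uint64_t vmx_timer_value(uint64_t tsc, uint64_t cmsc,
                         uint64_t minor_frame_end) {
    uint64_t elapsed = tsc - cmsc;  /* ticks since major frame start */
    return minor_frame_end > elapsed ? minor_frame_end - elapsed : 0;
}
```

Computing the slice from the absolute quantities TSC and CMSC, rather than accumulating per-frame deltas, prevents handler latency from drifting the schedule.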
If the exit is due to an external interrupt with some vector, the VM exit handler finds the destination subject and the destination vector corresponding to it from the generated routing table. The handler then sets the bit corresponding to the destination vector in the pending-interrupts structure of the destination subject. Whenever the destination subject is ready to handle the interrupt, the VM exit handler of the kernel injects the pending interrupt and clears its entry in the structure.
In this work we focus on Ver. 0.7 of Muen. The Muen toolchain is implemented in Ada and C, and comprises about 41K lines of code (LoC). The kernel template (in Spark Ada) is about 3K LoC.
6 Proof Overview
Given a policy p, let C_p denote the processor system generated by Muen, and let A_p denote an abstract machine spec for the system (we describe A_p in the next section). Our aim is to show that for each valid policy p, C_p refines A_p. We use the parametric refinement technique of Sec. 2 to verify the functional correctness of the Muen system; Fig. 8 shows how we achieve this, along with the proof artifacts and obligations. We first define a parametric program C(X) that models the generic system generated by Muen, so that for a given policy p, if v_p denotes the parameter values generated by Muen, then C[v_p] and C_p are behaviourally equivalent. In a similar way we define an abstract parametric program A(Y), so that with appropriate parameter values w_p, A[w_p] captures the abstract spec A_p. Next we show that C(X) parametrically refines A(Y) w.r.t. a condition φ. Finally, for a given policy p, we check that the parameter values v_p and w_p satisfy the condition φ.
In the next few sections we follow this outline: we define the parametric programs, establish the parametric refinement between them, and finally check the condition for given policy configurations.
7 Abstract Specification
We describe an abstract specification that implements, in a simple way, the intended behaviour of the system specified by a policy. In this specification, each subject is run on a separate, dedicated, single-CPU processor system. Each such system has its own CPU with registers, and its own array of bytes of physical memory. For each subject we have a similar-sized permissions array, which gives the permissions (read/write/exec/invalid) for each byte in its virtual address space. The policy maps each subject to a CPU of the concrete machine on which it is meant to run. To model this we use a set of logical CPUs (corresponding to the number of CPUs specified in the policy), and we associate with each logical CPU the (disjoint) group of subjects mapped to that CPU. Fig. 9 shows a schematic representation of the abstract system.
To model shared memory components like channels, we use a separate channel memory array, as depicted in Fig. 5. We assume a partial function which maps the address space of each subject to the channel memory. For any access to memory, it is first checked, using a Boolean function, whether that location lies in a channel; if so, the content is updated or fetched from the channel memory address given by the mapping function.
We use an array of bit-vectors (one for each subject) to implement interrupt handling. When bit v of the bit-vector for a subject is set, it indicates that interrupt vector v is pending and has to be handled by that subject.
There is no kernel in this system; instead, a supervisor processes events directed to a logical CPU or subject, and enables and disables subjects based on the scheduling policy and the current “time”. Towards this end it maintains a flag for each subject , which is set whenever the subject is enabled to run based on the current time.
To implement the specified schedule, it uses structures and , similar to the Muen kernel. However it keeps track of time using the clock-like abstraction depicted in Fig. 6(b), with the per-logical-CPU variables (counts each tick event modulo the total schedule cycle length), (reset at the end of every major frame), (ideal major frame pointer), (minor frame pointer), and (cycle counter). It also uses a global major frame pointer called , and a global cycle counter . We use two kinds of pointers here – ideal and global. The values of and together tell us the absolute time (number of ticks elapsed) on each logical CPU, and tells us the major frame based on this time. The global pointers however are constrained by having to synchronize on the major frame boundaries, and are updated only when every CPU has completed the major frame. We say a CPU is enabled if the global and ideal values of cycles match for the CPU, and the global and ideal values of majorfp match.
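The per-logical-CPU part of this clock abstraction can be sketched as follows (an illustrative model only, with our own variable names standing in for the per-CPU counters and frame pointers described above; the global, barrier-synchronized pointers are omitted):

```python
# Idealized per-logical-CPU clock: `ticks` counts tick events modulo the
# total schedule cycle length, `sliceticks` is reset at the end of every
# major frame, `minorfp`/`majorfp` track the current frames, and `cycles`
# counts completed schedule cycles. Names and layout are illustrative.
class LogicalClock:
    def __init__(self, major_frames):
        # major_frames: list of major frames, each a list of minor-frame lengths
        self.major_frames = major_frames
        self.cycle_len = sum(sum(mf) for mf in major_frames)
        self.ticks = 0       # position within the whole schedule cycle
        self.sliceticks = 0  # position within the current major frame
        self.majorfp = 0     # ideal major frame pointer
        self.minorfp = 0     # minor frame pointer
        self.cycles = 0      # completed schedule cycles

    def tick(self):
        self.ticks = (self.ticks + 1) % self.cycle_len
        self.sliceticks += 1
        mf = self.major_frames[self.majorfp]
        # advance the minor frame pointer when its deadline is reached
        if self.sliceticks == sum(mf[: self.minorfp + 1]):
            self.minorfp += 1
        # end of the major frame: reset and move to the next one
        if self.sliceticks == sum(mf):
            self.sliceticks = 0
            self.minorfp = 0
            self.majorfp = (self.majorfp + 1) % len(self.major_frames)
            if self.majorfp == 0:
                self.cycles += 1
```

The ideal pointers above advance freely with each tick; in the full abstraction the global pointers additionally wait until every CPU has completed the current major frame.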
In the operation the supervisor initializes the processor systems , permissions array , the channel memory , and also the schedule-related variables, based on the policy. The operation, given a logical CPU id, executes the next instruction on the subject machine currently active for that logical CPU id. If the CPU is not enabled, the operation is defined to be a no-op. An operation does not affect the state of other subject processors, except possibly via the shared memory . If the instruction accesses an invalid memory address, the system is assumed to shut down in an error state. Finally, for the operation, which is a tick/interrupt event directed to a logical CPU or subject, the supervisor updates the scheduling state, or pending event array, appropriately.
To represent the system concretely, we use an Ada program which we call . is a programmatic realization of , with processor registers represented as 64-bit numeric variables, and memory as byte arrays of size . The operations , , and are implemented as methods that realize the behaviour described above.
Finally, we obtain a parametric program from , by parameterizing it as illustrated in Sec. 2. Thus we declare constants like MAXSub, MAXCPU, MAXMajfr, and MAXMinfr to represent the universe of parameter values we allow. We then declare parameter variables of these size types, like (for “Abstract Number of Subjects”) of type 1..MAXSub, which will act as parameters to be initialized later. We call the list of parameters . We then declare other variables and corresponding arrays of these sizes (e.g. the array of per-subject CPUs will have size ).
By construction, it is evident that if we use the values for the respective parameters in , we will get a machine program which is equivalent in behaviour to .
8 Muen System as a Parametric Program
We now describe how we model the system generated by Muen as a parametric program. Let be a given policy. To begin with we define a machine program that represents the processor system generated by Muen. This is done similarly to the abstract specification in Sec. 7, except that we now have a single physical memory array which we call . Further, since the processor system makes use of the VT-x components, we need to model these components in as well. First, we represent each page table, identified by an identifier , as a size array of 64-bit numbers, with the translation of an address being modelled as . We model each VMCS as a structure with fields as defined in . We use per-CPU variables and that contain a VMCS identifier and a page table identifier respectively. We also implement instructions like VMWRITE and VMPTRLD as method calls which read/update these program variables and structures. We also include all the kernel structures described in Sec. 5 (like and ) as global structures in . For each subject , contains a page table id which we call .
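The way the VT-x components become ordinary program state can be sketched as follows (an illustrative model only; the class, field, and function names are our own, not Muen's):

```python
# Minimal sketch of modelling VT-x components as program state: each page
# table is an array giving the translation of an address, each VMCS is a
# record of fields, and instructions like VMPTRLD/VMWRITE become method
# calls on per-CPU variables. All names are illustrative.
class CPUState:
    def __init__(self):
        self.current_vmcs = None   # identifier of the VMCS loaded on this CPU
        self.current_pt = None     # identifier of the active page table

    def vmptrld(self, vmcs_id):
        self.current_vmcs = vmcs_id              # VMPTRLD: select active VMCS

    def vmwrite(self, vmcs_table, field, value):
        vmcs_table[self.current_vmcs][field] = value  # VMWRITE: update a field

def translate(page_tables, pt_id, addr):
    # page table translation modelled as a plain array/map lookup
    return page_tables[pt_id][addr]
```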
The operations , , and are implemented as method calls, similar to the abstract spec. The code comes from the Init component of the kernel (see Sec. 5). We also initialize the physical memory with the image produced by the Muen toolchain. In the method, memory accesses are translated via the active page table to access the physical memory . The implementation of the operation comes from the Handler part of the kernel code (in particular for handling VM exits due to timer expiry and interrupts).
Finally, we move from to a parametric program . This is done in a similar way as in Sec. 2 and 7. Apart from the constants like MAXSub, we use the parameters , , , , , , and . We refer to this list of parameters as , and refer to the resulting parametric program as . Once again, for an appropriate list of values corresponding to a given policy , we believe that is equivalent to , which in turn is equivalent to .
9 Parametric Refinement Proof
We now show that the parametric version of the Muen system conditionally refines the parametric abstract spec . From Sec. 2.2, this requires us to identify the condition , and find a gluing relation on the state of parametric programs and such that the refinement conditions (type), (init), and (sim) are satisfied.
We use a condition whose key conjuncts are the following conditions:
: The page tables associated with a subject must be injective in that no two virtual addresses, within a subject or across subjects, may be mapped to the same physical address, unless they are specified to be part of a shared memory component. More precisely, for each , and addresses , , we have:
: For each subject , the permissions (rd/wr/ex/present) associated with an address should match with the permissions for in .
: For each subject , no invalid virtual address is mapped to a physical address by page table .
: For each valid address in a subject the initial contents of in should match with that of in the physical memory .
: The values of the parameters (like , , and ) in the concrete should match with those in the abstract.
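For concreteness, the injectivity condition could be formalized along the following lines (a sketch in our own notation, writing $PT_s$ for the page-table map of subject $s$, and $\mathit{Shared}(s_1,v_1,s_2,v_2)$ for the predicate that the two addresses belong to a declared shared memory component):

```latex
% Sketch of the injectivity condition: distinct (subject, address) pairs
% never map to the same physical address unless declared shared.
\forall s_1, s_2\ \forall v_1, v_2:\quad
  PT_{s_1}(v_1) = PT_{s_2}(v_2) \;\wedge\;
  \neg\,\mathit{Shared}(s_1, v_1, s_2, v_2)
  \;\implies\; s_1 = s_2 \,\wedge\, v_1 = v_2
```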
The gluing relation has the following key conjuncts:
The CPU register contents of each subject in the abstract match with the register contents of the CPU on which the subject is active, if the subject is enabled, and with the subject descriptor, otherwise.
For each subject , and valid address in its virtual address space, the contents of and should match.
The value on each CPU in the concrete, should match with how much the ideal clock for the subject’s logical CPU is ahead of the beginning of the current major frame in the abstract.
The major frame pointer in the abstract and concrete should coincide, and the minor frame pointers should agree for enabled CPUs.
On every enabled CPU, the sum of the VMX Timer and should equal the deadline of the current minor frame on that CPU.
A CPU is waiting in a barrier in the concrete whenever the CPU is disabled in the abstract and vice-versa.
The contents of the abstract and concrete pending event tables should agree.
We carry out the adequacy check for , described in Sec. 2.2, by constructing a “combined” version of and that has the disjoint union of their state variables, as well as a joint version of their operations, and phrase the adequacy conditions as pre/post conditions on the joint operations. In particular, for the (init) condition we need to show that assuming the condition holds on the parameters, if we carry out the joint operation, then the resulting state satisfies the gluing relation . To check the (sim) condition for an operation , we assume a joint state in which the gluing relation holds, and then show that the state resulting after performing the joint implementation of , satisfies . Once again this is done assuming that the parameters satisfy the condition . We carry out these checks using the Spark Ada tool  which given an Ada program annotated with pre/post assertions, generates verification conditions and checks them using provers Z3 , CVC4 , and Alt-Ergo .
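The shape of the (sim) obligation on the combined program can be sketched as follows (an illustrative harness only; `phi`, `gluing`, `step_abs`, and `step_conc` are our own stand-ins for the condition, gluing relation, and the two operation implementations):

```python
# Shape of the (sim) proof obligation: assuming the parameter condition
# holds and the joint state is glued, executing the joint implementation
# of an operation must re-establish the gluing relation.
def check_sim(op, params, abs_state, conc_state,
              phi, gluing, step_abs, step_conc):
    assert phi(params)                    # parameters satisfy the condition
    assert gluing(abs_state, conc_state)  # precondition: states are glued
    a2 = step_abs(op, params, abs_state)      # abstract step of the joint op
    c2 = step_conc(op, params, conc_state)    # concrete step of the joint op
    assert gluing(a2, c2)                 # postcondition: gluing re-established
    return a2, c2
```

In the actual proof these pre/post assertions are written as SPARK contracts on the joint Ada operations and discharged by the SMT provers rather than executed.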
We faced several challenges in carrying out this proof to completion. A basic requirement of the gluing relation is that the abstract and physical memory contents coincide via the page table map for each subject. After a write instruction, we need to argue that this property continues to hold. However, even if one were to reason about a given concrete page table, the prover would never be able to handle this due to the sheer size of the page table. The key observation we needed was that the actual mapping of the page table is irrelevant: all one needs is that the mapping be injective, as in condition . With this assumption the proof goes through easily. A second challenge was proving the correctness of the way the kernel handles the tick event. This is a complex task that the kernel carries out, and we needed a case-split into 8 subcases to break the reasoning into manageable subgoals for both the engineer and the prover. The presence of the barrier synchronization before the beginning of a new major frame adds to the complexity. The use of auxiliary variables (like an array of CPUs) and the case-split helped us to carry out this proof successfully.
Finally, modelling the interrupt handling system, including the injection of interrupts in a virtualized setting, was also a complex task. Here too we had to split the proof of correctness of the interrupt event handling into multiple subcases (5 in this case).
Table 1 shows details of our proof effort in terms of lines of code (LoC) and lines of annotations (LoA) in the combined proof artifact. All proof artifacts used in this project are available at https://bitbucket.org/muenverification/.
10 Checking Condition
We now describe how to efficiently check that for a given policy , the parameters generated by Muen and those of the abstract specification satisfy the condition . Let be the maximum size of the virtual address space, and the actual used space, for a subject. is typically (approx. 256T) while is of the order of 1.5G. Let denote the size of a paging structure in bytes, which is of the order of 500K. A naive way to check the conditions (injectivity) and (invalid virtual addresses are not mapped) would be to use a bit array for physical memory and iterate over all virtual addresses, marking the bit corresponding to each mapped physical address. We ensure that we never mark a bit that is already marked () and that we never map an invalid virtual address (). However, this runs in time proportional to , and such an algorithm would take days to run. In contrast, we give a way to check in time proportional to for each subject, and in time proportional to plus the size of a page table .
To check and , we exploit the fact that the Muen toolchain generates an intermediate representation of physical memory (called the “B-policy”) which defines the physical address, size and the initial content of the physical memory segments used by Muen. We first check that these components have been laid out disjointly in the physical layout given in the B-policy, by sorting them based on base address and checking for overlap. Next we check whether the page tables are generated according to the layout in the B-policy. For this purpose we translate each valid physical address of a subject through its corresponding page table and check that it matches with the address generated by the tool chain. Algorithm 1 shows our algorithm for checking conditions and .
Here is a function to check read/write/exec permissions of the address in the page table of subject .
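The two phases of this check can be sketched as follows (an informal reconstruction of Algorithm 1; the function and parameter names are our own):

```python
# Sketch of the layout/page-table check: first verify that the memory
# components in the B-policy are laid out disjointly in physical memory,
# then verify that each subject's page table agrees with that layout.
def check_disjoint(components):
    """components: list of (base, size) physical segments from the B-policy."""
    prev_end = 0
    for base, size in sorted(components):
        if base < prev_end:           # overlap with the previous segment
            return False
        prev_end = base + size
    return True

def check_page_tables(subjects, translate, expected):
    """Each valid virtual address of each subject must translate, via its
    page table, to the physical address the toolchain assigned."""
    for s, valid_addrs in subjects.items():
        for v in valid_addrs:
            if translate(s, v) != expected(s, v):
                return False
    return True
```

Because only the used virtual addresses of each subject are translated, this runs in time proportional to the used address space rather than the full virtual address space.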
To check efficiently, we observe that the translation of a valid virtual address makes use of certain 64 bit words in the paging structure. The present-bits in only these words should be set, and all others should be unset. To check this we use an array PTBitArray of bits, one for each 64 bit word in the paging structure. We translate each valid virtual address, and set the bits in the array PTBitArray that correspond to each word accessed in the translation. After translating all valid addresses of the subject, we check that there is no word in the paging structure with the present-bit set but associated array bit unset. This is shown in Algorithm 2. We note that this procedure runs in time .
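This present-bit check can be sketched as follows (an informal reconstruction of Algorithm 2; `words_used` and `present_bits` are our own stand-ins for the real paging structures):

```python
# Sketch of the present-bit check: mark every 64-bit paging-structure word
# touched while translating some valid virtual address, then require that
# no word with its present-bit set was left unmarked.
def check_no_stray_mappings(valid_addrs, words_used, present_bits, num_words):
    pt_bit_array = [False] * num_words       # one bit per 64-bit word
    for v in valid_addrs:
        for w in words_used(v):              # words consulted by this translation
            pt_bit_array[w] = True
    # a present word never touched by a valid translation would let an
    # invalid virtual address be mapped
    return all(not present_bits[w] or pt_bit_array[w]
               for w in range(num_words))
```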
Condition (initial memory contents) is straightforward to check. We simply compare the content of the image generated by Muen with each individual component’s content (which is a specified file or “fill” element) byte by byte.
Condition is checked by algorithmically generating the abstract parameter data structures and ensuring that the Muen generated ones conform to them.
We implemented our algorithms above in C and Ada, using the Libxml2 library to process policy files, and a Linux utility xxd to convert the Muen image and individual files from raw format to hexadecimal format.
We ran our tool on 16 system configs, 9 of which (D7-*,D9-*) were available as demo configurations from Muen. The remaining configs (DL-*) were configured by us to mimic a Multi-Level Security (MLS) system from . Details of representative configs are shown in Table 2. For each configuration the table columns show the number of subjects, number of CPUs, the size of physical memory needed on the processor, the ISO image size generated by the Muen toolchain, the time taken by our condition checking tool, and finally whether the check passed or not.
We used the 3 configs D9-* (from Ver. 0.9 of Muen) as seeded faults to test our tool. Ver. 0.9 of Muen generates implicit shared memory components, and this undeclared sharing was correctly flagged by our tool.
The average running time on a configuration was 5.6s. The experiments were carried out on an Intel Core i5 machine with 4GB RAM running Ubuntu 16.04.
11 Basic Security Properties
While we believe that the property we have proved for Muen in this paper (namely conformance to an abstract specification via a refinement proof) is the canonical security property needed of a separation kernel, security standards often require some specific basic security properties to be satisfied. We discuss below how some of these properties mentioned in [12, 16] follow from the verification exercise we have carried out for Muen.
Let us consider a system generated by Muen, and the abstract specification , for a given policy . Let us further assume that the generated parameters satisfy the condition check of Step 2. Then we know that every (error-free) sequence of operations in the concrete system can be simulated, via the gluing relation built on injective page tables, by the abstract system . We now argue that the system must satisfy the specific security properties below.
No exfiltration This property states that the execution of a subject does not influence the state of another subject. In our setting, we take this to mean that a write to a memory address by a subject does not affect the memory contents of subject , assuming address is not part of a declared memory channel between and . Let us consider a sequence of operations in leading to state where subject performs an exfiltrating write to address . Now by our refinement proof, the abstract system must be able to simulate the same initial sequence of operations and reach a state which is glued to via the gluing relation , which in turn is based on page table maps satisfying the injective property . Now when makes an exfiltrating write to address to take the concrete state from to , the memory contents of must change in going from state to . However when we perform the same write in the abstract state , the memory contents of do not change in going from to . It follows that the state cannot be glued via to the concrete state . This contradicts the simulation property of our proof. Thus, it follows that no subject can perform an exfiltrating write. A similar argument holds to show that cannot change the other components (like the register contents) of the state of .
No infiltration This property states that an operation by a subject should not be influenced by the state of another subject . More precisely, suppose we have two concrete states and of in which the state of subject is identical. In our setting this means that and are glued, respectively, to abstract states and in which the states of are identical. Now suppose subject performs an operation (say a read of a memory location) in and to reach and respectively. Then the state of in and should be identical. However, this follows from our proof, since by construction the state of in and obtained by performing the same operation in the abstract states and respectively, must be identical. Since and must be glued to and respectively, the property follows.
Temporal separation This property states that subjects are executed according to the specified schedule, and that while they are inactive their state does not change. The latter could happen for instance if the register state of the previously executing subject was exposed by not restoring the current subject’s state correctly. Once again this property follows in our setting since every sequence of operations by must be matched by the abstract specification, and by construction the abstract specification executes according to the specified policy and the state of a subject does not change while inactive.
We note that the property of non-bypassability from  would require the above three properties to hold.
Kernel integrity This property states that the kernel state, including its code and data, is not affected by the operations carried out by a subject. This property is called the tamper-proof property in . Though this property is not directly modelled in our setting (note that we model the kernel code and data as a high-level program that cannot be accessed by subjects), while checking condition we also check that the page tables generated by Muen satisfy the injective property across all memory components, including the kernel components, as specified in the B-policy of Muen. This effectively ensures the integrity of the kernel.
The validity of the verification proof carried out in this work depends on several assumptions. The implicit assumptions include the fact that processor hardware components like page table translation and VMX instructions behave the way we have modelled them.
In addition there are several explicit assumptions related to the way we have modelled the abstract specification of how the SK is expected to behave:
When scheduling actually begins after the initialization routines, the TSCs on all CPUs have the same value.
If any subject performs an illegal instruction (like accessing an invalid memory address) the system halts in an error state.
The tick count on the 64-bit TSC counter does not overflow (a mild assumption, as this would take years to happen). Similarly, we assume that a minor frame length is never more than 2^32 ticks, as the VMX Timer field is only 32 bits wide.
If any of these assumptions are violated, the proof will not go through, and in fact we would have counter-examples to conformance with the abstract specification.
Finally, we show the various components used in our verification in Fig. 10. Each box represents an automated tool (full boxes) or a manual transformation carried out (dashed boxes). Components that we trust in the proof are unshaded, while untrusted components are shown shaded.
We would like to mention that the developers of Muen were interested in adding our condition checking tool to the Muen distribution, as they felt it would strengthen the checks they carry out during the kernel generation. We have updated our tool to work on the latest version (v0.9) of Muen, and handed it over to the developers.
13 Related Work
We classify related work based on general OS verification, verification of separation kernels, and translation validation based techniques.
Operating System Verification.
There has been a great deal of work in formal verification of operating system kernels in the last few decades. Klein  gives an excellent survey of the work till around 2000. In recent years the most comprehensive work on OS verification has been the work on seL4 , which gave a refinement-based proof of the functional correctness of a microkernel using the Isabelle/HOL theorem prover. They also carry out an impressive verification of page table translation . The CertiKOS project  provides a technique for proving contextual functional correctness across the implementation stack of a kernel, and also handles concurrency. Other recent efforts include verification of a type-safe OS , security invariants in ExpressOS , and the Hyperkernel project .
While verification of a general purpose OS is a more complex task than ours (in particular, a general kernel has to deal with dynamic creation of processes, while in our setting we have a fixed set of processes and a fixed schedule), the techniques used there cannot readily reason about generative kernels like Muen. We would also like to note here that while it is true that in such verification one often needs to reason about parametric components (like a method that computes based on its parameters), the whole programs themselves are not parametric. In particular, a standard operating system is not parametric: it begins with a concrete initial state, unlike a parametric program in which the initial state has uninitialized parameters. Thus the techniques developed in this paper are needed to reason about such programs.
Finally, we point out that none of these works address the use of VT-x virtualization support.
Verification of Separation Kernels.
There has been substantial work in formal verification of separation kernels/hypervisors. seL4  can also be configured as a separation kernel, and the underlying proof of functional correctness was used to prove information flow enforcement. Heitmeyer et al.  proved data separation properties using a refinement-based approach for a special-purpose SK called ED, in an embedded setting. As far as we can make out, these systems are not generative in nature, and either do not use or do not verify hardware virtualization support. Additionally, unlike our work, none of these works (including the OS verification works) are post-facto: they were developed along with the verification.
Dam et al.  verify a prototype SK called PROSPER, proving information flow security on the specification and showing a bisimulation between the specification and the implementation. PROSPER works for a minimal configuration with exactly two subjects, and is not a generative system. The Verisoft XT project  attempted to prove the correctness of Microsoft’s Hyper-V hypervisor  and Sysgo’s PikeOS, using VCC . While the Hyper-V project was not completed, the PikeOS memory manager was proved correct in . Sanan et al.  propose an approach towards verification of the XtratuM kernel  in Isabelle/HOL, but the verification was not completed.
Translation Validation Techniques.
Our verification problem can also be viewed as a translation validation problem, where the Muen generator translates the input policy specification to an SK system. The two kinds of approaches here either aim to verify the generator code itself (for example, the CompCert project ), which can be a challenging task in our much less structured, post-facto setting; or aim to verify the generated output for each specific instance . Our work can be viewed as a via media between these two approaches: we leverage the template-based nature of the generated system to verify the generator conditionally, and then check whether the generated parameter values satisfy our assumed conditions.
In this work we have proposed a technique to reason about template-based generative systems, and used it to carry out effective post-facto verification of the separation property of a complex, generative, virtualization-based separation kernel. In future work we plan to extend the scope of verification to address concurrency issues that we presently ignore in this work.
Acknowledgement. We thank the developers of Muen, Reto Buerki and Adrian-Ken Rueegsegger, for their painstaking efforts in helping us understand the Muen separation kernel. We also thank Arka Ghosh for his help in the proof of interrupt handling.
- (2010) Modeling in Event-B: System and Software Engineering. Cambridge University Press.
- (2018) GNAT Pro Ada Toolsuite. Note: https://www.adacore.com/gnatpro
- (2011) CVC4. In Proc. 23rd Computer Aided Verification (CAV 2011), Snowbird, USA, Lecture Notes in Computer Science, Vol. 6806, pp. 171–177.
- (2011) Proving memory separation in a microkernel by code level verification. In Proc. 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORC Workshops 2011), Newport Beach, USA, pp. 25–32.
- (2009) Verifying the PikeOS Microkernel: First Results in the VerisoftXT Avionics Project. ResearchGate.
- (2013) Muen – An x86/64 Separation Kernel for High Assurance. Technical report, University of Applied Sciences Rapperswil (HSR).
- (2009) VCC: A practical system for verifying concurrent C. In Proc. 22nd International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2009), Munich, pp. 23–42.
- (2012) SMT techniques and their applications: from Alt-Ergo to Cubicle. Thèse d’habilitation, Université Paris-Sud. Note: http://alt-ergo.lri.fr/
- (2010) Partitioned Embedded Architecture Based on Hypervisor: The XtratuM Approach. In Eighth European Dependable Computing Conference (EDCC-8 2010), Valencia, Spain, pp. 67–72.
- (2013) Formal verification of information flow security for a simple ARM-based separation kernel. In Proc. ACM Conference on Computer and Communications Security (CCS 2013), pp. 223–234.
- (2008) Z3: An efficient SMT solver. In Proc. 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2008), Budapest, Hungary, pp. 337–340.
- (2007) U.S. Government Protection Profile for Separation Kernels in Environments Requiring High Robustness, Version 1.03, 29 June 2007. Note: https://www.niap-ccevs.org/Profile/Info.cfm?id=65
- (2014) Efficient refinement checking in VCC. In Verified Software: Theories, Tools and Experiments (VSTTE 2014), Vienna, Austria, July 17–18, 2014, pp. 21–36.
- (2019) INTEGRITY Multivisor. Note: https://www.ghs.com/products/rtos/integrity_virtualization.html
- (2016) CertiKOS: An extensible architecture for building certified concurrent OS kernels. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), Savannah, GA, USA, November 2–4, 2016, pp. 653–669.
- (2006) Formal specification and verification of data separation in a separation kernel for an embedded system. In Proc. 13th ACM Conference on Computer and Communications Security (CCS 2006), Alexandria, pp. 346–355.
- (2011) The Art of Multiprocessor Programming. Morgan Kaufmann.
- (1985) Data Refinement Refined (Draft). Technical Report, Oxford University Computing Laboratory, Oxford, UK.
- (1975) Proof of correctness of data representation. In Language Hierarchies and Interfaces, International Summer School, Marktoberdorf, Germany, 1975, Lecture Notes in Computer Science, Vol. 46, pp. 183–193.
- (2018) Intel 64 and IA-32 Architectures Software Developer’s Manual – Volume 3C. Intel Corporation.
- (2014) Comprehensive formal verification of an OS microkernel. ACM Transactions on Computer Systems, Vol. 32, Article 2, pp. 1–70.
- (2009) seL4: Formal verification of an OS kernel. In Proc. 22nd ACM Symposium on Operating Systems Principles (SOSP 2009), Big Sky, pp. 207–220.
- (2009) Operating system verification – an overview. Sadhana 34(1), pp. 27–69.
- (2009) Verifying the Microsoft Hyper-V Hypervisor with VCC. In Symposium on Formal Methods (FM 2009), Lecture Notes in Computer Science, Vol. 5850, Eindhoven, pp. 806–809.
- (2006) Formal certification of a compiler back-end, or: Programming a compiler with a proof assistant. In Proc. 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2006), Charleston, USA, pp. 42–54.
- (2019) LynxSecure Separation Kernel Hypervisor. Note: http://www.lynx.com/products/secure-virtualization/lynxsecure-separation-kernel-hypervisor/
- (2013) Verifying security invariants in ExpressOS. In Proc. Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013), Houston, pp. 293–304.
- (2009) XtratuM: A Hypervisor for Safety Critical Embedded Systems. In 11th Real Time Linux Workshop.
- (2017) Hyperkernel: Push-button verification of an OS kernel. In Proc. 26th Symposium on Operating Systems Principles (SOSP 2017), Shanghai, China, pp. 252–269.
- (1998) Translation validation. In Proc. Tools and Algorithms for Construction and Analysis of Systems (TACAS 1998), Lecture Notes in Computer Science, Vol. 1384, pp. 151–166.
- (1981) Design and verification of secure systems. In Proc. 8th Symposium on Operating System Principles (SOSP 1981), Pacific Grove, USA, pp. 12–21.
- (1982) Proof of separability: A verification technique for a class of security kernels. In International Symposium on Programming, 5th Colloquium, Torino, Italy, April 6–8, 1982, pp. 352–367.
- (2014) Separation Kernel Verification: The XtratuM Case Study. In Proc. 6th International Conference on Verified Software: Theories, Tools and Experiments (VSTTE 2014), Vienna, Austria, pp. 133–149.
- (2018) PikeOS 4.2 Hypervisor. Note: https://www.sysgo.com/products/pikeos-hypervisor/
- (2004) Verifying the L4 virtual memory subsystem. In Proc. NICTA Formal Methods Workshop on Operating Systems Verification, G. Klein (Ed.), pp. 73–97.
- (2019) VxWorks MILS Platform. Note: https://www.windriver.com/products/product-overviews/vxworks-mils-product-overview/
- (2010) Safe to the last instruction: Automated verification of a type-safe operating system. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2010), Toronto, Canada, pp. 99–110.