Hyperkernel: Push-Button Verification of an OS Kernel

Nelson, Sigurbjarnarson, Zhang, Johnson, Bornholt, Torlak, Wang (2017)

What kind of paper is this?

Partly an incremental advance over the authors' earlier push-button verification of file systems, and partly a big deal: the idea of automatically verifying a kernel.
The definition and application of finite APIs seems new and potentially big.
One suggested story line is, "In exokernels or microkernels, designers have tried to find the right API abstractions that separate concerns to enable the kernel to be simple and reliable. We look at ways to design kernel APIs to enable verifiability. Our big idea is finitization, a new approach to API design that enables push-button verification."

My Original Story

Operating systems are (among) the most critical pieces of a software stack. Once upon a time, some folks in Australia built a verified microkernel, called seL4. Unfortunately this was a colossal undertaking -- it required years and years of work for a relatively tiny kernel. However, being able to verify the correctness of an operating system is highly desirable. The authors present a different approach: redesigning the interfaces so that they can be specified in a declarative, functional manner. They then developed techniques to combine the implementation, expressed as LLVM IR, and the specifications to produce either a proof of correctness or a test case demonstrating noncompliance. Using the Z3 solver, they were able to verify the 50-call kernel, and everyone lived happily ever after.

The System

Based on xv6 (MIT teaching OS)
50 system calls and other trap handlers
Kernel is not pre-emptive
Use their Python specification language to express APIs so they can be translated to SMT expressions for verification with ...
Compile the C implementation to LLVM IR and translate that IR to SMT, so that it can then be reconciled with the SMT expressions from the interface specification.
Finitizes the kernel interface (no unbounded loops or recursion)
Separate user/kernel address space (kernel is identity mapped) -- use Intel virtualization support to give kernel and user processes separate page tables.
Uses Z3
Low proof burden (relative to seL4)
It's relatively fast: proves the kernel in about 15 minutes on an 8-core system.
Omits kernel initialization and glue code (i.e., save/restore registers)
Whoa: This is cool, "We have developed several applications, including a Linux binary emulator and a web server that can host the Git repository of this paper."

Finite Interfaces

Assumptions: interrupts are disabled; kernel and user are in separate address spaces.
Every trap handler can be expressed as a set of traces of bounded length.
Ensure scalability by making sure that the bounds are independent of anything big (e.g., number of pages, size of a file, etc).
Example
- Dup semantics guarantee that you get the lowest FD available.
- That means that an implementation must scan the FD table.
- This is then a function of the number of FDs, which grows and therefore can make verification slow.
- So, they use a different interface: dup(oldfd, newfd). In practically all cases where you care which FD gets allocated, you know precisely where you want it allocated (e.g., creating a pipe with stdin/stdout). If you don't care, you can just pick a largish random number (and if it's in use the kernel call can fail).
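
A minimal sketch of the difference, in Python rather than the kernel's C. The names NR_FDS, dup_lowest, and dup_finite are hypothetical, and the FD table is just a list. The point is only that the first version's loop grows with the table size, while the second does a fixed number of checks.

```python
NR_FDS = 16  # hypothetical per-process FD table size
FREE = None  # marker for an unused slot

def dup_lowest(fdtable, oldfd):
    """POSIX-style dup: scan for the lowest free slot (loop bound depends on table size)."""
    if not (0 <= oldfd < len(fdtable)) or fdtable[oldfd] is FREE:
        return -1
    for newfd, entry in enumerate(fdtable):
        if entry is FREE:
            fdtable[newfd] = fdtable[oldfd]
            return newfd
    return -1  # table is full

def dup_finite(fdtable, oldfd, newfd):
    """Finite-interface dup: the caller names the slot; a fixed number of checks, no loop."""
    if not (0 <= oldfd < len(fdtable)) or not (0 <= newfd < len(fdtable)):
        return -1
    if fdtable[oldfd] is FREE or fdtable[newfd] is not FREE:
        return -1
    fdtable[newfd] = fdtable[oldfd]
    return newfd

table = [FREE] * NR_FDS
table[0] = "pipe-read-end"
print(dup_finite(table, 0, 5))  # slot 5 is free, so this succeeds
print(dup_finite(table, 0, 5))  # slot 5 is now taken, so this fails
```

If the second call fails, user space simply retries with another slot number; the retry loop lives outside the kernel, so it is not part of what has to be verified.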

API Specifications

Written in Python
First, define the kernel state (e.g., FDtable, PID, etc).
Second, define state transitions for each trap handler.
Optionally include high level specifications for properties you desire of the state machine.
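
The three pieces could look roughly like the following. This is plain Python, not the authors' specification language (which produces SMT expressions), and KernelState, spec_dup, and refcnt_consistent are hypothetical names. The state is cut down to one process's FD table plus a per-file reference count.

```python
from dataclasses import dataclass, field

NR_FDS = 4    # hypothetical sizes, kept tiny for illustration
NR_FILES = 3

@dataclass
class KernelState:
    # fd -> file index, or None when the slot is unused
    fdtable: list = field(default_factory=lambda: [None] * NR_FDS)
    # file index -> number of FDs that refer to it
    refcnt: list = field(default_factory=lambda: [0] * NR_FILES)

def spec_dup(old, oldfd, newfd):
    """State transition for dup(oldfd, newfd): returns (return value, new state)."""
    valid = (0 <= oldfd < NR_FDS and 0 <= newfd < NR_FDS
             and old.fdtable[oldfd] is not None
             and old.fdtable[newfd] is None)
    if not valid:
        return -1, old  # failure leaves the state unchanged
    f = old.fdtable[oldfd]
    new = KernelState(list(old.fdtable), list(old.refcnt))
    new.fdtable[newfd] = f
    new.refcnt[f] += 1
    return newfd, new

def refcnt_consistent(s):
    """High-level property: each file's refcount equals the number of FDs naming it."""
    return all(s.refcnt[f] == s.fdtable.count(f) for f in range(NR_FILES))

s0 = KernelState(fdtable=[0, None, None, None], refcnt=[1, 0, 0])
ret, s1 = spec_dup(s0, 0, 2)
print(ret, s1.fdtable, s1.refcnt, refcnt_consistent(s1))
```

The transition is written as a pure function from old state to new state, which is what makes it easy to compare against an implementation one step at a time.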

Verification

Two Proof Obligations:
1. The kernel implementation is a refinement of the state-machine specification.
  - Requires mapping from kernel data structures to abstract kernel state.
    - Translates both specification and IR implementation into SMT
    - Checks that they move in lock step for every transition
2. The state machine specification satisfies the declarative specification
  - Basically test for unsatisfiability of the negation of the refinement property. Use symbolic execution of the IR -- if the negation is unsatisfiable, the implementation is verified; otherwise the satisfying assignment is the counterexample.
  - Do a similar negation proof for the declarative specification and the state transitions.
Since we're using Z3, if we have a bug, we get a counter example.
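
A toy version of the first proof obligation, to make "lock step" and "counterexample" concrete. This sketch swaps in exhaustive enumeration over a tiny state space for the SMT query, so it is not how Hyperkernel does it; all names (spec_dup, impl_dup_buggy, impl_dup_fixed, abstraction, find_counterexample) are hypothetical.

```python
from itertools import product

NR_FDS = 3  # hypothetical, tiny so exhaustive search is cheap

def spec_dup(fdtable, oldfd, newfd):
    """Specification over abstract state: a tuple mapping fd -> file id (None if unused)."""
    ok = (0 <= oldfd < NR_FDS and 0 <= newfd < NR_FDS
          and fdtable[oldfd] is not None and fdtable[newfd] is None)
    if not ok:
        return -1, fdtable
    new = list(fdtable)
    new[newfd] = fdtable[oldfd]
    return newfd, tuple(new)

def impl_dup_buggy(raw, oldfd, newfd):
    """Implementation over concrete state: a list where 0 means unused.
    Bug: it forgets to check that newfd is currently free."""
    if not (0 <= oldfd < NR_FDS and 0 <= newfd < NR_FDS) or raw[oldfd] == 0:
        return -1
    raw[newfd] = raw[oldfd]
    return newfd

def impl_dup_fixed(raw, oldfd, newfd):
    """Same as above, with the missing check added."""
    if not (0 <= oldfd < NR_FDS and 0 <= newfd < NR_FDS):
        return -1
    if raw[oldfd] == 0 or raw[newfd] != 0:
        return -1
    raw[newfd] = raw[oldfd]
    return newfd

def abstraction(raw):
    """Equivalence function: map the concrete table to the abstract one."""
    return tuple(None if x == 0 else x for x in raw)

def find_counterexample(impl):
    """Search for a state and arguments where implementation and specification disagree."""
    for raw in product([0, 1, 2], repeat=NR_FDS):
        for oldfd, newfd in product(range(-1, NR_FDS + 1), repeat=2):
            concrete = list(raw)
            impl_ret = impl(concrete, oldfd, newfd)
            spec_ret, spec_state = spec_dup(abstraction(raw), oldfd, newfd)
            if impl_ret != spec_ret or abstraction(concrete) != spec_state:
                return raw, oldfd, newfd
    return None  # no disagreement found on this state space

print(find_counterexample(impl_dup_buggy))  # a concrete state plus arguments exposing the bug
print(find_counterexample(impl_dup_fixed))  # None
```

Enumeration only works here because the toy state space is tiny. The SMT formulation asks the same question symbolically ("is there any state and input where the two disagree?"), which is why an unsatisfiable answer counts as a proof and a satisfiable one yields a concrete failing case.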
Maintaining Atomic Trap Handling
- Interrupts delayed until trap is complete (i.e., interrupts are turned off)
- DMA into a special region which is treated as volatile
TCB
- Specifications (both types)
- The theorems and equivalence functions
- Kernel initialization and glue code
- Verifier
- Toolchain: Z3, Python, LLVM
- Hardware.
Overview of the Hyperkernel
- Draws on: Dune, exokernel, seL4.
- Kernel/User each in own address space (Dune). Kernel = host; User = guest.
- Interrupts delivered directly to user space through the interrupt descriptor table. (Removes the kernel from most exception paths.)
- Hyperkernel makes resource allocation decisions (Exo). Resources are given back from user space to avoid looping in the kernel.
- Patterns to make API finite
  - Use reference counters to track resource usage.
  - No fork/exec: createProcess creates a process with three pages and leaves everything else to user level.
  - Use arrays where possible.
  - Allow linked data structures where necessary.
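
A sketch of how a reference counter removes a kernel-side loop, in Python with hypothetical names (Kernel, alloc_page, free_page, reap). The assumption here is that a process can only be reclaimed once user space has returned every page it held; the counter lets the kernel check that in constant time instead of scanning the page table.

```python
NR_PAGES = 8  # hypothetical number of physical pages

class Kernel:
    def __init__(self):
        self.page_owner = [None] * NR_PAGES  # page number -> owning pid, or None
        self.nr_pages = {}                   # pid -> count of pages it owns
        self.alive = set()

    def create_process(self, pid):
        if pid in self.alive:
            return -1
        self.alive.add(pid)
        self.nr_pages[pid] = 0
        return 0

    def alloc_page(self, pid, pn):
        """The caller picks the page number, so no search for a free page is needed."""
        if pid not in self.alive or not (0 <= pn < NR_PAGES):
            return -1
        if self.page_owner[pn] is not None:
            return -1
        self.page_owner[pn] = pid
        self.nr_pages[pid] += 1
        return 0

    def free_page(self, pid, pn):
        """User space hands pages back one call at a time."""
        if pid not in self.alive or not (0 <= pn < NR_PAGES):
            return -1
        if self.page_owner[pn] != pid:
            return -1
        self.page_owner[pn] = None
        self.nr_pages[pid] -= 1
        return 0

    def reap(self, pid):
        """Constant-time check: refuse unless the counter says nothing is still held."""
        if pid not in self.alive or self.nr_pages[pid] != 0:
            return -1
        self.alive.remove(pid)
        del self.nr_pages[pid]
        return 0

k = Kernel()
k.create_process(1)
k.alloc_page(1, 3)
print(k.reap(1))   # refused: one page is still held
k.free_page(1, 3)
print(k.reap(1))   # succeeds once the count is back to zero
```

The cost is an extra invariant to state and check (the counter must always equal the number of pages the process owns), which is the kind of property the declarative specification is for.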
Manual Checking
- Representation invariants hold before creating init.
- Check that the initial state satisfies all state predicates.
- Statically check the hyperkernel call graph to estimate max stack depth. Then verify that that depth fits in a fixed amount of space (4 KB).
- Verify that symbols in IR do not overlap.
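
The stack-depth check can be sketched as a longest-path computation over the call graph. The graph and frame sizes below are made up for illustration, and max_depth is a hypothetical helper; the only assumption carried over from the notes is that the graph is acyclic (no recursion) and that the budget is 4 KB.

```python
from functools import lru_cache

STACK_LIMIT = 4096  # bytes; the 4 KB budget mentioned above

# Hypothetical call graph (caller -> callees) and per-function frame sizes in bytes.
CALLS = {
    "sys_dup": ["check_fd", "get_file"],
    "check_fd": [],
    "get_file": ["check_fd"],
}
FRAME = {"sys_dup": 64, "check_fd": 32, "get_file": 48}

@lru_cache(maxsize=None)
def max_depth(fn):
    """Worst-case stack bytes used by fn plus its deepest chain of callees.
    Assumes the call graph is acyclic; recursion would make this loop forever."""
    deepest_callee = max((max_depth(c) for c in CALLS[fn]), default=0)
    return FRAME[fn] + deepest_callee

worst = max(max_depth(fn) for fn in CALLS)
print(worst, worst <= STACK_LIMIT)
```

Because every path through a finite interface is bounded, the worst case is a simple maximum over a finite graph rather than something that has to be proved inductively.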
Evaluation (I love that they call it Experience)
- Super hard to evaluate.
- Would it have caught xv6 bugs? (Many.)
- How big is it: roughly 20,000 lines of C.
- Took several researchers about a year.
- Three implementations: one in Rust, one in C using conventional address spaces, and the current one.
- Used their system to debug their code during development. (Counter examples were useful.)
- Verification takes about 45 minutes on a single core (about 15 minutes on 8 cores).
- Performance: Linux vs Linux Emulation vs Hyperkernel port.
