Microkernels Meet Recursive Virtual Machines
Ford, Hibler, Lepreau, Tullmann, Back, Clawson (1996)
What kind of paper is this?
- Trying to be a big idea -- nested structure as a bridge between
VMs and microkernels
- Really describes a particular system.
The big idea
- Stack virtual machines on top of a microkernel.
Microkernel runs on bare hardware.
- Microkernel is a VMM that exports a virtualizable machine, not an actual machine.
- VMs on this VMM are called nesters or nested processes.
- Address spaces are composed from other address spaces (think Liedtke).
- Share CPU via hierarchical scheduling.
- Global capability model.
Design Goal
- Maintain maximum performance in the presence of deep virtual machine layering.
- Nested process architecture
- Preserve the utility of pure nested virtual machines without incurring the
exponential overhead
- Notice that this paper predates Disco!
- Also note that many of the problems cited are research areas that were later
investigated in the resurgence of virtualization.
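
To make the "exponential overhead" point concrete, here is a minimal sketch of the two cost models. It assumes that in a pure recursive VM every layer multiplies the cost of an operation by some factor, while in the nested process model only the layers that choose to interpose add a fixed cost. The function names and the numbers (a factor of 3, a 0.1 per-layer cost) are hypothetical and are not taken from the paper.

```python
def pure_nesting_cost(base_cost, per_layer_factor, depth):
    """Every layer re-emulates the layer below, so the factors multiply."""
    return base_cost * per_layer_factor ** depth


def interposed_only_cost(base_cost, per_layer_cost, interposing_layers):
    """Only layers that actually interpose add a (fixed) cost."""
    return base_cost + per_layer_cost * interposing_layers


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the curves.
    for depth in range(5):
        print(depth,
              pure_nesting_cost(1.0, 3.0, depth),
              interposed_only_cost(1.0, 0.1, depth))
```

The first column grows geometrically with depth; the second grows linearly, and only for operations that some nester chose to interpose on.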
Terminology
- VMM: Exports a virtual machine with the same architecture as the HW
- Virtual machine simulator: Exports something other than the HW architecture
(think JVM).
- Pure recursive machine: Emulates exactly and entirely the layer below.
Key Concepts
- State Encapsulation
- Allows nesting one process inside another.
- Child process state is a subset of parent's.
- Parent can see child's state (this is not always the case
in a VMM architecture)
- Think of the fact that you can create a file image of
a virtual machine and move it around; that is a symptom of
state encapsulation.
- Requirements
- Hierarchical resource management (destroy a child and you
destroy all descendants of that child)
- References must be relative to the parent; not absolute
to some underlying base
- Border Control
- Ability to monitor communication across a border
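
The two state-encapsulation requirements above can be sketched as a small tree model. This is only an illustration under assumed names (`NestedProcess`, `resolve`, `destroy` are hypothetical, not Fluke calls): each process holds its children, names are resolved relative to the process doing the lookup, and destroying a process destroys its whole subtree.

```python
class NestedProcess:
    """Toy model of a nested process; not the Fluke API."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}   # child name -> NestedProcess
        self.alive = True
        if parent is not None:
            parent.children[name] = self

    def resolve(self, relative_path):
        """Follow child names starting from this process (relative, not absolute)."""
        node = self
        for part in relative_path:
            node = node.children[part]
        return node

    def state(self):
        """A parent's state contains the state of all of its descendants."""
        return {self.name: [c.state() for c in self.children.values()]}

    def destroy(self):
        """Destroy this process and, recursively, every descendant."""
        for child in list(self.children.values()):
            child.destroy()
        self.alive = False
        if self.parent is not None:
            del self.parent.children[self.name]
```

For example, with a chain root -> a -> b -> c, `a.resolve(["b", "c"])` finds c without knowing where a itself sits in the hierarchy, and `a.destroy()` takes b and c down with it while leaving root untouched.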
Nested Process Architecture
- Fluke is the architecture between each stackable layer
- Three components:
- Basic instruction set -- provided by HW -- any ISA that supports
state encapsulation and border protection; eliminate sensitive but
unprivileged instructions (e.g., CPUID).
- Low level system call API -- provided by the microkernel; ensures that a
parent never needs to interpose child calls; just needs to control resources.
- IPC-based Common Protocols -- implemented at each layer -- performance
critical activities are in the Low Level API; any other services are implemented
via the IPC protocols
Address Spaces
- Stripped down processes or tasks
- Can support multiple threads of control via thread objects, scheduled
by the kernel.
- No malloc/free, just remap
- The movement of memory among parents/children sounds a lot like Liedtke
- Parents donate to children
- Parents can revoke
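
A minimal sketch of the donate/revoke idea, assuming each address space records where every local page came from and which child mappings were derived from it. The class and method names (`AddressSpace`, `map_physical`, `donate`, `revoke`, `translate`) are hypothetical and are not the Fluke system call names.

```python
class AddressSpace:
    """Toy address space built by remapping pages from a parent space."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}  # local page -> (source space or None, source page/frame)
        self.derived = {}   # local page -> list of (child space, child page)

    def map_physical(self, page, frame):
        """Only the root of the hierarchy maps real frames (source is None)."""
        self.mappings[page] = (None, frame)

    def donate(self, page, child, child_page):
        """Map one of our own pages into a child's address space."""
        if page not in self.mappings:
            raise KeyError(f"{self.name} does not own page {page}")
        child.mappings[child_page] = (self, page)
        self.derived.setdefault(page, []).append((child, child_page))

    def revoke(self, page):
        """Remove every mapping derived from this page, recursively."""
        for child, child_page in self.derived.pop(page, []):
            child.revoke(child_page)
            del child.mappings[child_page]

    def translate(self, page):
        """Walk up the chain of spaces until a physical frame is reached."""
        source, source_page = self.mappings[page]
        if source is None:
            return source_page
        return source.translate(source_page)
```

There is no allocator here, only remapping: a child never owns memory outright, so a revoke by any ancestor cascades down to everything derived from the revoked page.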
Capabilities
- All references between low level objects are represented as
capabilities
- Primitive objects have capability slots
- Are virtual memory references to kernel objects (and the
kernel objects contain slots for capabilities)
- Passing capabilities allows short circuiting
- Capabilities allow selective interposition of a child
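
The short-circuiting versus selective interposition point can be sketched as follows. This is a toy model under stated assumptions: a capability is modelled as a plain object reference to a port, and `Port`, `Capability`, and `make_tracer` are hypothetical names, not Fluke objects. The parent decides, per capability, whether to hand the child the original reference (calls bypass the parent entirely) or a reference to its own port that forwards the call.

```python
class Port:
    """A service endpoint; the handler stands in for the server's code."""

    def __init__(self, handler):
        self.handler = handler

    def invoke(self, message):
        return self.handler(message)


class Capability:
    """A reference to a port (modelled here as an ordinary object reference)."""

    def __init__(self, port):
        self._port = port

    def invoke(self, message):
        return self._port.invoke(message)


def make_tracer(target_cap, log):
    """Return a capability that logs each message, then forwards it to target_cap."""
    def handler(message):
        log.append(message)
        return target_cap.invoke(message)
    return Capability(Port(handler))


# Capabilities held by a parent nester (hypothetical services).
file_server = Capability(Port(lambda m: f"file:{m}"))
memory_server = Capability(Port(lambda m: f"mem:{m}"))

log = []
child_caps = {
    "fs": make_tracer(file_server, log),  # interposed: parent sees every call
    "mem": memory_server,                 # short-circuited: passed through as-is
}
```

Here only file-system traffic pays the interposition cost; memory requests from the child go straight to the server the parent itself was using.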
Scheduling
- Not yet implemented (tested in user space)
- Scheduling is relative
- Threads can give time to other threads
- Scheduling hierarchy does NOT need to correspond to process hierarchy
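
A rough sketch of "threads can give time to other threads", assuming a scheduler is just a thread that donates each time slice it receives to one of its clients. The round-robin policy and the `Thread` class are assumptions made for illustration; the notes do not say which policies the authors tested.

```python
class Thread:
    """A thread that either runs (leaf) or donates its slice to a client."""

    def __init__(self, name, clients=None):
        self.name = name
        self.clients = clients or []  # non-empty means this thread is a scheduler
        self._next = 0

    def run(self, log):
        """Consume one time slice: run if a leaf, otherwise donate round-robin."""
        if not self.clients:
            log.append(self.name)
            return
        client = self.clients[self._next % len(self.clients)]
        self._next += 1
        client.run(log)


# The scheduling tree is built from threads, independent of which
# process each thread belongs to.
build_sched = Thread("build", [Thread("compile"), Thread("link")])
root_sched = Thread("root", [Thread("editor"), build_sched])

log = []
for _ in range(4):
    root_sched.run(log)
```

Each scheduler only decides how to split the time it was itself given, which is what makes the scheduling relative.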
High Level Protocols
- Parent interface is the only one that all nesters interpose; it
provides name servicing.
- This is how you get your initial file descriptors
- The nesters can, however, implement whatever they need in these
APIs.
Implementation
- The Microkernel
- Raw x86 microkernel
- Written in C using Flux OS Toolkit
- Pre-emptible
- Multiprocessor locking
- Contains built in drivers for: serial port, network, disk (allows
out of kernel drivers too)
- Implements Common Protocols
- Called root process
- The Libraries
- Library implements POSIX system calls
- Two versions: full BSD library; stripped down one for nesters
- Nesting library (optional):
Parent-side of the C library -- spawn children
and forward child's common protocol requests
- Nesters support various standard kernel services:
- POSIX process management -- manages multiple children -- every UNIX
process is a child of this nester
- Demand paging -- pages anonymous memory to disk
- Checkpointing -- can interpose at any level, so you could be the
child of the root and checkpoint the entire system.
Checkpointing capabilities is a little tricky.
- Debugging -- handles exceptions from a child
- Tracing -- simply records message exchanges; could use it to build
a security monitor
Fun comments:
- Contrast with L3: microkernel written for portability, readability, and
flexibility.
- Microkernel performance hit
Experimental Results
- Microbenchmarks that allegedly dictate application performance.
- Goal: Evaluate overhead of nesting (not of the base system)
- Context: 128 MB RAM and a 200 MHz processor ...
- What are we testing:
- Memory management
- VM system
- Process management
- File reads
- Computation
- Baseline: Raw microkernel (and comparison to FreeBSD): performance seems
quite adequate, although the computation test appears to take the biggest
hit (33% overhead), which I find surprising.
- Interposition overhead (run variable number of tracers): only
read test slows down noticeably -- roughly 10% per nester.
- From microkernel to full functionality:
- Computation, VM, and read are mostly unaffected by adding the process and
memory management nesters
- Memtest and forktest take a hit immediately
- Read test takes a hit only when you add the tracer