RedLeaf: Isolation and Communication in a Safe Operating System

Narayanan, Huang, Detweiler, Appel, Li (2020)

We built a thing to explore a specific area.
This feels a bit different from our standard, "we built a thing" papers.
They really seem to focus on trying to understand what can be achieved in the domain of OS and OS architecture given a safe language.
I found that I really wanted to race to the related work section (in this case, Section 2) to figure out exactly how they claim this work differs from past work in, e.g., ML-based OS, Java-based OS, Sing#-based OS.
The intro tells yet another story: the artifact is really just a test of the principles that the authors lay out for language-based isolation.

Lines like this really frustrate me: "Early operating systems applied safe languages for operating system development [9, 14, 19, 25, 34, 55, 71, 80]." I want to know what systems they are thinking of without having to go hunting through all the references.
- 9. Never heard of the work
- 14. Emerald (talk to Norm!)
- 19. LISP
- 25. Inferno (???)
- 34. Smalltalk-80
- 55. JavaOS
- 71. Pilot
- 80. Cedar
It's unclear that all of these systems really apply. Further, they aren't doing the "related" part of related work that makes it so valuable -- explaining how the current system is different from these past efforts. Ah, they do this in the final paragraph, but that assumes that I have read everything and committed to memory everything they told me so that their comparison makes sense. This is a big "ask" of the reader.

Many efforts at imposing fine-grain isolation in operating systems have not been super successful.
This failure makes systems more brittle and less secure.
Given that Rust is specifically designed to facilitate writing systems code in a safe(r) language, designing an OS specifically around language isolation is a winning strategy.
This exploration leads to a collection of (ideally) broadly applicable principles and mechanisms for using safe languages to provide fault isolation.

Define: A language-based isolation domain is the unit of information hiding, loading and fault isolation.
No pointer in a domain can access the private heaps of another domain. (Shared heaps facilitate cross-domain communication.)
Exchangeable types can be safely shared across domains; restricting sharing to these types is what enforces heap isolation (the principle above).
Track ownership of all objects
Limit interfaces to exchangeable types
Mediate all cross domain invocations
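
To make the principles above concrete for myself, here is a minimal sketch in ordinary Rust. It is not RedLeaf's API: the names `Exchangeable`, `Shared`, `DomainId`, and `demo_transfer` are all hypothetical, and a plain `Box` stands in for a shared-heap allocation. The idea it illustrates is that a marker trait limits what may cross a domain boundary, and that moving a handle by value gives single ownership for free.

```rust
/// Marker trait for types allowed to cross a domain boundary.
/// The name `Exchangeable` is hypothetical, chosen to echo the notes.
pub trait Exchangeable {}

impl Exchangeable for u8 {}
impl Exchangeable for u64 {}
impl<T: Exchangeable, const N: usize> Exchangeable for [T; N] {}
// Deliberately no impl for `&T` or raw pointers: a pointer into a
// private heap must never be exchangeable.

/// Hypothetical domain identifier.
pub type DomainId = u64;

/// Hypothetical handle to an object on the shared heap.
/// A `Box` stands in for the real shared-heap allocation.
pub struct Shared<T: Exchangeable> {
    value: Box<T>,
    owner: DomainId,
}

impl<T: Exchangeable> Shared<T> {
    pub fn new(owner: DomainId, value: T) -> Self {
        Shared { value: Box::new(value), owner }
    }

    pub fn owner(&self) -> DomainId {
        self.owner
    }

    pub fn get(&self) -> &T {
        &self.value
    }

    /// Consumes the handle, so the sending side cannot use it afterwards;
    /// the compiler rejects any later use of the old binding.
    pub fn transfer(mut self, new_owner: DomainId) -> Self {
        self.owner = new_owner;
        self
    }
}

/// Creates a buffer owned by domain 1 and hands it to domain 2.
pub fn demo_transfer() -> (DomainId, DomainId, u8) {
    let buf = Shared::new(1, [7u8; 4]);
    let before = buf.owner();
    let buf = buf.transfer(2);
    (before, buf.owner(), buf.get()[0])
}
```

Trying `Shared::new(1, &some_local)` fails to compile because references do not implement the marker trait, which is the "limit interfaces to exchangeable types" principle in miniature.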

Microkernel
A collection of isolated domains implement the kernel -- this confuses me a little -- many of the isolated domains listed would be servers in a microkernel system, so the isolated domains make sense; but they make it sound like the microkernel itself is implemented as isolated domains (still in the introduction).
Demonstrate how the isolated domains lead to crash recovery in the context of device drivers
Demonstrate performance with device drivers as well

Microkernel features
- start threads of execution
- Domain loading
- Scheduling
- Memory management
- Interrupt forwarding
Isolated domains provide:
- device drivers
- OS personality
- User applications
Everything runs in Ring 0
Domains are all in safe Rust (microkernel and trusted libraries can use unsafe Rust).
Threads move across domains using the same stack (vaguely like lxc).
Mediate cross domain calls using proxies
References to objects are capabilities. (In Rust, traits are interface requirements.)
TCB = Rust compiler, Rust core libraries, microkernel, RedLeaf crates that implement hardware interfaces and low-level abstractions, the RedLeaf IDL compiler and environment.
Assume devices are not malicious.
No protection against side channels.
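
A small sketch of the "references to objects are capabilities, traits are interface requirements" note above, again in plain Rust rather than anything from RedLeaf itself. `BlockDevice`, `RamDisk`, `FileSystem`, and `demo_capability` are hypothetical names I made up. The point is only that a domain can reach exactly what it has been handed a reference to, and can use it only through the trait's methods.

```rust
/// Hypothetical interface that a driver domain exports.
pub trait BlockDevice {
    fn read(&self, sector: usize) -> Option<u8>;
}

/// Toy driver: one byte per "sector", private to this type.
struct RamDisk {
    sectors: Vec<u8>,
}

impl BlockDevice for RamDisk {
    fn read(&self, sector: usize) -> Option<u8> {
        self.sectors.get(sector).copied()
    }
}

/// Toy file system domain. It holds a trait object, not a `RamDisk`,
/// so it can call `read` but cannot reach the driver's internals.
pub struct FileSystem {
    disk: Box<dyn BlockDevice>,
}

impl FileSystem {
    pub fn new(disk: Box<dyn BlockDevice>) -> Self {
        FileSystem { disk }
    }

    pub fn first_byte(&self) -> Option<u8> {
        self.disk.read(0)
    }
}

/// Wires the two together the way a loader might hand out a capability.
pub fn demo_capability() -> Option<u8> {
    let disk = RamDisk { sectors: vec![7, 8, 9] };
    let fs = FileSystem::new(Box::new(disk));
    fs.first_byte()
}
```

Without the `Box<dyn BlockDevice>` there is no way for `FileSystem` to name the disk at all, which is what makes the reference behave like a capability.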

Unit of fault isolation and information hiding
A domain starts with a reference to the microkernel syscall interface (allows for creating threads, allocating memory, creating synchronization objects).
Domains can also define other entry functions and references to objects.
Threads can move between domains (and therefore outlive the domain itself).
Define fault isolation: Given a crashed domain ...
- All threads in the domain can unwind and return an error
- Subsequent attempts to use the domain fail
- All resources in the domain can be safely reclaimed.
Mechanisms to provide fault isolation
- Private and shared heaps: no pointers from a domain can reference another domain's private heap, stack, or global data. Shared heap data can be transmitted across domains. These shared heap objects have a single owner at any point in time.
- Shared objects are composed only of exchangeable types; validated by the RedLeaf IDL compiler.
- Proxies facilitate cross-domain calls. A proxy 1) verifies that the domain being called is alive, 2) captures the state at the entry to the new domain (allows rollback on failure), 3) moves ownership from caller to callee, and 4) wraps all trait references passed.
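
The fault-isolation definition and the proxy steps above can be sketched in user-space Rust as follows. This is my own illustration under loud assumptions: `std::panic::catch_unwind` stands in for the kernel's captured entry state and unwinding, and `Counter`, `FlakyCounter`, `CounterProxy`, `CallError`, and `demo_proxy` are hypothetical names. Step 4 (wrapping trait references that are passed along) is not shown.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

/// Errors a caller sees instead of a crash (hypothetical names).
#[derive(Debug, PartialEq)]
pub enum CallError {
    DomainDead,
    DomainCrashed,
}

/// Hypothetical interface exported by a domain.
pub trait Counter {
    fn add(&self, buf: Vec<u32>) -> u32;
}

/// Toy domain that "crashes" (panics) on empty input.
struct FlakyCounter;

impl Counter for FlakyCounter {
    fn add(&self, buf: Vec<u32>) -> u32 {
        if buf.is_empty() {
            panic!("simulated domain crash");
        }
        buf.iter().sum()
    }
}

/// Proxy that mediates every call into the domain.
pub struct CounterProxy {
    inner: Box<dyn Counter>,
    alive: AtomicBool,
}

impl CounterProxy {
    pub fn new(inner: Box<dyn Counter>) -> Self {
        CounterProxy { inner, alive: AtomicBool::new(true) }
    }

    pub fn add(&self, buf: Vec<u32>) -> Result<u32, CallError> {
        // Step 1: refuse the call if the domain already crashed.
        if !self.alive.load(Ordering::SeqCst) {
            return Err(CallError::DomainDead);
        }
        // Steps 2 and 3: `catch_unwind` plays the role of the captured
        // entry state, and `buf` is moved into the callee, so ownership
        // passes from caller to callee.
        match catch_unwind(AssertUnwindSafe(|| self.inner.add(buf))) {
            Ok(sum) => Ok(sum),
            Err(_) => {
                // The thread unwinds back here and returns an error;
                // later calls into the domain fail fast.
                self.alive.store(false, Ordering::SeqCst);
                Err(CallError::DomainCrashed)
            }
        }
    }
}

/// One good call, one crashing call, one call after the crash.
pub fn demo_proxy() -> Vec<Result<u32, CallError>> {
    let proxy = CounterProxy::new(Box::new(FlakyCounter));
    vec![proxy.add(vec![1, 2, 3]), proxy.add(vec![]), proxy.add(vec![4])]
}
```

The three results line up with the three bullets in the fault-isolation definition: the crashing call returns an error to its caller, the next call is rejected, and the moved-in buffer is dropped during unwinding, so nothing leaks.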

Multiple domains: core kernel, file system, networking, device drivers, user domains.

Overhead of domain isolation: 3-4x faster than seL4; comparable to VMFUNC (one-way), 1-2x faster than 2-way VMFUNC.
Rust overhead: if you write in a "Rust-like" way, you get a 25%-ish penalty; if you write "C code in Rust" it's as fast as C. (So clearly I should just keep writing code like I do and adopt Rust syntax :-).
Device Drivers: Used due to their tight performance budget.
- In general, RedLeaf is competitive with DPDK.
- When you run in separate domains, then for single packets, RedLeaf incurs a penalty (as we might expect).
- However, when you do many packets, then all the RedLeaf implementations are pretty much comparable to DPDK.
Applications:
- Maglev load balancer: Way better than Linux; 20-30% slower than DPDK.
- Network-attached kv-store: Performance degradation of 15-40% relative to a C application using DPDK.
- Web server: Kicks butt!