Operating System Support for Safe and Efficient Auxiliary Execution
Jing and Huang (OSDI 2022)
What kind of paper is this?
- Best of both worlds (a way to have both strong isolation and visibility) for helper
tasks (auxiliary execution).
The Story
- Many applications have background or auxiliary tasks (e.g., tuning, debugging,
reconfiguration, deadlock detection, garbage collection, checkpointing).
- In most cases, these tasks run in the same address space as the application, which
sacrifices isolation: a fault or vulnerability in the auxiliary task can crash or corrupt the application.
- Running in a separate process provides better isolation, but at the cost of worse
performance and less visibility into the application it serves.
- Can we have the best of both worlds? Yes! Orbit is a new isolation abstraction
that provides both strong isolation and visibility.
Goals for Orbits
- Strong Isolation (why?)
- Convenient Programming Model
- Automatic State Synchronization
- Controlled Alteration
- First-class Entity
Key Challenges
- Allow auxiliary entity to inspect state from the main entity
- Minimize performance overhead of strong isolation
API
- Created like a thread: orbit_create( ... entry function ...) -- once created,
the orbit can only be invoked via specific orbit execution calls.
- Invoke the orbit via orbit_call (synchronous) or orbit_call_async -- for async calls,
the main task's state is snapshotted BEFORE the call returns.
- Retrieve answer from async orbit: orbit_future_get
State Synchronization
- Data is synced from MAIN thread/process to an ORBIT.
- Data is synced only in orbit areas, which are collections of contiguous virtual pages.
- Orbit areas have the same VA in both the main and orbit.
- So all state that needs to be synchronized must live in an orbit area.
- When an orbit function is called, before the API returns, all pages in
the main task's orbit area are mapped into the orbit and are marked copy-on-write in
main and non-writable in the orbit. (This has to happen on EVERY orbit call.)
Orbit Execution
- Challenges
- A call requires crossing two address spaces.
- Calls can be sync, async, or concurrent (but concurrent just means
they get queued; there is no real concurrency).
- Mechanisms
- Task queue per orbit -- queue entry contains set of marked PTEs
- Each call gets a unique ID.
- Orbits function as single-threaded workers
- orbit_task_return: returns result of last orbit call
- Semaphore indicates whether an orbit has work to do
- Policies
- Calls to an orbit are processed in FIFO order.
- Check for pending returns; if any exist, signal the last orbit thread to wait.
- Privileged orbits can modify main program state.
- Only in orbit areas
- What about concurrent updates by main and orbit?
- Control updates via pull_orbit and push_orbit -- orbit authors place pushes
in code explicitly, and the authors of main explicitly pull. (I am not convinced this
actually works.) You can also push function pointers (e.g., to kill a thread).
Optimizations
- Retain orbit mappings after termination; on next call, keep any that have
not changed in main. (I'd really like data that indicates how often this happens.)
- Keep region bitmaps to avoid traversing too many PTEs (but I thought orbit
areas were small?).
- Support choice of COW vs. COPY (but this assumes you know a lot about
what is going on, and I bet it can vary a lot between invocations).
- Introduce delegate structs to deal with the case where we have large
structs and only some fields need to go in the orbit area. This is basically
just another level of indirection, so you've complicated every structure
that needs this.
Eval
- Research Questions
- Is orbit general enough to rewrite auxiliary tasks in real applications?
- Can orbit-based tasks provide strong isolation?
- How much overhead does orbit introduce?
- Why did they have to do this under QEMU?
- Microbenchmarks (overhead).
- orbit_create as a function of the number of orbit areas: these results confuse me -- orbit_create
is way faster than fork, but my understanding is that almost all
versions of fork do copy-on-write, so shouldn't these be the same?
"Most modern systems, including Linux, use a form of copy-on-write,
where the pages in the process memory are not copied at the time
of the fork call, but later when the parent or child first writes
to the page. That is, each page starts out as shared, and remains
shared until either process writes to that page; the process that
writes gets a new physical page (with the same virtual address)."
(One plausible answer: even a COW fork still duplicates the entire page table and
VMA hierarchy plus other process state, whereas orbit_create only has to set up
the declared orbit areas, which are typically a small slice of the address space.)
- orbit_call: as expected, time increases almost linearly with orbit area size
(it is comparable to fork). (How does it compare to a regular function call?)
- Applications (fault isolation)
- Do the application unit tests actually test auxiliary tasks?
- Fault injection: Null pointer dereferences -- main task keeps running and orbit gracefully
restarts.
- Fault injection: over-allocation (ditto)
- Fault injection: CPU hog
- None of these fault injection results are surprising as we are in a different
address space. However, are these the kind of auxiliary failures that traditionally
cause app failures? (It would have been nice to see any kind of crude analysis of
bug reports to see if this were the case.)
- Real world bug tests (4): Similarly -- they picked bugs that could be isolated into
orbits (but would you actually move backup selection into an orbit?)
- Applications (performance)
- End to end benchmarks show essentially no difference.
- Delegate objects are a huge win (unsurprisingly).
- Maintaining mapping is also a big win (unsurprisingly).
- Code Changes: quite small (does this include use of delegates??)