CPSC 508 Final Project (2021)
Project Proposal & Research Plan Due: 5:00 PM February 9, 2021
1st Status Meeting: Early Week of March 1, 2021
2nd Status Meeting: Week of March 16, 2021
In-class Presentations: April 6/8, 2021 (depending on how many
different projects we have, we may need only one of these days)
First Draft Due: 9:00 PM April 4, 2021
Final Project Due 5:00 PM April 30, 2021
The goal of the final project is to provide the opportunity for you
to conduct systems research. The size of the project can
vary, but thinking of it as a conference paper is probably a good model.
History of a Paper outlines a paper from initial
submission to final paper.
It includes the original (rejected) submission (extended abstract) and reviews,
another (accepted) submission and reviews,
and the final paper.
This collection should give you an idea of how to give and respond
to constructive criticism.
It will also give you a sense of what we mean
by "conference paper."
Teams / Project
Teams: You may choose to complete your final project alone or as a team.
However, we strongly encourage the final project to be done in teams of
two graduate students or up to four undergraduate students. If you feel that you have
a project sufficiently large to warrant more people, come talk to us.
Projects overlapping other courses/research/...
Projects may also be undertaken in cooperation with other graduate
courses, but any such project must be approved by the professors
of both courses. Not surprisingly, we expect more depth and work
for a project that is satisfying two class requirements.
Similarly, if you wish to undertake a project related to your own
research, we will permit it, but you must demonstrate how what we've
learned in CPSC 508 influences your work and/or ways in which your research
would have been different had you not been also conducting a project
in CPSC 508.
In other words, your project in CPSC 508 must extend work you
would normally have done in some new and/or different way.
Either way: please come and talk to us if your project will overlap
with other courses/research.
For this project, you need to pose a question, design a framework
in which to answer the question, conduct the research, and write
up your experience and results. There will be five deliverables for this project:
- Project Proposal and Research Plan (20%)
Hint: if you hand it in early, we will try to give you feedback within 24 hours.
That will maximize the time you have to work on the project.
We therefore strongly encourage you to try
to hand it in early (unheard of, we know).
The proposal part should begin with a single-paragraph fairy tale: what is the
story you would like to tell?
Then, more formally or seriously describe your
project. You should clearly motivate and state the research
question you are investigating.
Provide a few sentences of explanation about why you
find this to be an interesting question, why it is important,
and how it qualifies as research.
The research plan is a more comprehensive document. It should
include the following components (the numbers in parentheses are
an estimate of the number of pages that you might
need for each section; it is just an estimate; in practice you should write
exactly as much as you need to write to convey answers to the questions we pose):
- Related Work (1): Related work falls into two categories.
Background readings are those that are required for you to
understand the area and problem you are attacking. By the time
you complete your proposal, we expect that you have both identified
and completed (some of) the background reading. You should be able to explain
the background in sufficient detail that any of your classmates would
understand what you are doing and why.
The other category of related work we call contextual work.
This includes prior work upon which you are building (if it
wasn't needed in the background), the work of others who are solving the
same problem you are, but doing so in different ways, work in adjacent
areas that influences what you've done.
The purpose of this part of the related work section is to help a
reader place your work in the research landscape and truly understand its contribution.
You need not have conducted an exhaustive literature search by
the time that you submit the proposal, but you should know what
work is out there and, at a minimum, in what areas you need to search.
Consider this: You need to know that you are undertaking new research
and not merely repeating assignment 1 -- doing something someone else
has already done.
If you are not familiar with the related work, you have no way of
knowing whether your work is new.
Understand that this means that by the time you submit your proposal,
you have a concrete idea of both the problem and some ideas about
how you are tackling the problem.
If you think earlier related work is flawed and want to correct
those flaws, this would be the section
where you describe that. Even if you have not yet read all the
papers you intend to read, this section should include a list of
papers that you plan to read, an indication of other areas in which
you think you might need to find some papers, and at least a couple
of specific comparisons between prior work and what you're doing (you
just need a few sentences for each of these).
- Conclusion Format (1 paragraph): What sorts of conclusions do you
expect to state? Obviously, you do not know what the results are
in advance, but you should know the format of those results. For
example, if you are comparing two systems, you would expect to be
able to say things like, "System A outperforms System B in
these instances and System B outperforms System A in these other
instances." Naturally, after you have done the research, you
should also be able to explain why. You might provide some template
sentences that you expect to have in your conclusion (and/or template
figures and tables you expect to fill in).
- Experimental Setup (1): What experiments will you conduct?
Why? What question is each experiment designed to answer?
What do you hope to learn from each experiment? What measurement
tools will you use? How will you determine if your measurements are
accurate? What tests will you conduct conditionally? (For example,
if we learn X from experiment 1 then we must do A else we'll do B.)
What problems do you expect? The more detail you include in this
section, the more able we will be to give useful feedback and the more
easily you will be able to conduct your research. Think hard about
what the important questions are and how you can answer them, and then
map those questions and answers back to explicit tasks.
- Resources Needed (<1): What equipment, software, and tools will you
need? This section is particularly important so that we can make
sure we have the right stuff available. If you have any doubts about
the availability of the resources you'll need, please speak to one of us
as soon as possible! You do not want to find out in the
middle of the term that the hardware and/or software you need is unavailable.
- Schedule (< 1): (be specific; include dates and milestones).
- Status Meeting
We encourage you to come talk to us about your
project or schedule other meetings with us as the need arises.
At a minimum, we want to meet with you twice before the extended
abstract is due; see the schedule above.
These meetings are for your benefit. We expect to answer questions
you may have, ask you questions about what you've done, brainstorm
about what to do next, etc. If you haven't done anything, you will
get little value out of these meetings.
Come with questions we can help you answer.
- First Draft (30%)
This is the version of the paper that we will review at the
"Mock Program Committee."
You should have some of your research completed by this point.
The draft should contain all the parts of the paper, although
it may have preliminary results and may be missing some results.
For any results you do not have, we want you to clearly state that
you do not have results, but we want you to write the results section.
That is, we want you to think through what results you hope to have and
how you'll present them.
This can be super fun, because you can MAKE UP WHATEVER RESULTS YOU WANT.
The reason we do that is because when you get actual results, you then have
something against which to compare them.
When your actual results do not match your predictions, one of two things is true:
either your initial intuition was wrong, or your system/tests are wrong.
If the former is true, that's really interesting -- your final paper will
then explain what you originally thought and why it was not right.
If the latter is true, you get to do some debugging.
This first submission should contain a complete introduction, background,
description of your research, related work, and (preliminary) conclusions.
You should be able to write significant parts of this immediately
after your project proposal is turned in, so please, please, please
don't write this all the night before it is due.
The better and more complete the first draft, the more valuable the
input that we and your classmates can give you.
- In-class Presentation (10%)
Each group will present a short talk on their research during class.
You should plan on approximately a 10 minute presentation and 2-5 minutes
for questions and answers.
You can think of the in-class presentation as being a short conference talk.
You will not have time to present all the details and subtleties of
your work, but you should be able to motivate the audience and explain
the important results of your work.
After your presentation, your classmates should want to read your paper.
The presentation is a great way to make sure that they understand what
you're trying to do, so that there is no confusion when they
read/review your project.
- Final Report (40%)
The final report is a research paper. We expect that most reports will be
approximately 10-15 "conference pages" including graphs,
tables, diagrams, and references.
You should complete the writing early enough that you have time to reread
your work and critique it with the rigor that you applied to Assignment 1.
State shortcomings in your work.
Discuss follow on projects.
We expect that several of these reports will be suitable for submission to
a workshop or conference, and we will be happy to work with you to turn them into submissions.
Part of your final report grade will be based upon how well you address
comments raised by the program committee.
Do not ignore our comments or the reviewers' comments!
We suggest some topics below (we may add to this list after it is up on the
web site, so it makes sense to check there if you are stuck for project ideas).
You need not pick your final project from this list, but if you decide on a
project not on this list, please check with us before fully committing
to the project. The key characteristics of a project should be:
- The work can reasonably be completed in two months.
- We have access to the required hardware and software.
- The research question has something to do with systems
(I'm willing to give a fair bit of freedom here, but if there are
any questions, please check with me).
- The project is structured in such a way that you can have tangible
results. (No big idea papers probably.)
- You will learn something from undertaking this project.
- Securing Containerized Computation
Given the recent development of Kernel Runtime Security Instrumentation (KRSI) by Google
and looking at security namespacing,
can you propose a solution to allow containers to load Linux Security
Modules and Berkeley Packet Filter programs
on a per-container (a.k.a. per-namespace) basis?
You may want to look at existing cgroup-ed BPF programs and understand
the necessary requirements to enable such a feature in the eBPF-LSM context.
Can you reproduce the use cases presented in the original work?
Can you present new interesting use cases?
Contact: Thomas Pasquier (email@example.com).
- Execution Grammars
In prior work, Thomas Pasquier's group has started to build a tool that
creates a “graph grammar” describing the provenance subgraph corresponding
to a given system call.
Given code snippets or malware binaries, it should be possible to build
a grammar representing their execution and therefore to detect the
subgraph they generate.
How could one build such a grammar?
Can this be done at scale (e.g., given a malware library)?
Can you automatically extract a generalization that matches a given
malware family to keep detection high in the case of a new malware variant?
Could this solution be used to remove the need for human expertise when
analyzing new malware?
Contact: Thomas Pasquier
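To make the generalization question concrete, here is a minimal sketch (not the group's actual tool) that merges syscall traces from variants of one family into a wildcard pattern and matches new variants against it; the trace contents and the names `generalize`/`matches` are invented for illustration:

```python
def generalize(traces):
    """Merge equal-length syscall traces into a family pattern, using '*'
    where the variants disagree (a crude stand-in for a graph-grammar rule)."""
    return ['*' if len(set(col)) > 1 else col[0] for col in zip(*traces)]

def matches(pattern, trace):
    """A trace matches if every non-wildcard position agrees."""
    return len(pattern) == len(trace) and all(
        p == '*' or p == t for p, t in zip(pattern, trace))

variants = [
    ["open", "read", "connect", "write", "unlink"],
    ["open", "read", "connect", "send",  "unlink"],
]
family = generalize(variants)   # ['open', 'read', 'connect', '*', 'unlink']
new_variant = ["open", "read", "connect", "sendto", "unlink"]
```

A real system would operate on provenance subgraphs rather than flat sequences, but the same tension appears: generalizing enough to catch new variants without matching benign behavior.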
- Building an automated network traffic shape generator tool
The sizes and timing of an application's network packets can reveal
the content of the (encrypted) traffic, such as web pages, video
streams or voip chats. One way to address this problem is to shape
the traffic, i.e., modify the sizes and timing of the application's
packets to make them independent of application secrets. What is
an optimal shaping strategy for an application's traffic that is
sufficient to hide its secrets and yet does not impose significant
bandwidth or latency overheads on the traffic?
In this project, the goal is to build a shape generator that
automatically computes optimal traffic shapes for a vulnerable
application. One way to approach this problem would be to apply
machine learning techniques on network traces to determine optimal
traffic shapes for different classes of application secrets.
Contact: Aastha Mehta
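One very simple shaping strategy, useful as a baseline, is to pad every packet up to the next multiple of a fixed bucket size so that many payloads share the same observable size; the sketch below (bucket size and trace are made-up values) also computes the bandwidth overhead the padding costs:

```python
import math

def shape(sizes, bucket=512):
    """Pad each packet up to the next multiple of `bucket` bytes so that
    many different payloads map to the same observable size."""
    return [math.ceil(s / bucket) * bucket for s in sizes]

def overhead(original, shaped):
    """Fractional extra bandwidth the shaping costs."""
    return sum(shaped) / sum(original) - 1.0

trace = [100, 700, 1300, 40]      # observed packet sizes in bytes
padded = shape(trace)             # [512, 1024, 1536, 512]
cost = overhead(trace, padded)    # ~0.67, i.e. 67% more bytes on the wire
```

An "optimal" shape generator would search for bucketings (and timing perturbations) that hide the secret classes at far lower overhead than this uniform padding.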
- Building a bluetooth beacon-based epidemic risk mitigation system
The goal is to build a prototype of such a system. There are two
broad goals: (a) Implement the message protocols for encounter
broadcasts, encounter history upload and risk dissemination.
(b) Develop a benchmark suite for empirical measurements on BLE 5.2
protocol. Specifically, develop benchmarks and measure the achievable
throughput, transmission and receive error rates, and battery consumption.
Contact: Aastha Mehta
- Specification of Translation Hardware
Hardware vendors have come up with various kinds of memory protection
units and translation hardware. For example, segments provide
variable-sized, contiguous regions to be translated, while MMUs provide
page-based translations for the CPU or, in the case of system MMUs, for
devices. System software is required to correctly program these.
At a high level, there is a map from an address in one address space
to an address in a different address space, or to none if there
is no mapping (note: we do not consider broadcast/multicast memory
addresses at the moment). As an example: the processor's MMU translates
virtual addresses used by the applications to physical addresses used
by the processor hardware. The MMU does not operate on arbitrary
addresses, but on pre-defined ranges (page-sizes), and its configuration
is defined by a multi-level page table.
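The "map or none" view above can be sketched as a toy single-level page table (page size and mappings are made-up values; real MMUs use multi-level tables and permission bits):

```python
PAGE_SIZE = 4096  # toy 4 KiB pages

# A toy single-level "page table": virtual page number -> physical frame
# number; absence of a key models "no mapping".
page_table = {0: 7, 1: 3, 5: 9}

def translate(vaddr, table=page_table):
    """Translate a virtual address to a physical address, or None if unmapped."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in table:
        return None
    return table[vpn] * PAGE_SIZE + offset
```

Formalizing translation hardware essentially means pinning down this partial map, and how the hardware's configuration state (segment registers, page tables) defines it.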
In this space, we have a couple of possible projects which can be done:
In Liedtke's paper on micro-kernel construction you have learned
that uKernels are inherently non-portable (for performance reasons).
Within this aspect, there are a few directions we may want to explore:
- Translation Hardware Formalization
Here we try to formally specify the semantics of translation hardware.
We try to extract common concepts like translation granularities,
and refine the Decoding Net model from the abstract level down to the
level of a real hardware translation unit. Note, this requires dealing
with formalization in Isabelle/HOL.
- Translation Hardware Spec
Here we try to see if we can specify translation hardware in a
machine-readable spec, and generate code which can drive and configure
the translation hardware present on a machine (or maybe even generate
an emulated device in a simulator). This includes identifying possible
abstractions of higher-level functions (e.g., map/unmap) and filling
in the low-level details.
- Generation of Capability Types in Barrelfish
Barrelfish enables user-level processes to safely and securely manage
their own address space using typed capabilities. Different architectures
will have different page-table types, and the type system should
enforce the building of correct-by-construction page tables.
A higher-level operation, such as a map or unmap, translates into one or
more capability operations. Barrelfish already has a domain-specific
language to define the different capability types. In this project we are
looking at the possibility of extending this language (or devising a new one)
to generate the relevant functions to operate on those capability
types (e.g., write an entry in the page table).
- Synthesizing uKernels
One of the advantages of uKernels is their "simplicity": they can
be really tiny and provide only a well-defined, small set of
functionality. To what extent could we synthesize a uKernel from
a machine specification? This part is about the ability to do this.
- uKernel Tuning
As Liedtke mentioned, uKernels really have to be tailored towards
the target platform to obtain an optimal implementation. In this
project we are looking at different parameters which may influence
the performance of the kernel, such as TLB size, number of cores,
hyperthreading, cache size, etc. Here we explore the
question of whether this makes a difference today. Could we have
a uKernel sketch, and synthesize/tailor it towards the
characteristics of the underlying processor model?
- Companion Kernel Threads
Processor mode switches have become inherently more costly due to the isolation
of page tables in response to the Spectre and Meltdown attacks. Performing
a syscall is not only a processor mode switch, but an additional context switch
as well. How bad is it?
Modern processors often come with hyperthreads: two execution contexts
sharing a core. The question is: can we use the hyperthread as a companion
kernel thread and use shared-memory / queue-based communication between
them to asynchronously execute syscalls? Could this be an event-based
mechanism in combination with user-level threads? To what extent can we
relax the companion association and use an N:M architecture where N user-level
threads issue system calls to M handlers in the kernel?
Note: this could lead to an application for "Snap for Storage" below.
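The companion-thread idea can be modeled in a few lines: instead of trapping, user code enqueues requests on a shared queue and a second thread (standing in for the hyperthread) services them asynchronously. The "syscalls" and queue protocol below are invented placeholders:

```python
import queue
import threading

requests, replies = queue.Queue(), queue.Queue()

def companion():
    """Companion 'kernel' thread: drain the shared request queue and
    post replies, instead of the caller taking a mode switch per syscall."""
    while True:
        op, arg = requests.get()
        if op == "exit":
            break
        if op == "add":            # a stand-in for a real syscall handler
            replies.put(arg + 1)

t = threading.Thread(target=companion)
t.start()
requests.put(("add", 41))          # issue an asynchronous "syscall"
result = replies.get()             # collect the reply later (here: 42)
requests.put(("exit", None))
t.join()
```

The interesting systems questions hide in what this sketch omits: how the caller waits (polling vs. event notification), and how N user threads share M companion handlers.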
- Process Overhead in Operating Systems
Multi-tenant databases, like MongoDB, often host multiple database tenants on a single machine:
this provides better resource utilization. At the same time, they face the following dilemma:
hosting multiple tenants within a single process has obvious security and safety problems, while
hosting each tenant in a separate process raises performance concerns. Your goal is to evaluate
the overhead of running many processes on a modern operating system. Design an experimental
framework that will enable you to vary how many processes are active vs idle, their total virtual
memory size, their working set size and other parameters. Think about key performance metrics
users might care about: e.g., throughput, response time, etc. Run experiments to determine
how many processes can co-exist on a system before performance begins to degrade. Trace down
the reasons for the overhead to kernel data structures and resource management policies.
Other aspects which might be worth exploring: using containers (e.g., Docker) to run your
database, or using virtual machines, which provide stronger isolation but may introduce
additional overhead.
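The experimental framework above is largely a parameter sweep; a sketch of the design space (the parameter names and values are illustrative placeholders, not suggested settings) makes the scale of the experiment explicit:

```python
import itertools

# Each combination of active vs. idle process counts and working-set size
# becomes one experimental run; record throughput and response time for each.
active_procs = [1, 10, 100]
idle_procs = [0, 100, 1000]
working_set_mb = [16, 256]

configs = [
    {"active": a, "idle": i, "ws_mb": w}
    for a, i, w in itertools.product(active_procs, idle_procs, working_set_mb)
]
# 3 * 3 * 2 = 18 runs in this toy grid.
```

Enumerating the grid first also forces you to decide which metrics each run must record before any kernel tracing begins.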
- Secure Tenant Isolation in a Single Process
Assume that in the project above, someone discovered that running multiple tenants in a single
process is beneficial for performance, but comes at the cost of weaker isolation between two
tenants. The goal of this project is to try to isolate tenants within a single process using
modern hardware technologies like the Intel memory protection extensions (MPX), which allow
accesses only within a certain range of memory.
- Performance Model for Near-Memory / In-Memory Computation / Accelerators
Running algorithms on a general-purpose processor may be suboptimal, whether due to lack of
parallelism, inefficiencies, or limited memory bandwidth. Accelerators provide a more
efficient way to perform computations. Yet to use the accelerator, the application must
decide beforehand which part of the algorithm to offload to accelerators. Modern compilers
already have certain auto-vectorization capabilities which allow the use of vector instructions
as one form of acceleration. Whether or not there is a performance gain depends on various
factors like the compute capabilities of the accelerator, potential memory copy overhead etc.
Can we come up with a runtime system which automatically decides which parts of the application
are executed on the CPU and which parts are executed by the accelerator? For this we need
two things: 1) a model for the machine, and 2) a way to express the algorithm (either simulated,
using annotations, or using a higher-level language).
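The machine model can start very crudely: compare predicted CPU time against accelerator time including copy and launch costs. The function and all cost parameters below are illustrative model inputs, not measured values:

```python
def run_on_accelerator(n, cpu_ns, acc_ns, copy_ns, launch_ns):
    """Crude cost model: offload only if the accelerator's per-element
    speedup outweighs data-copy overhead plus a fixed launch cost."""
    cpu_time = n * cpu_ns
    acc_time = launch_ns + n * (acc_ns + copy_ns)
    return acc_time < cpu_time

# Small input: the fixed launch cost dominates -> stay on the CPU.
small = run_on_accelerator(10, cpu_ns=10, acc_ns=1, copy_ns=2, launch_ns=1000)
# Large input: the per-element speedup wins -> offload.
large = run_on_accelerator(100_000, cpu_ns=10, acc_ns=1, copy_ns=2, launch_ns=1000)
```

A real runtime would fit these parameters from profiling and fold in memory placement (e.g., data already resident near the accelerator makes `copy_ns` effectively zero).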
- In-Memory Database Operator Offload
Databases store data and offer a declarative interface for querying it. During query
execution, databases transform the query (written in SQL) into a query plan consisting
of different types of operators forming a tree. The tree is optimized to have the
smallest possible cost (based on statistics over the data). One thing that the database
engine may do is to select and project tuples early on, which reduces 1) the number of
tuples being processed later, and 2) the size of the tuples being processed. Smart storage
devices may offer the possibility to do operator pushdown to the storage controller that
in turn only returns tuples for which a predicate evaluates to true. In contrast, in-memory
databases keep the base-relations in main memory. This project looks at the possibility to
get operator pushdown working with in-memory databases in the context of processing-in-memory,
where there is a small processor close to DRAM (or alternatively a GPU with GDDR memory
holding the data.)
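The payoff of pushdown is simply how much data stops crossing the memory boundary; the toy relation and predicate below are made-up examples to make that count concrete:

```python
# A made-up relation of (id, category) tuples and a selection predicate.
relation = [(i, i % 10) for i in range(1000)]
predicate = lambda t: t[1] == 3          # WHERE category = 3

# Without pushdown: every tuple is shipped to the query engine first.
shipped_without = len(relation)          # 1000 tuples move

# With pushdown: the near-memory processor evaluates the predicate
# close to the data, and only matching tuples move.
pushed_down = [t for t in relation if predicate(t)]
shipped_with = len(pushed_down)          # 100 tuples move
```

For processing-in-memory the interesting question is when this filtering is cheap enough near DRAM that the 10x reduction in moved tuples translates into end-to-end query speedup.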
- One Queue to Rule Them All
There exist a large number of different descriptor queues in an operating
system. For example, to send a network packet the driver needs to format
an entry in a descriptor queue and tell the network card where to find the
newly formatted packet. Often, there is a pre-allocated descriptor ring;
making a new element known to the device is done by advancing the head
pointer of the queue. Conceptually, the ownership of the buffer is
transferred from the driver to the device. With CleanQ, there is an approach
to formalize the semantics of ownership transfer in the context of
descriptor queues, with proofs down to the C code for certain queues.
In this project, we look at the possibility to specify the descriptor
queue of the device, and be able to generate code which performs the
queue operations including formatting of the descriptor queues.
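The ownership-transfer idea can be sketched as a fixed-size ring where enqueue hands a buffer to the "device" and dequeue hands it back; the class and method names below are a sketch in the spirit of CleanQ, not its actual API:

```python
class DescriptorRing:
    """Toy fixed-size descriptor ring: one slot is kept empty to
    distinguish full from empty."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0        # producer (driver) writes here
        self.tail = 0        # consumer (device) reads here
        self.size = size

    def enqueue(self, buf):
        if (self.head + 1) % self.size == self.tail:
            return False     # ring full; ownership stays with the driver
        self.slots[self.head] = buf
        self.head = (self.head + 1) % self.size  # publish: device owns buf now
        return True

    def dequeue(self):
        if self.tail == self.head:
            return None      # ring empty
        buf = self.slots[self.tail]
        self.tail = (self.tail + 1) % self.size  # ownership returns
        return buf
```

A generator working from a device spec would have to emit exactly this kind of code, plus the device-specific descriptor formatting that this sketch omits.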
- Cloud Computing in the Wild
Unikernels [1] have been proposed as an alternative to containers
(e.g., Docker) as a cloud application deployment solution. In parallel,
the concept of "fog" computing is emerging, where services are being
deployed directly into the Internet of Things (IoT) infrastructure
to improve latency, privacy, and partition resilience (among other things).
This suggests the questions, "Could we allow the dynamic migration
of services from the cloud, to edge-devices and directly into end-devices?"
and "Could this be done while maintaining the use of programming
languages and skills backed by a relatively cheap and abundant workforce?"
Transpilation techniques [2] combined with a unikernel designed for
extremely low resource consumption [3] could be a step in this
direction. A preliminary proof of concept demonstrated the possibility
of transforming a PHP application into a self-contained virtual
machine image of a few MB. We want to go beyond that proof of concept and build
a prototype to demonstrate effective service migration in such a setting.
1 - Madhavapeddy, Anil, et al. "Unikernels: Library operating systems
for the cloud." ACM SIGPLAN Notices 48.4 (2013): 461-472.
2 - Zhao, Haiping, et al. "The HipHop compiler for PHP." ACM SIGPLAN Notices.
Vol. 47. No. 10. ACM, 2012.
3 - Bratterud, Alfred, et al. "IncludeOS:
A minimal, resource efficient unikernel for cloud services." 2015
IEEE 7th International Conference on Cloud Computing Technology and
Science (CloudCom). IEEE, 2015.
- A General Purpose Isolation Mechanism
One could view a hypervisor as a mechanism for providing units of isolation
whose interface is a machine ISA. Similarly, a conventional operating system
provides units of isolation whose interface is the system call API. This
trend continues: a JVM is a user-level process that provides units of isolation
whose API is Java bytecodes. Web servers adopted a pile of complexity to
support virtual domains, thereby introducing yet another way to provide
isolation. Some browsers also provide units of isolation between tabs.
Given this stack of software, each layer providing an isolation
mechanism, one might wonder why we have N different mechanisms instead of a
single coherent mechanism. Putting it another way, could all these systems use
a single isolation model/implementation? If so, what would it look like?
The goal of this project would be to design an isolation mechanism that
could be used in this way and evaluate it. (It's possible that you could
build something like this on L4.) You could imagine evaluating this using
a set of examples like the web server and running it in different architectural
configurations: a single web server running virtual domains; a web server per
virtual machine; a web server using the isolation mechanism you provide.
- Using Provenance to Solve OS Problems
There are many systems papers of the form, "We wanted to solve
some problem, so we modified the kernel to produce a bunch of data,
and then we used that data to do something." I'd like to
see how many of these projects could be done via a single provenance capture system.
CamFlow is a selective whole-system provenance capture system.
It also has a lovely front-end display engine.
I would love to see how many special-purpose systems could be replaced by
scripts running over CamFlow data. I could imagine doing this dynamically
over streaming data (using CamQuery) or statically over collected data.
- For example, prefetching files requires that you know what files
are likely to be accessed, before programs actually access them --
PASS captures much of that data. So, see if you can replicate the work in
"An Analytical Approach to File Prefetching (1997 USENIX)"
using PASS. Here are other papers on file prefetching to examine:
- Marginal Cost-Benefit Analysis for Predictive File Prefetching (ACSME 2003)
- Design and Implementation of Predictive File Prefetching (USENIX 2002)
Another area where provenance might be useful is in cache replacement
algorithms -- if you knew what you might need again soon, you would keep
it in your cache. Look for papers on caching, such as:
- A study of integrated prefetching and caching strategies (SIGMETRICS PER 1995)
- Informed prefetching and caching (SOSP 1995)
- Application controlled prefetching and caching (USENIX 2002)
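To see why "knowing what you'll need again" matters, the sketch below compares plain LRU against an idealized hint-based policy (Belady's OPT, standing in for perfect provenance-derived hints); the block trace is a made-up example:

```python
from collections import OrderedDict

def lru_misses(trace, capacity):
    """Classic LRU miss count for a block-access trace."""
    cache, misses = OrderedDict(), 0
    for blk in trace:
        if blk in cache:
            cache.move_to_end(blk)
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)    # evict least recently used
            cache[blk] = True
    return misses

def hinted_misses(trace, capacity):
    """Idealized hint policy: evict the block whose next use is farthest
    away, as if provenance told us exactly what comes next (Belady's OPT)."""
    cache, misses = set(), 0
    for i, blk in enumerate(trace):
        if blk in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            future = trace[i + 1:]
            victim = max(cache, key=lambda b: future.index(b)
                         if b in future else len(future) + 1)
            cache.remove(victim)
        cache.add(blk)
    return misses

trace = ["a", "b", "c", "a", "b", "d", "a", "b", "c", "d"]
```

Real provenance hints are imperfect, of course; the research question is how close they get to this upper bound.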
- The Coda file system was designed to help users work in a disconnected
mode. One component of that system was a hoarding mechanism where the
system would try to figure out what files you were going to need
to function while disconnected. It seems that one could exploit provenance
to perform better hoarding. Do it!
Warning: I have a strong vested interest in this project. The upside is that
you are likely to get lots of attention; the downside is that you are likely
to get lots of attention.
- Storing Whole-System Provenance on Blockchain
We are frequently asked how we maintain the integrity of provenance.
Since CamFlow provides a mechanism for shipping provenance out to a
remote site, it’s possible that you could simply store provenance
on the blockchain. However, it’s also possible that performance of
blockchain storage will be too slow. Read the CamFlow paper and as
much blockchain background as necessary. Can you make this work?
(Consider alternatives to proof-of-work, else this is just a non-starter.)
(There is a very
recent paper that does something like this; I was not
impressed with it, so I'm pretty confident there is something
more interesting to be done.)
- Prove that LSM-based provenance capture is guaranteed to detect a security breach.
This is a two-step process.
First, using the methodology used in this paper,
show that the current LSM interface captures all security-related
flows in the kernel.
Next, given provenance captured at these points, prove (or disprove) that a
security violation must show up as an anomaly in the provenance graph.
- Real End-to-End Provenance
Data provenance is metadata that describes how a digital artifact came to
be in its present state.
One problem with existing provenance capture systems is that they capture
only local provenance: provenance from a particular language or
from a particular workflow system.
However, once you copy files, use multiple languages, or connect
different programs together in a script, you run the risk of breaking
the provenance chain.
We believe that whole system provenance (e.g.,
CamFlow) could provide the glue that
connects different provenance systems.
Your goal is to demonstrate some application that uses provenance from
multiple different collection sources to do something interesting.
For example, given a shell script that calls both R and Python programs,
can you automatically build a container or VM that precisely and
exactly reproduces the experiment?
Alternately, could you use provenance to build a debugging tool?
If you're interested in this project, come talk to me.
- Deriving Boot Sequences from Machine-Independent Specifications
I am really excited about the possibility of generating machine dependent
OS code from machine independent specifications.
I believe that such generation will require a lot of different techniques.
One particularly tricky piece of OS code is the boot sequence.
Your goal, should you choose to accept it, is to study at least two different
boot sequences (i.e., from two different processors) and come up with
some techniques for A) Describing the sequence in a machine independent
fashion, and B) Generating implementations from that specification.
I expect that part B will not be particularly elegant first time
around, but that's OK.
I recommend not selecting the x86 as one of your target processors!
In fact, you might select some simple and special purpose processors.
One might get some inspiration from prior work on automatically
generating device drivers.
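One way to picture parts A and B: keep the boot sequence as an abstract list of steps and instantiate each step from a per-architecture template table. All step names and assembly snippets below are invented for illustration; a real spec would be far richer:

```python
# Machine-independent boot specification: an ordered list of abstract steps.
BOOT_SPEC = ["mask_interrupts", "init_stack", "enable_mmu", "jump_kernel"]

# Per-architecture instantiations of each abstract step (invented snippets).
TEMPLATES = {
    "armv8": {
        "mask_interrupts": "msr daifset, #0xf",
        "init_stack": "ldr x0, =_stack_top\nmov sp, x0",
        "enable_mmu": "msr sctlr_el1, x1",
        "jump_kernel": "br x2",
    },
    "riscv": {
        "mask_interrupts": "csrci mstatus, 0x8",
        "init_stack": "la sp, _stack_top",
        "enable_mmu": "csrw satp, t0",
        "jump_kernel": "jr t1",
    },
}

def generate(arch, spec=BOOT_SPEC, templates=TEMPLATES):
    """Emit an assembly listing for `arch` by instantiating each step."""
    return "\n".join(templates[arch][step] for step in spec)
```

The hard part the sketch hides is that real boot steps have data dependencies and per-SoC parameters, so the "templates" need to be computed, not looked up.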
- Generating a HAL for ARMv8
There exists a machine-readable specification of the ARMv8 architecture.
Based on this specification, we can build an architectural simulator of
an ARMv8 processor. Can we use this information to generate a hardware
abstraction layer and/or certain boot or configuration sequences
that, for example, switch the execution level or configure architectural state?
- Tiny OS Components for Tiny Processors
Pick some special purpose processor.
Design a tiny operating system that either A) makes it easier to
develop applications for the processor, or B) allows it to seamlessly
communicate with a general purpose processor and OS.
Once you've established the functionality you need for your tiny OS,
design a set of even tinier components that can be assembled to
provide that functionality.
- Access Patterns from JIT
Interpreted languages such as Python, or languages with an intermediate bytecode
representation such as Java are not compiled to the target architecture of the
target machine. Instead the runtime starts executing the statements or the bytecode
by emulating it on an abstract machine (e.g. as provided by the JVM). This process
introduces a certain overhead. Features like just-in-time compilation try to
generate native machine code for parts of the program, which is then executed natively
instead of the interpreted equivalent. In this project we would like to see if we
can somehow extract certain access patterns, e.g., looping over an array. Can we
identify certain patterns which are good to run on accelerators such as GPUs, or
near-memory processing devices? The idea here is that we can identify a part of
the algorithm, and run that on the GPU for instance.
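A minimal version of "extract an access pattern" is classifying an address trace as strided or irregular; the detector and the example traces below are illustrative toys, not what a JIT would actually record:

```python
def detect_stride(addrs):
    """Return the constant stride if the access trace is strided
    (a candidate for offload to a GPU or near-memory device), else None."""
    if len(addrs) < 2:
        return None
    deltas = {b - a for a, b in zip(addrs, addrs[1:])}
    return deltas.pop() if len(deltas) == 1 else None

# Array loop over 8-byte elements: constant stride of 8 -> offload candidate.
strided = [0x1000 + 8 * i for i in range(16)]

# Pointer chasing: irregular deltas -> keep on the CPU.
random_walk = [0x1000, 0x5008, 0x2010, 0x9018]
```

A JIT has an advantage over this trace-based view: it sees the loop structure directly in the bytecode, so it can classify the pattern before executing it.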
OpenMP is a standard for parallel programs. It works by inserting pragmas into the
code which direct the compiler to emit additional code to automatically parallelize
a loop, for instance. Instead of executing the loop, the loop body is wrapped into
a function. The compiler then uses the OpenMP support library (e.g., libgomp for GCC)
and passes the wrapper function as the function to be run by the threads managed
by the runtime. In this project you may want to look at techniques to extract
the access patterns and select a device to run the code on. One loop might be run
on the normal CPUs, while the next might be offloaded to an accelerator or GPU.
This will also include a survey of existing offloading techniques in compilers
such as GCC.
- Snap for Storage
In Marty's paper on Snap, they advocate using a uKernel-like approach
for networking. Here we try to adapt a similar approach to storage.
How does the performance compare to the traditional
Linux storage stack? What functionality can we achieve using this design?
Where are its drawbacks?
- NVM Write Cache
Non-volatile memory (NVM) is an emerging memory technology resembling both storage and memory.
Like storage, it does not lose data upon a crash. Like memory (e.g., DRAM), it sits on a memory
bus and is byte-addressable. While NVM has been in research for more than a decade, in April 2019
Intel (in partnership with Micron) released the first commercial product: Optane NVM (also known as 3D Xpoint).
Despite tremendous excitement among researchers and practitioners alike, the jury is still
out as to what is the best use for NVM and how it fits into the storage hierarchy. This project
will answer part of this big question in the context of a specific application: the file system cache.
Intel implemented a write cache, allowing NVRAM to be used as a cache layer that sits on top of a
block storage device, such as an SSD or HDD. The write cache was implemented as part of the Linux
device mapper.
The write cache sits side by side with the kernel buffer cache (which is in DRAM),
and caches blocks that are being written to the block storage device (flushing them to the
device in the background). The write cache does not cache blocks that are being read.
Your goal is to evaluate the write cache in order to understand when it helps to improve
performance and when it doesn’t. What could be improved? Would it make sense to have reads
cached as well, and if so when? Choose storage-intensive workloads, such as YCSB or others.
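Before running YCSB, a back-of-the-envelope model helps frame the expected win: the fraction of writes absorbed by the NVM layer sets the average latency. The latency numbers below are illustrative assumptions, not measurements of Optane or any SSD:

```python
def avg_write_latency(hit_ratio, nvm_ns, ssd_ns):
    """Simple model: a fraction `hit_ratio` of writes is absorbed by the
    NVM write cache; the rest go straight to the block device."""
    return hit_ratio * nvm_ns + (1 - hit_ratio) * ssd_ns

# Assumed device latencies (illustrative): NVM write 300 ns, SSD write 20 us.
uncached = avg_write_latency(0.0, nvm_ns=300, ssd_ns=20_000)   # 20000.0 ns
cached = avg_write_latency(0.9, nvm_ns=300, ssd_ns=20_000)     # 2270.0 ns
```

Comparing measured results against this naive model is exactly how you discover the interesting effects: flush-induced stalls, DRAM buffer-cache interactions, and workloads where the hit ratio collapses.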
- Abstract Device Classes and Concrete Devices
Operating systems have a certain notion of a device. Devices often belong to a certain class.
For instance, there exist many different UART or timer devices which support a common high-level
operation (e.g., write_char, set_time, enable, disable). However, the concrete implementations of
these devices differ: e.g., the UART of an ARM-based SoC is different from the one used in an
x86 server, yet they perform the same functionality. Part of this project is to survey existing
device-class representations in operating systems (e.g., Linux, BSD, etc.). Are there common
abstractions of certain devices? Are those abstractions violated in some places, and if so, why?
Based on the knowledge obtained from the survey, we want to extract the abstract state machines
the OS developers had in mind for the device. This then may serve as a template to express the
state machine of a concrete device. From the abstract and concrete representations of a
device, we might be able to generate parts of a device driver.
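The abstract/concrete split can be sketched as one set of abstract UART operations instantiated over per-device register maps; all register offsets and bit values below are invented for illustration and do not correspond to any real SoC:

```python
# Invented concrete register layouts for two hypothetical UARTs.
CONCRETE_UARTS = {
    "arm_soc_uart": {"data": 0x00, "ctrl": 0x30, "enable_bit": 0x1},
    "x86_uart":     {"data": 0x00, "ctrl": 0x04, "enable_bit": 0x80},
}

class Uart:
    """Generated-driver stand-in: abstract operations (enable, write_char)
    parameterized by a concrete register map."""

    def __init__(self, layout):
        self.layout = CONCRETE_UARTS[layout]
        self.regs = {}                      # fake MMIO register file

    def enable(self):
        self.regs[self.layout["ctrl"]] = self.layout["enable_bit"]

    def write_char(self, ch):
        self.regs[self.layout["data"]] = ord(ch)

u = Uart("arm_soc_uart")
u.enable()
u.write_char("A")
```

The real project adds the part this sketch skips: deriving the layout table and the per-operation register protocol (status polling, FIFO handling) from the device's state machine rather than writing them by hand.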