CS 161: Operating Systems

Tuesday/Thursday 1:00-2:30

Pierce 301

Home

Syllabus

Assignments

Resources

Piazza

Assignment 1: Synchronization

Due Friday at 5:00 PM February 13, 2015:

Objectives

Understand how OS/161 implements semaphores.
Develop and implement synchronization primitives.
Conceive of and implement unit tests for said primitives.
Be able to select the appropriate primitive(s) for a given specification.
Use synchronization primitives appropriately to solve synchronization problems.

Introduction

In this assignment you will implement synchronization primitives for OS/161 and learn how to write tests for them. Then you will use your synchronization primitives to solve a few synchronization problems. Once you have completed the written and programming exercises, you should have a fairly solid grasp of the pitfalls of concurrent programming and, more importantly, how to avoid those pitfalls in the code you will write later this semester.

To complete this assignment you'll need to become familiar with the OS/161 thread code. The thread system provides interrupts, control functions, and semaphores. You will implement locks and condition variables.

For this assignment, you will hand in an individual problem set. You may speak to other students about the probems, but you may communicate only in a spoken language, not in C or any other programming language. You may not search the web for solutions. All the work you turn in should be your own.

Write readable code!

In your programming assignments, you are expected to write well-documented, readable code. There are a variety of reasons to strive for clear and readable code. Since in the future you will be working in pairs, it will be important for you to be able to read your partner's code. Also, since you will be working on OS/161 for the entire semester, you may need to read and understand code that you wrote several months earlier. Finally, clear, well-commented code makes your TFs happy!

There is no single right way to organize and document your code; it is not our intent to dictate a particular coding style for this class. The best way to learn about writing readable code is simply to read a lot of code. Read the OS/161 code, or the source of code of some other freely available operating system. Read your partner's code, too (once you start coding together; that is, from ASST2 onwards). When you read someone else's code, note what you like and what you don't like. Pay close attention to the lines of comments which most clearly and efficiently explain what is going on. When you write code yourself, keep these observations in mind.

Here are some general tips for writing better code:

Group related items together, whether they are variable declarations, lines of code, or functions.
Use descriptive names for variables and procedures. Be consistent with this throughout the program.
Comments should describe the programmer's intent, not the actual mechanics of the code. A comment which says "Find a free disk block" is much more informative than one that says "Find first non-zero element of array."

Setting up git

Before we dive into the code, let's set up git. First, run the following two commands:

# Run this once. If it says you already have a remote named
# "handout", you're good to go.
git remote add handout git@code.seas.harvard.edu:cs161/os161.git

# You can run this as many times as you like.
git fetch handout
From code.seas.harvard.edu:cs161/os161
 * [new branch]      class23   -> handout/class23
 * [new branch]      master    -> handout/master

What we're doing here is adding a nickname for the CS161 staff repository ("handout"), and fetching the CS161 handout code. You already have another remote (a nickname) called "origin," which was set up when you originally typed git clone. Let's check to make sure this worked:

$ git remote -v
handout git@code.seas.harvard.edu:cs161/os161.git (fetch)
handout git@code.seas.harvard.edu:cs161/os161.git (push)
origin  git@code.seas.harvard.edu:~jharvard/cs161/jharvard-os161.git (fetch)
origin  git@code.seas.harvard.edu:~jharvard/cs161/jharvard-os161.git (push)

$ git branch -r
  handout/class23
  handout/master
  origin/HEAD -> origin/master
  origin/master

If your output looks similar to the output above, you're probably good to go!

Begin Your Assignment

First, "tag" your Git repository. The purpose of tagging your repository is to make sure that you have something against which to compare your final tree.

Make sure that you do not have any uncommitted updates in your repo. Now, tag your repository as shown here:

  % git tag asst1-start

Now, push that new tag:

 % git push origin master --tags

Configure OS/161 for ASST1

We have provided an example framework in which you can implement and test your solutions for ASST1. This framework consists of driver code, found in kern/synchprobs, and menu items that you can use to execute tests from the OS/161 kernel boot menu. You have to reconfigure your kernel before you can use this framework. The procedure for configuring a kernel is the same as in ASST0, except you will use the ASST1 configuration file:

  % cd kern/conf
  % ./config ASST1

You should now see an ASST1 directory in the compile directory.

Building for ASST1

When you built OS/161 for ASST0, you ran bmake from compile/ASST0. In ASST1, you run bmake from (you guessed it) compile/ASST1.

  % cd compile/ASST1
  % bmake depend
  % bmake
  % bmake install

If you are told that the compile/ASST1 directory does not exist, make sure you ran config for ASST1.

Command Line Arguments to OS/161

Your solutions to ASST1 will be tested by running OS/161 with command line arguments that correspond to the menu options in the OS/161 boot menu. Please DO NOT change any existing menu option strings, since we will be using them to automate testing.

"Physical" Memory

In order to execute the tests in this assignment, you may need more than the 512 KB of memory configured into System/161 by default. We suggest that you allocate at least 2MB of RAM to System/161. This configuration option is passed to the mainboard device with the ramsize parameter in your sys161.conf file. Make sure the mainboard device line looks like the following:

  31	mainboard  ramsize=2097152  cpus=2

Note: 2097152 bytes is 2MB. You can of course add even more memory if you feel it is necessary for your testing. Adding more CPUs will probably enhance your testing as well.

Concurrent Programming with OS/161

If your code is properly synchronized, the timing of context switches and the order in which threads run should not change the behavior of your solution. Of course, your threads may print messages in different orders, but you should be able to easily verify that they follow all of the constraints applied to them and that they do not deadlock.

Built-in thread tests

When you booted OS/161 in ASST0, you may have seen the options to run the thread tests (type ? at the menu for a list of commands). The thread test code uses the semaphore synchronization primitive. You should trace the execution of one of these thread tests in GDB to see how the scheduler acts, how threads are created, and what exactly happens in a context switch. You should be able to step through a call to thread_switch() and see exactly where the current thread changes.

Thread test 1 ("tt1" at the prompt or tt1 on the kernel command line) prints the numbers 0 through 7 each time each thread loops. Thread test 2 ("tt2") prints only when each thread starts and exits. The latter is intended to show that the scheduler doesn't cause starvation—the threads should all start together, spin for awhile, and then end together.

Debugging concurrent programs

thread_yield() is automatically called for you at intervals that vary randomly. While this randomness is fairly close to reality, it complicates the process of debugging your concurrent programs.

The random number generator used to vary the time between these thread_yield() calls uses the same seed as the random device in System/161. This means that you can reproduce a specific execution sequence by using a fixed seed for the random number generator. You can pass an explicit seed into random device by editing the "random" line in your sys161.conf file. For example, to set the seed to 1, you would edit the line to look like:

  28 random seed=1

We recommend that while you are writing and debugging your solutions you pick a seed and use it consistently. Once you are done with initial debugging/testing, remember to set the random device back to autoseed. This should allow you to test your solutions under varying conditions and may expose scenarios that you had not anticipated, which is central to effective testing.

Code Reading (10 points)

To implement synchronization primitives, you will have to understand the operation of the threading system in OS/161. It may also help you to look at the provided implementation of semaphores. When you are writing solution code for the synchronization problems it will help if you also understand exactly what the OS/161 scheduler does when it dispatches among threads.

Please answer the following questions and submit them with your assignment. Place the answers in a file called asst1.txt in your submit directory.

Thread Questions

What happens to a thread when it exits (i.e., calls thread_exit())? What about when it sleeps?
What function(s) handle(s) a context switch?
What does it mean for a thread to be in each of the possible thread states?
What does it mean to turn interrupts off? How is this accomplished? Why is it important to turn off interrupts in the thread subsystem code?
What happens when a thread wakes up another thread? How does a sleeping thread get to run again?
Scheduler Questions
What function(s) choose(s) the next thread to run?
How does it (do they) pick the next thread?
What role does the hardware timer play in scheduling? What hardware independent function is called on a timer interrupt?
Synchronization Questions
Describe how wchan_sleep() and wchan_wakeone() are used to implement semaphores.
Why does the lock API in OS/161 provide lock_do_i_hold(), but not lock_get_holder()?

Synchronization Primitives (40 points)

We know: you've been itching to get to the coding. Well, you've finally arrived! You can implement these synchronization primitives in terms of any other primitive(s), but you should put some thought into choosing the right implementation.

Locks (10 Points)

Implement locks for OS/161. The interface for the lock struct is defined in kern/include/synch.h. Stub code is provided in kern/threads/synch.c.

Condition Variables (10 Points)

Implement condition variables with Mesa semantics for OS/161. The interface for the cv struct is also defined in synch.h and stub code is provided in synch.c.

Correctness Criteria (10 points)

In asst1.txt (the file you used for code reading answers), enumerate as many correctness or incorrectness criteria as you can for the primitives you just implemented (locks and condition variables). These criteria should read somewhat like "A properly implemented lock will guarantee..." or "A properly implemented CV will prevent...".

Unit Testing (10 points)

It's important to get in the habit of writing your own tests. The tests we provide, while ideally useful to you, do not necessarily ensure robust coverage and correctness of your code. After all, you know your code better than we do! It's also important to be able to communicate your testing needs (and other coding needs) to others. In this section, you will exercise all of these skills.

In order to verify the correctness of your synchronization primitives, we ask that you write some unit tests for your lock and CV implementations. Unit tests are small, modular pieces of test code intended to verify a single, atomic unit of functionality (hence the name).

Implement at least 4 and no more than 10 unit tests, each of which tests one of the criteria you listed in the "Correctness Criteria" exercise above. Each unit test you write should be runnable from the OS/161 menu that appears when you run the system (the options that are displayed when the user types ? into the prompt).

Normally you would implement unit tests for everything you do, however, unit testing synchronization is tricky and the synchronization problems below should exercise your primitives (though of course solving the synchronization problems after you are reasonably confident your primitives are working will make solving the synchronization problems much easier).

Synchronization Problems (50 points)

Besides being able to develop your own implementations and tests, being able to implement according to another person's specification is also a necessary skill, and one which will be the cornerstone of ASST2. For each of the following synchronization problems, you should write code that provides a solution to the given prompt.

General Guidelines

Your solutions to these problems must NOT:

busy-wait;
use spinlocks directly; or
use wait channels directly (you must use synchronization primitives, not things like wchan_create, wchan_wakeone, etc).

Additionally, you should stay clear of biglock solutions, where you use a single global lock to take care of synchronization for all your threads and objects. Instead, take some time to design your solution so that it uses a finer-grained synchronization scheme, or primitives which are better suited to the particular situation. That said, any of the problems below may have a number of valid and effective solutions—choose whichever one you think is best, and make a case for why you believe it to be so. The problems are described below at a high level; additional details can be found in comments in the driver files.

Your solution source files should contain comments that:

state what the correctness criteria are to solve the problem
explain which synchronization primitive(s) you chose and why you chose them
describe why you are convinced your solution works (e.g., describe your invariants, guarantees).

Synchronization Problem 1: Keebler Elves (25 points)

The Keebler Cookie Factory is staffed by one supervisor and many elves. Each elf must complete some series of tasks before it can leave for the day (implemented in the work function). Whenever an elf completes one of its tasks, it announces what it just did (implemented in the kprintf in the work function). When an elf has completed all of its work the supervisor dismisses the elf by saying "Thanks for your work, Elf N!" where N corresponds to the N-th elf.

At the beginning of the day, the supervisor (a supervisor thread) opens the factory and lets the elves inside (starts their threads). At any given moment, there is a single supervisor and possibly multiple elves working. The supervisor is not allowed to dismiss an elf until that elf has finished working. Your solution CANNOT wait for ALL the elves to finish before starting to dismiss them.

Synchronization Problem 2: Air Balloon (25 points)

After a war erupts in their kingdom, Princess Marigold must help Prince Dandelion (her younger brother) escape from danger. Marigold places Dandelion in a hot air balloon, which is connected to the ground by NROPES ropes -- each rope is connected to a hook on the balloon as well as a stake in the ground. Marigold and Dandelion must work together to sever all of these ropes so that Dandelion can escape. Marigold unties the ropes from the ground stakes while Dandelion unhooks the ropes from the balloon.

Unfortunately, one of Princess Marigold and Prince Dandelion's enemies, Lord FlowerKiller, is also at work. FlowerKiller is rearranging the ropes to thwart Princess Marigold and Prince Dandelion. He will randomly unhook a rope from one stake and move it to another stake. This leads to chaos!

Without Lord FlowerKiller's dastardly, behavior, there would be a simple 1:1 correspondence between balloon_hooks and ground_stakes (each hook in balloon_hooks has exactly one corresponding entry in ground_stakes, and each stake in ground_stakes has exactly one corresponding entry in balloon_hooks). However, while Lord FlowerKiller is around, this perfect 1:1 correspondence may not exist.

As Marigold and Dandelion cut ropes, they must delete mappings, so that they remove all the ropes as efficiently as possible (that is, once Marigold has severed a rope, she wants to communicate that information to Dandelion, so that he can work on severing different ropes). They will each use NTHREADS to sever the ropes and udpate the mappings. Dandelion selects ropes to sever by generating a random balloon_hook index, and Marigold selects ropes by generating a random ground_stake index.

Lord FlowerKiller has only a single thread. He is on the ground, so like Marigold, he selects ropes by their ground_stake index.

Consider this example:
Marigold randomly selects the rope attached to ground_stake 7 to sever. She consults the mapping for ground_stake 7, sees that it is still mapped, and sees that the other end of the rope attaches to balloon_hook 11. To cut the rope, she must free the mappings in both ground_stake 7 and balloon_hook 11. Imagine that Dandelion randomly selects balloon_hook index 11 to delete. He determines that it is still mapped, finds that the corresponding ground_stake index is 7. He will want to free the mappings in balloon_hook 11 and ground_stake 7. It's important that Dandelion and Marigold don't get in each other's way. Worse yet, Lord FlowerKiller might be wreaking havoc with the same ropes. For example, imagine that he decides to swap ground_stake 7 with ground_stake 4 at the same time. Now, all of a sudden, balloon_hook 11 is no longer associated with ground_stake 7 but with ground_stake 4.

Without proper synchronization, Marigold and Dandelion can encounter:

a race condition, where multiple threads attempt to sever the same rope at the same time (e.g., two different Marigold threads attempt to sever the rope attached to ground_stake 7).
a deadlock, where two threads select the same rope, but accessing it from different directions (e.g., Dandelion gets at the rope from balloon_hook 11 while Marigold gets at the rope from ground_stake 7).

Your solution must satisfy these conditions:

Avoid race conditions.
Guarantee no deadlock can occur. Your invariants and comments should provide a convincing proof of this. HINT: This includes ensuring that Lord FlowerKiller's behavior does not cause any race conditions or deadlocks by adding the appropriate synchronization to his thread as well.
HINT: You should insert well-placed thread_yield() calls in your code to convince yourself that your synchronization is working.
When Marigold and Dandelion select ropes to cut, you may choose to ignore a particular choice and generate a new one, however, all mappings must eventually be deleted. HINT: Use this to your advantage to introduce some asymmetry to the problem.
Permit multiple Marigold/Dandelion threads to sever ropes concurrently (no "big lock" solutions)

Submitting Your Assignment

By 5:00 PM on Friday, 2/13/15, you should push the following to your code.seas repo:

The file asst1.txt in the submit/ directory containing answers to the questions and your correctness/incorrectness criteria for the synch primitives.
Unit tests demonstrating that your primitives work properly.
Completed implementations of locks and condition variables.
Completed solutions to synchprobs/*.

See the handout How to Submit Assignments for preparing your assignment.

You do not need to print out anything for this assignment.