Operating Systems Depth Exam

	UNIVERSITY OF WISCONSIN-MADISON Computer Sciences Department
Operating Systems Depth Exam		Fall 1997

Instructions: There are seven questions on this exam; answer any six of the following seven questions. This exam has three (3) pages.

Question 1. File systems:

The original Unix file system was very simple. Each file was represented by a an entry in an inode array that contained information about the file and was the root of a tree of disk blocks, whose leaves were the data blocks of the file. Blocks not in any file were linked into a free list. Unfortunately, performance was poor, especially for single-user applications. Why was performance poor? Why was performance less of a problem for multi-user (timesharing) applications?
The Berkeley Fast File System (FFS) made two major changes to the data structures. It replaced the free list with a bit map, and it replaced fixed-sized blocks with blocks and fragments. Briefly describe how the FFS took advantage of each of these changes to improve file-system throughput. Before the changes, write-bound applications were faster than read-bound applications. After the changes, the reverse was true. Why?
The inventors of the Log-structured File System (LFS) argue that the advent of large main memories makes it more important to optimize the speed of write-bound appliations. What is their argument? Give the main ideas of their algorithm for improving the performance of applications that write large files sequentially.

Question 2. Capabilities:

For a capability-based protection architecture to work, there must be some mechanism to prevent uncontrolled modification of capabilities.

Two "traditional" mechanisms for protection of capabilities are capability segments and tagged architectures. Discuss the relative merits of these two approaches.
More recently distributed capability-based systems have been built using cryptographic techniques. Why is this approach preferable to the "traditional" methods in a distributed setting? Briefly outline how this method might work.
Some capability systems (e.g., Hydra) have a fixed principal for resources of a given type (and their capabilities). Other systems (e.g., Mach) have a concept of ownership, a right associated with a resource that may be transferred like most other rights. The principal or owner for a resource has special rights not generally available. In both types of systems, there must be exactly one principal or owner for every resource at all times. Describe the tradeoffs between having resource ownership be a fixed property of a principal, or making it a transferable right.

Question 3. Remote Execution Security:

In the Condor system, I can run my programs remotely on another computer. My program is linked with a special Condor version of the C library; when my program runs, all system calls (such as read() or write()) made the program are sent back my home (desktop) machine and executed there.

We want to prevent my program from abusing the remote computer. Describe three things that my program might try to do that would violate the security of the remote computer, and describe a strategy for preventing this abuse.
We also want to prevent the owner of the remote machine (who has root/superuser priviledges) from using my remote program as an inappropriate path to access the files or resources on my desktop computer. Describe one way in which the owner could make such an access, and describe how you might prevent such an access.

Question 4. Global Clocks:

Why are clocks difficult to synchronize in distributed systems? What are the factors limiting the accuracy of the sychronization?
What features in a system assume that the clocks are (reasonably) synchronized? Give two examples of cases where a problem would occur if the clocks were not synchronized.

Question 5. Synchronization:

Java provides a form of monitor-based sychronization with condition variables. These monitors provide a "synchronized" key word for class functions (methods) that provides a mutual exclusion lock with all other synchronized methods. Each object (class instance) is a separate monitor, so there is a monitor lock for each object.

The interesting twist is that a Java monitor instance can have only one condition variable.

Provide a monitor-based solution to the bounded-buffer (producer/consumer) problem using monitors that allows only a single condition variable in each monitor. You can use Java syntax or C++-like syntax with monitors added.
Given the above language description, how can you simulate the effect of declaring multiple condition variables?

Question 6. Authentication:

A common facility needed in computer networks is means for one party to set up a secure, authenticated communications channel with another.

The Needham and Schroeder key distribution algorithm allows a process to get credentials from a Key Distribution Center, which it can then use to establish a connection with another process on the network. Briefly describe how this algorithm works.
The Kerberos authentication system is based on the Needham/Schroeder scheme. An additional feature of Kerberos is that the credentials contain IP network addresses, and part of the verification of credentials involves checking that packets come from where they claim to come from. One drawback to this design decision is that it prevents Kerberos "tickets" from being used as capabilities. Explain why.
Consider an authentication system based only on IP addresses. What are the limitations of such a system? Specifically, what elements of the functionality provided by the system of part (a) would be missing in such a system? How could a cracker compromise the security of such a system?

Question 7. Caches and Paging:

Hardware caches are used extensively to improve memory system performance. To find out whether a particular data item is in cache, the processor probes the cache with the address of the data item. The address can be either the virtual address of the data item, or the physical address of the data item.

Explain how the relationship between the TLB and first-level is different in these two cases: (1) the first-level cache is virtually indexed (i.e. the address used to probe cache is virtual address) and (2) the first-level cache is physically indexed (i.e. the address used to probe cache is physical address).
In most modern operating systems, there is a page-mapping algorithm inside the virtual memory system, which chooses the physical page to which a virtual page is mapped at page fault time. What potential performance effects does this page-mapping algorithm have on physically-indexed and virtual-indexed caches?
Suppose an application makes memory references to the following virtual memory locations in the following order:
```
  1024, 2048, 1152 (=1024+128), 2176 (=2048+128),
  1280 (=1152+128), 2304 (=2176+128), ...
```
Suppose the OS maps virtual page 1024 to physical page 100, and maps virtual page 2048 to physical page 228, and the page size is 4K. The cache is 512K bytes with 64-byte line size, physically-indexed and direct-mapped. (That is, each hash value maps to a separate 64-byte chunk of the cache.)
What performance problem does the OS' choice of page mapping introduce?
The performance problem of part (3) has motivated studies on how to map virtual pages to physical pages to avoid it. Based on what you know about typical memory access patterns in applications (for example, spatial locality, relative frequency of accesses to stack, code and data), describe two algorithms that map virtual pages to physically pages carefully to avoid cache misses.