Question 1. File Systems

File systems are often designed with specific workloads in mind. In this question, we explore some traditional file systems and discuss their strengths and weaknesses with regard to various types of workloads.

a) Consider the Berkeley Fast File System (FFS). How does FFS allocate files on disk? What types of workloads work well on FFS?

b) Consider the Log-Structured File System (LFS). How does LFS allocate files on disk? What types of workloads work well on LFS?

c) Now consider both FFS and LFS. Describe a workload that does not work well on either system.

d) Finally, consider NFS. What types of workloads work well on NFS?

Question 2. Memory Management

Memory management systems and their intricacies can have a strong impact on system performance. In this question, we explore how memory management should evolve on some interesting hardware configurations. In particular, we focus on hardware systems that support "large" page sizes; assume a machine that, in addition to standard 4 KB pages, supports large 1 MB pages.

a) Describe the motivation for large pages. Why should hardware systems provide this kind of support?

b) Let's say we want to provide an application interface to allow applications to ask for large pages. How does an OS typically hand out memory, and how should this API be enhanced to include the ability to request small or large pages?

c) Now let's say we want to get the benefits of large pages without changing applications or the memory-request interface. Describe how an OS could transparently use large pages, and how it would do so.

Question 3. Dynamic Linking in Multics

Dynamic linking, as implemented in Multics, is an elegant and powerful concept that allows multiple processes to share the same code and data segments.

a) Two of the requirements for dynamic linking in Multics are: 1) procedure segments are pure (i.e., their execution does not modify their content) and 2) a process can call a routine by its symbolic name without prior arrangements for its use. Why is each of these two requirements necessary? What are the implications of each requirement?

b) Briefly describe the steps involved in dynamic linking in Multics. To be concrete, consider the case where a process P, currently executing within a procedure in segment R, calls a procedure in segment Q for the first time. Please draw a supporting diagram with the relevant data structures.

Question 4. Quality of Service in Wide Area Networks

Many Internet applications demand a level of service that may be difficult to provide in the traditional IP service model. A variety of architectures have been proposed over the years to accommodate these requirements.

a) Name and give a brief description of the five distinct components of any network architecture designed to support quality of service (QoS).

b) The notion of resource (e.g., bandwidth) reservation embodied in a protocol like RSVP is at odds with the basic IP service model. What are the implications of this conflict, and how could it be managed?

c) What features of RSVP enable it to scale to large numbers of users?

d) One way to provide QoS to Internet applications is through the use of application-layer infrastructures (i.e., peer-to-peer systems). One example of these is application-layer multicast. Compare and contrast the potential for supporting QoS in an application-level system versus RSVP.

Question 5. Clocks and Makefile

Consider a "makefile" that is used to build a software package from a variety of source files.

For this package, the directory containing the source files (.c and .h) resides on a different file server than the directory that contains the object files (.o).

For this question, assume that each process is running on its own host in a cluster of computers connected by a high-speed LAN.
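
As background for the parts below, the following is a minimal sketch of the timestamp comparison that a make-like tool performs. It assumes a single target with a single prerequisite; the function names is_out_of_date and needs_rebuild are hypothetical and are not taken from any real make implementation. In the setup above, the two modification times would be read from files held on different file servers.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Core rule (simplified): the target is out of date if its
 * prerequisite was modified strictly after the target. */
bool is_out_of_date(time_t target_mtime, time_t prereq_mtime)
{
    return prereq_mtime > target_mtime;
}

/* Hypothetical wrapper that reads both modification times with stat(). */
bool needs_rebuild(const char *target, const char *prereq)
{
    assert(target != NULL && prereq != NULL);

    struct stat t, p;
    if (stat(target, &t) != 0)
        return true;   /* missing target: it must be built */
    if (stat(prereq, &p) != 0)
        return false;  /* sketch only: a real tool would look for a rule
                          to build the prerequisite, or report an error */
    return is_out_of_date(t.st_mtime, p.st_mtime);
}
```

For example, needs_rebuild("foo.o", "foo.c") would report whether foo.c carries a later modification time than foo.o; the file names here are placeholders.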

a) What can go wrong if the clocks on the servers are not synchronized? Give an example.

b) Even if the various client and server hosts are using a clock synchronization protocol, the clocks will not be exactly synchronized. Explain why this situation would occur.

c) Even though clocks are not exactly synchronized, in practice "make" seems to work pretty well, even for directories distributed across different servers. How much synchronization is really needed to allow "make" to work well? What are the costs of providing increasingly better clock synchronization?

Question 6. Synchronization

One way to enforce mutual exclusion is through non-preemption: the semantics of the language and its runtime support guarantee that one process will not be preempted by another, and the processor is only switched to another process when the running process blocks on some system call. Discuss the strengths and weaknesses of this approach as compared with the alternative: any process may be preempted at any time by any other unless it takes special precautions (such as setting a lock).

Question 7. Memory-Mapped File Access

Various operating systems have offered memory-mapped access to files, either as the fundamental way to access files or as an additional feature. Consider the Unix mmap system call that has the form (somewhat simplified):

long addr = mmap (long len, int prot, int filedes, long offset)

where data is being mapped from the currently open file filedes starting at the position in the file described by offset; len is the size of the part of the file that is being mapped into the process's address space; prot describes whether the mapped part of the file is readable or writable. The return value, addr, is the address in the process's address space of the mapped file region.
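
For reference, here is a minimal sketch of how the call might be used on a POSIX system. Note that the real POSIX mmap takes two arguments that the simplified form above omits (an address hint and a flags word) and returns a pointer rather than a long. The helper name sum_bytes_mmap is hypothetical, and the sketch assumes the whole file fits in the process's address space.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum every byte of a file through a read-only
 * mapping. Returns 0 on success and -1 on failure. */
int sum_bytes_mmap(const char *path, uint64_t *sum_out)
{
    assert(path != NULL && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    *sum_out = 0;
    if (st.st_size == 0) {      /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    /* len, prot, filedes, and offset correspond to the simplified form;
     * NULL (no address hint) and MAP_PRIVATE are the extra POSIX arguments. */
    unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    uint64_t sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += p[i];            /* ordinary memory reads, no read() calls */

    munmap(p, (size_t)st.st_size);
    *sum_out = sum;
    return 0;
}
```

A caller would pass a path and a pointer to a uint64_t, then check the return value before using the sum.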

a) When a process performs an mmap, what logical operations must happen inside the operating system?

b) When the process subsequently references a memory location that has been mapped from the file for the first time, what operations happen in the operating system? What happens upon subsequent accesses?

c) Assume that you have a program that will read sequentially through a very large file, computing some summary operation on all the bytes in the file. Compare the efficiency of performing this task when you are using conventional read system calls versus using mmap.

d) Assume that you have a program that will read randomly from a very large file. Compare the efficiency of performing this task when you are using conventional read and lseek system calls versus using mmap.