PostScript available at URLs:
   ftp://ftp.cs.wisc.edu/markhill/Theses/richard_kessler_toc.ps
   ftp://ftp.cs.wisc.edu/markhill/Theses/richard_kessler_body.ps

%T Analysis of Multi-Megabyte Secondary CPU Cache Memories
%A Richard Eugene Kessler
%D July 1991
%I Univ. of Wisconsin
%R Computer Sciences Technical Report #1032
%K Kessler Thesis
%Y
%Y Abstract:
%Y
%Y This dissertation investigates multi-megabyte secondary caches. In a
%Y multi-level cache hierarchy, secondary caches service processor memory
%Y references that cannot be serviced by smaller and faster primary
%Y caches. With faster processors and expanding main memories, a
%Y multi-megabyte cache is increasingly vital because it shields processor
%Y memory references from costly main memory accesses, even when the
%Y processor references a large address space. Multi-megabyte secondary
%Y caches allow processors to execute at the speeds they are capable of,
%Y even when there is a large processor-to-main-memory speed gap.
%Y
%Y This analysis uses a new collection of memory address traces that is
%Y appropriate for multi-megabyte cache simulation. These traces
%Y thoroughly characterize several large workloads, and are long enough
%Y (billions of instructions) to overcome multi-megabyte cache
%Y cold-start. This dissertation includes the first comparison of two
%Y previously proposed trace-sampling techniques that can reduce
%Y long-trace simulation requirements: SET SAMPLING and TIME SAMPLING.
%Y Under a range of conditions, set sampling produces more accurate cache
%Y performance estimates with less trace data than time sampling.
%Y
%Y This dissertation examines many alternative cache designs. It shows
%Y that multi-megabyte secondary caches are extremely useful with large
%Y processor-to-main-memory speed gaps. Furthermore, associativity is
%Y needed for smaller secondary caches, but may not be necessary in
%Y multi-megabyte caches; multi-megabyte cache block sizes should be
%Y larger than for smaller caches, and the block size at which the fixed
%Y (latency) and transfer-time components of the miss penalty are equal
%Y is a good design point.
%Y
%Y Finally, this dissertation introduces and solves the problems caused by
%Y the interaction of virtual memory and real-indexed multi-megabyte
%Y caches. Since the placement of pages in main memory also places data
%Y in the cache, a poor page placement will cause poor cache performance.
%Y This dissertation introduces several new CAREFUL PAGE MAPPING
%Y algorithms to improve the page placement, and shows that they eliminate
%Y 10%-20% of the direct-mapped real-indexed cache misses for the long
%Y traces. In other words, this dissertation develops software techniques
%Y that can make a hardware direct-mapped cache appear about 50% larger.
%Y
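
Two of the techniques named in the abstract can be illustrated with short
sketches. These are not taken from the thesis; they are minimal Python
illustrations under stated assumptions, and every function and parameter
name in them is invented for the example.

First, set sampling: rather than simulating every set of a cache over a
long trace, simulate only a randomly chosen subset of the sets and take
the miss ratio observed in that subset as an estimate of the whole
cache's miss ratio. The sketch assumes a direct-mapped cache and a trace
given as an iterable of byte addresses.

    import random

    def simulate_set_sample(trace, num_sets, block_size,
                            sample_fraction=0.1, seed=0):
        """Estimate a direct-mapped cache's miss ratio by simulating only
        a random sample of its sets (set sampling). Illustrative only."""
        rng = random.Random(seed)
        k = max(1, int(sample_fraction * num_sets))
        sampled = set(rng.sample(range(num_sets), k))
        tags = {}            # set index -> tag currently resident there
        refs = misses = 0
        for addr in trace:
            block = addr // block_size
            index = block % num_sets
            if index not in sampled:
                continue      # ignore references to unsampled sets
            refs += 1
            tag = block // num_sets
            if tags.get(index) != tag:   # cold set or tag mismatch
                misses += 1
                tags[index] = tag
        return misses / refs if refs else 0.0

Time sampling, by contrast, would simulate contiguous slices of the trace
against the full cache; the sketch above shows only the set-sampling side
of the comparison made in the thesis.

Second, careful page mapping: in a real-indexed cache, the physical frame
chosen for a virtual page fixes which cache region the page occupies, so
the page mapper can spread pages across cache "colors" instead of letting
them collide. The toy allocator below uses simple page coloring, with
num_colors assumed to be cache size divided by page size for a
direct-mapped cache; the thesis's actual careful-mapping algorithms are
more elaborate.

    from collections import defaultdict, deque

    def color_of(page_number, num_colors):
        # A page's color = the cache-index bits above the page offset.
        return page_number % num_colors

    def careful_map(virtual_pages, free_frames, num_colors):
        """Toy page-coloring allocator: give each virtual page a frame of
        the matching color so contiguous data spreads across a real-indexed
        direct-mapped cache. Falls back to any free frame when the
        preferred color is exhausted. Illustrative only."""
        by_color = defaultdict(deque)
        for frame in free_frames:
            by_color[color_of(frame, num_colors)].append(frame)
        mapping = {}
        for vp in virtual_pages:
            want = color_of(vp, num_colors)
            if by_color[want]:
                mapping[vp] = by_color[want].popleft()
            else:                         # preferred color exhausted
                for c in range(num_colors):
                    if by_color[c]:
                        mapping[vp] = by_color[c].popleft()
                        break
        return mapping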