Title: High bandwidth cache design for superscalar processors
Abstract: This thesis evaluates the performance of high-bandwidth cache organizations that employ multiple cache ports, multiple-cycle hit times, and new cache port efficiency enhancements, in order to find the organization that provides the best processor performance. The cache port efficiency enhancements proposed and evaluated are load all, load all wide, keep tags, and line buffer. Processor performance is measured by execution time on a dynamic superscalar processor running realistic benchmarks that include operating system references. A cache with two cache ports increases processor performance by 25% over a cache with a single cache port and no enhancements. With line buffer and load all added to a single-ported cache, the processor achieves 91% of the performance of the same processor with a two-ported cache, at a fraction of the cost.
When the processor is not limited to a single cache port, a large dual-ported, multi-cycle, pipelined SRAM cache with a line buffer maximizes processor performance. A large pipelined cache provides both a low miss rate and a high CPU clock frequency. Dual-porting the cache and using a line buffer provide the bandwidth needed by a dynamic superscalar processor. The line buffer makes the pipelined dual-ported cache the best option by increasing cache port bandwidth and hiding cache latency.
The bus bandwidth consumed between the primary caches, the secondary cache, and main memory is also studied. With realistic caches, bus bandwidth is not a uniprocessor performance bottleneck.
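The two headline figures in the abstract (a 25% gain from a second port, and 91% of two-port performance from an enhanced single port) can be combined into one rough comparison against the single-port baseline. The sketch below is only illustrative arithmetic. It assumes that both figures refer to the same two-port configuration and that performance is the reciprocal of execution time; the abstract states neither explicitly. The function and parameter names are hypothetical and do not come from the thesis.

```python
def relative_performance(two_port_gain=0.25, enhanced_fraction=0.91):
    """Express each cache organization relative to the single-port baseline.

    Assumption (not stated in the abstract): the 91% figure is measured
    against the same two-port cache that yields the 25% gain.
    """
    baseline = 1.0                                   # single port, no enhancements
    two_port = baseline * (1.0 + two_port_gain)      # 25% faster than baseline
    enhanced_single = enhanced_fraction * two_port   # 91% of the two-port result
    # Share of the two-port improvement that the enhanced single port recovers.
    gain_recovered = (enhanced_single - baseline) / (two_port - baseline)
    return two_port, enhanced_single, gain_recovered


if __name__ == "__main__":
    two_port, enhanced_single, gain_recovered = relative_performance()
    print(f"two-port cache:       {two_port:.4f}x baseline")
    print(f"enhanced single port: {enhanced_single:.4f}x baseline")
    print(f"share of two-port gain recovered: {gain_recovered:.0%}")
```

Under these assumptions, the enhanced single-ported cache is about 1.14 times the baseline, which is a little over half of the improvement that the second port delivers.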
Publication Year: 1998
Publication Date: 1998-01-01
Language: en
Type: book
Access and Citation
Cited By Count: 1