Title: A 6.2-GFlops Floating-Point Multiply-Accumulator With Conditional Normalization
Abstract: A pipelined single-precision floating-point multiply-accumulator (FPMAC) featuring a single-cycle accumulate loop using base 32 and internal carry-save arithmetic with delayed addition is described. A combination of algorithmic, logic, and circuit techniques enables multiply-accumulate operations at speeds exceeding 3 GHz with single-cycle throughput. The optimizations allow removal of the costly normalization step from the critical accumulate loop. This logic is conditionally powered down using dynamic sleep transistors on long accumulate operations, saving active and leakage power. In addition, an improved leading-zero anticipator (LZA) and overflow prediction logic applicable to carry-save format is presented. In a 90-nm seven-metal dual-V <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">T</sub> CMOS process, the 2 mm <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> custom design contains 230K transistors. The fully functional first silicon achieves 6.2 GFlops of performance while dissipating 1.2 W at 3.1 GHz, 1.3-V supply
Publication Year: 2006
Publication Date: 2006-09-21
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 61
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot