MAD MAC 525
Presentation 4: Gate Level Design
15th February, 2006

W2: Farhan Mohamed Ali (W2-1), Jigar Vora (W2-2), Sonali Kapoor (W2-3), Avni Jhunjhunwala (W2-4)
Design Manager: Zack Menegakis

Project Objective: Design a crucial part of a GPU called the Multiply Accumulate Unit (MAC) which will revolutionize graphics.



MAD MAC 525 Status:

• Project chosen
• Specifications defined
• Architecture
• Design
• Behavioral Verilog
• Testbenches
• Verilog: Gate Level Design
• Floor plan (revised & updated)
• Schematic (adder)

To be done:

• Layout, extraction, LVS, post-layout simulation

Recap - MAD MAC 525

• Multiply Add (MAD) / Multiply Accumulate Unit (MAC)
• Executes the function AB+C on 16-bit floating-point inputs
• Multiplies and adds in parallel to greatly speed up the operation
• Rounding is performed only once, so the result is more accurate than with separate multiply and add operations
• MAD MAC accelerates FP16 blending to enable true HDR graphics
  • Bright things can be really bright
  • Dark things can be really dark
  • And the details can be seen in both
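The single-rounding point in the recap can be illustrated with a small software model. This is only a sketch, not the team's design: it assumes IEEE 754 half precision with round-to-nearest-even (the slides do not state the rounding mode), and the helper names `to_fp16`, `mad_separate` and `mad_fused` are hypothetical. Python's `struct` format `"e"` does the rounding to half precision.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def mad_separate(a: float, b: float, c: float) -> float:
    # Two roundings: the product is rounded to FP16, then the sum is rounded again.
    return to_fp16(to_fp16(a * b) + c)

def mad_fused(a: float, b: float, c: float) -> float:
    # One rounding: a*b + c is formed in double precision (exact for the
    # inputs below) and rounded to FP16 only once at the end.
    return to_fp16(a * b + c)

# Inputs chosen so that the low bits of the product matter after cancellation.
a = to_fp16(1 + 2**-10)
b = to_fp16(1 + 3 * 2**-10)
c = -1.0

print("separate:", mad_separate(a, b, c))
print("fused:   ", mad_fused(a, b, c))
```

With these inputs the separately rounded result loses the low-order product bits before the subtraction, while the singly rounded result keeps them.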

Design Decisions

• Using n-pass shifters instead of regular gates for the muxes
  – Increases speed
  – Reduces transistor count
  – Reduces area
  – Complexity of the project remains the same

Block Diagram

[Figure: block diagram. Three 16-bit inputs feed RegArray A, RegArray B and RegArray C. Blocks shown: Multiplier, Exp Calc, Align, Adder/Subtractor, Control Logic & Sign Dtrmin, Leading 0 Anticipator, Normalize, Round, Reg Y. The output is 16 bits wide. The internal bus-width labels cannot be matched to signals from the transcript.]

Updated Estimated Transistor Count

Block                                      n-pass    gates
Registers (I/O, pipelining, threading)       1800     1800
Carry-Save Multiplier                        3500     3500
Carry-Select Adder/Subtractor                3700     3700
Alignment Shifter                             530     1500
Leading 0 Anticipator                         350      350
Normalize                                     900     3400
Rounding                                      300      300
Exponents                                     700      700

Total                                       11780    15250
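The totals above can be cross-checked with a few lines of arithmetic. The sketch below assumes the two columns are the n-pass design and the regular-gate design, in line with the design-decision slide; the dictionary name `counts` is hypothetical.

```python
# Per-block transistor estimates from the slide: (n-pass, regular gates).
counts = {
    "Registers": (1800, 1800),
    "Carry-Save Multiplier": (3500, 3500),
    "Carry-Select Adder/Subtractor": (3700, 3700),
    "Alignment Shifter": (530, 1500),
    "Leading 0 Anticipator": (350, 350),
    "Normalize": (900, 3400),
    "Rounding": (300, 300),
    "Exponents": (700, 700),
}

npass_total = sum(n for n, _ in counts.values())
gate_total = sum(g for _, g in counts.values())
saved = gate_total - npass_total

print("n-pass total:", npass_total)
print("gate total:  ", gate_total)
print("saved:       ", saved, f"({100 * saved / gate_total:.1f}% of the gate design)")
```

Only the Alignment Shifter and Normalize rows differ between the columns, so the whole saving comes from the two shifters.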

Estimated area (in µm²), n-pass

Block                     Area
Registers                 9000
Multiplier               25000
Adder                    26500
Align                     3800
Leading zero counter      2500
Normalize                 6500
Round                     2000
Exponent calc             5000

Total                    80300

Main Floorplan

[Figure: main floorplan. Blocks shown: Reg A, Reg B, Reg C, Multiplier, ExpCalc, Align C, Adder, Ld Zero, Normalize, Round, Reg Y, and three Pipeline Reg blocks.]

• Multiplier

[Figure: multiplier floorplan. And Array (726 transistors) and Full Adder Array (2640 transistors). Labels: Input from Reg A, Input from Reg B, Input to Adder; bus widths shown: 10, 10, 22.]

Schematics

• Multiplier: 11 x 11 Carry-Save Multiplier
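As a rough behavioural illustration of a carry-save multiplier, the sketch below forms one partial product per multiplier bit (the AND array) and folds it into a sum/carry pair with a word-level 3:2 carry-save adder, with a single carry-propagate addition at the end. It is not derived from the schematic. The function names are hypothetical, and the 11-bit width is read here as the 10-bit FP16 fraction plus the implicit leading 1, which is an assumption.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """Word-level 3:2 carry-save adder: x + y + z == s + c."""
    s = x ^ y ^ z                              # per-bit sum
    c = ((x & y) | (x & z) | (y & z)) << 1     # per-bit carry, moved up one place
    return s, c

def carry_save_multiply(a: int, b: int, width: int = 11) -> int:
    """Multiply two unsigned width-bit integers using carry-save accumulation."""
    assert 0 <= a < (1 << width) and 0 <= b < (1 << width)
    s, c = 0, 0
    for i in range(width):
        # One row of the AND array: a gated by bit i of b, weighted by 2**i.
        pp = (a << i) if (b >> i) & 1 else 0
        s, c = csa(s, c, pp)
    # Carries are only resolved here, by one ordinary addition.
    return s + c

print(carry_save_multiply(0b10000000001, 0b11000000000))
```

Because no carry ripples between rows, each row costs only one full-adder delay; the final addition is the only carry-propagating step.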

Schematics

• Leading Zero Counter: Carry-Save Adder to count the leading zeroes of C

Schematics

• Align Exponents: n-pass shifter

Schematics

• 1-bit n-pass shifter used in the align block
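One common way to build a shifter from muxes is a logarithmic arrangement, in which stage k either passes its input through or shifts it by 2**k, selected by bit k of the shift amount. The sketch below models that idea in software. The slides do not say that the align block is organised this way, so the structure, the function names and the example widths are all assumptions.

```python
def mux2(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def align_shift_right(value: int, amount: int, width: int, stages: int) -> int:
    """Right-shift a width-bit value through `stages` mux stages.

    Stage k shifts by 2**k when bit k of `amount` is set. Bits shifted
    out on the right are dropped (no sticky bit is kept in this sketch).
    """
    assert 0 <= value < (1 << width)
    assert 0 <= amount < (1 << stages)
    for k in range(stages):
        shifted = value >> (1 << k)
        value = mux2((amount >> k) & 1, value, shifted)
    return value

# Hypothetical 12-bit operand shifted right by 5 (stages 0 and 2 active).
print(bin(align_shift_right(0b101100000000, 5, width=12, stages=4)))
```

The number of mux stages grows with the logarithm of the maximum shift, which is why mux cost dominates a shifter of this kind.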

Schematics

• Normalize: n-pass shifter to shift the result of the adder by the amount given by the Leading Zero Counter

• Shifter for the Normalize
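The normalize step can be sketched behaviourally: count the leading zeroes of the adder result, shift left by that amount, and lower the exponent by the same amount. The function names, the 8-bit example width and the treatment of a zero result are assumptions made for illustration and are not taken from the schematic.

```python
def count_leading_zeros(value: int, width: int) -> int:
    """Number of zero bits above the most significant 1 in a width-bit word."""
    assert 0 <= value < (1 << width)
    return width - value.bit_length()

def normalize(mantissa: int, exponent: int, width: int) -> tuple[int, int]:
    """Shift the mantissa left until its top bit is 1 and adjust the exponent."""
    if mantissa == 0:
        # Nothing to normalize; how the real design handles zero is not shown.
        return mantissa, exponent
    lz = count_leading_zeros(mantissa, width)
    return mantissa << lz, exponent - lz

m, e = normalize(0b00010110, 10, width=8)
print(bin(m), e)
```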

Schematics

• Round: Incrementer and Shifter
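The pairing of an incrementer with a shifter can be motivated as follows: rounding up adds 1 to the kept mantissa bits, and if that increment carries out of the top bit, the mantissa is shifted right by one and the exponent is raised. The sketch below assumes round-to-nearest-even, which the slides do not confirm, and uses hypothetical names and widths (a 14-bit normalized input rounded to 11 bits).

```python
def round_mantissa(mantissa: int, exponent: int, width: int, keep: int) -> tuple[int, int]:
    """Round a normalized width-bit mantissa to its top `keep` bits (nearest, ties to even)."""
    drop = width - keep
    assert drop >= 1 and 0 <= mantissa < (1 << width)
    kept = mantissa >> drop                  # bits that survive
    rem = mantissa & ((1 << drop) - 1)       # bits that are discarded
    half = 1 << (drop - 1)
    round_up = rem > half or (rem == half and (kept & 1) == 1)
    if round_up:
        kept += 1                            # incrementer
        if kept >> keep:                     # carry out of the top bit
            kept >>= 1                       # shifter: renormalize by one place
            exponent += 1
    return kept, exponent

# All-ones mantissa with a tie in the dropped bits: increment overflows.
print(round_mantissa((0b11111111111 << 3) | 0b100, 5, width=14, keep=11))
```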

Critical Path

[Figure: the block diagram redrawn with three Pipeline Reg blocks. Blocks shown: RegArray A, RegArray B, RegArray C, Multiplier, Exp Calc, Align, Adder/Subtractor, Control Logic & Sign Dtrmin, Leading 0 Anticipator, Normalize, Round, Reg Y; three 16-bit inputs and a 16-bit output.]

Questions?