Feb. 17, 2011

• Midterm overview
• Real life examples of built chips
  – Clock Skew
• Arithmetic
• Data Centers
• Power reduction techniques
  – Dynamic Voltage / Frequency Scaling
  – Clock Throttling
  – Power Gating
  – Others?
• Project – 4b adder with Razor recovery

Go Over Problems

• 1c
• 2a; 2b
• 3c

Crossbar Design

Mirror Adder: Stick Diagram

[Figure: stick diagram of the mirror adder cell; labeled signals A, B, Ci, Co]

The Mirror Adder

• The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.
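The symmetry of the NMOS and PMOS chains rests on a logic property of the full adder: carry and sum are self-dual, so inverting every input inverts the output. The sketch below checks that property exhaustively, along with the sum expression built from the inverted carry, which is why node Co sees internal gate loads. The function names are illustrative and not taken from the slides.

```python
from itertools import product

def carry(a, b, ci):
    """Full-adder carry-out (majority of the three inputs)."""
    return (a & b) | (ci & (a | b))

def sum_bit(a, b, ci):
    """Full-adder sum as a plain XOR."""
    return a ^ b ^ ci

def mirror_sum(a, b, ci):
    """Sum rebuilt from the inverted carry: S = A.B.Ci + !Co.(A + B + Ci)."""
    not_co = 1 - carry(a, b, ci)
    return (a & b & ci) | (not_co & (a | b | ci))

def is_self_dual(f):
    """True if inverting all inputs always inverts the output."""
    return all(f(1 - a, 1 - b, 1 - c) == 1 - f(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

if __name__ == "__main__":
    print("carry self-dual:", is_self_dual(carry))
    print("sum self-dual:  ", is_self_dual(sum_bit))
```

Because both functions are self-dual, the pull-up network can use the same topology as the pull-down network instead of its series/parallel dual.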

Transmission Gate Full Adder

[Figure: transmission-gate full adder, with Sum Generation and Carry Generation sections]

Manchester Carry Chain

[Figure: Manchester carry chain; labels Ci,0, P1 to P3, G3, and carry nodes C0 to C3]

Carry-Bypass Adder

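[Figure: four full adders (FA) with per-bit propagate/generate inputs P0 G0 through P3 G3 and carries Ci,0, Co,0, Co,1, Co,2, Co,3; shown first as a plain ripple chain, then with a multiplexer on the carry output controlled by BP]

BP = P0 P1 P2 P3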

Idea: If (P0 and P1 and P2 and P3) = 1, then Co,3 = Ci,0; otherwise the carry is “killed” or “generated” inside the block.

Also called Carry-Skip
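The bypass idea can be sketched behaviorally for one 4-bit block. The bypass never changes the result, only which path delivers the block's carry-out: when every bit propagates, the rippled carry equals Ci,0 anyway, and the multiplexer forwards it directly. The helper names (`to_bits`, `from_bits`, `bypass_block`) are hypothetical.

```python
def to_bits(x, n=4):
    """Integer to list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """List of bits (LSB first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def bypass_block(a_bits, b_bits, ci):
    """One carry-bypass block. Returns (sum_bits, carry_out, bypassed)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate
    c = ci
    sums = []
    for pk, gk in zip(p, g):
        sums.append(pk ^ c)        # sum bit uses the incoming carry
        c = gk | (pk & c)          # ripple carry to the next bit
    bp = all(p)                    # BP = P0 P1 P2 P3
    co = ci if bp else c           # multiplexer: bypass or rippled carry
    return sums, co, bp

if __name__ == "__main__":
    sums, co, bp = bypass_block(to_bits(0b0101), to_bits(0b1010), 1)
    print(from_bits(sums), co, bp)
```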

Carry-Bypass Adder (cont.)

Carry Ripple versus Carry Bypass

[Figure: comparison plot with two curves, labeled "ripple adder" and "bypass adder"]

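Carry-Select Adder

[Figure: one 4-bit carry-select stage. Blocks: Setup, "0" Carry Propagation, "1" Carry Propagation, Multiplexer, Sum Generation. Signals: Co,k-1 (incoming carry), Co,k+3 (outgoing carry), Carry Vector.]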
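A behavioral sketch of the carry-select scheme: each stage computes its result twice, once assuming a carry-in of 0 and once assuming 1, and the real incoming carry only drives a multiplexer. The `stage_sizes` parameter and function names are assumptions made for illustration; the default of four 4-bit stages matches the Co,k-1 to Co,k+3 span in the figure.

```python
def ripple(a_bits, b_bits, ci):
    """Ripple-carry add over bit lists (LSB first). Returns (sum_bits, carry_out)."""
    c = ci
    sums = []
    for a, b in zip(a_bits, b_bits):
        p, g = a ^ b, a & b
        sums.append(p ^ c)
        c = g | (p & c)
    return sums, c

def carry_select_add(a, b, ci=0, stage_sizes=(4, 4, 4, 4)):
    """Carry-select addition. Returns (sum, carry_out) for sum(stage_sizes) bits."""
    n = sum(stage_sizes)
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    result_bits = []
    carry = ci
    lo = 0
    for size in stage_sizes:
        hi = lo + size
        # Both alternatives are computed without waiting for the real carry.
        s0, c0 = ripple(a_bits[lo:hi], b_bits[lo:hi], 0)   # "0" carry path
        s1, c1 = ripple(a_bits[lo:hi], b_bits[lo:hi], 1)   # "1" carry path
        # Multiplexer: the real incoming carry selects one alternative.
        result_bits += s1 if carry else s0
        carry = c1 if carry else c0
        lo = hi
    total = sum(bit << i for i, bit in enumerate(result_bits))
    return total, carry

if __name__ == "__main__":
    print(carry_select_add(0xFFFF, 1))
```

In hardware the two ripple paths of every stage run in parallel, so only the multiplexer chain sits between the first stage and the final carry.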

Carry Select Adder: Critical Path

Linear Carry Select

[Figure: four equal-width stages, each with "0" Carry, "1" Carry, Multiplexer, and Sum Generation blocks]

Stage inputs:  Bit 0-3   Bit 4-7   Bit 8-11   Bit 12-15
Sum outputs:   S0-3      S4-7      S8-11      S12-15
Delay labels:  (5)       (6)       (7)        (8)
Delay labels:  (5)       (5)       (5)        (5)

Square Root Carry Select

[Figure: stages of increasing width, each with "0" Carry, "1" Carry, Multiplexer, and Sum Generation blocks]

Stage inputs:  Bit 0-1   Bit 2-4   Bit 5-8   Bit 9-13   Bit 14-19
Sum outputs:   S0-1      S2-4      S5-8      S9-13      S14-19
Delay labels:  (4)       (5)       (6)       (7)
Delay labels:  (3)       (4)       (5)       (6)
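The name "square root" follows from the stage sizes. As a sketch of the standard argument, assume P stages whose widths grow by one bit per stage, starting from M bits (P and M are symbols introduced here, not taken from the slides):

```latex
N = M + (M+1) + \dots + (M+P-1) = MP + \frac{P(P-1)}{2}
\qquad\Longrightarrow\qquad
P \approx \sqrt{2N} \quad \text{for } M \ll N .
```

For the stages shown (M = 2, P = 5) this gives N = 10 + 10 = 20 bits, matching bits 0 through 19. The multiplexer chain therefore grows roughly with the square root of N, rather than linearly with N as it does when all stages have the same width.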

Adder Delays - Comparison

LookAhead - Basic Idea

$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k C_{o,k-1}$

Look-Ahead: Topology

Expanding Lookahead equations:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} C_{o,k-2})$

All the way:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} (\cdots + P_1 (G_0 + P_0 C_{i,0})))$
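A small check that the fully expanded form yields the same carries as the bit-by-bit recurrence. This is a behavioral sketch with hypothetical helper names, not a model of the gate topology.

```python
def propagate_generate(a, b, n):
    """Per-bit propagate and generate lists (LSB first) for n-bit operands."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    return p, g

def carry_recurrence(p, g, ci):
    """Co,k = Gk + Pk Co,k-1, evaluated one bit after another."""
    carries = []
    c = ci
    for pk, gk in zip(p, g):
        c = gk | (pk & c)
        carries.append(c)
    return carries

def carry_expanded(p, g, ci, k):
    """Co,k = Gk + Pk Gk-1 + ... + Pk...P1 G0 + Pk...P0 Ci,0 as a flat sum of products."""
    co = 0
    prefix = 1                      # running product Pk Pk-1 ... Pj+1
    for j in range(k, -1, -1):
        co |= prefix & g[j]         # term Pk ... Pj+1 Gj
        prefix &= p[j]
    co |= prefix & ci               # final term Pk ... P0 Ci,0
    return co

if __name__ == "__main__":
    p, g = propagate_generate(0b1011, 0b0110, 4)
    print(carry_recurrence(p, g, 0), [carry_expanded(p, g, 0, k) for k in range(4)])
```

The expanded form removes the serial dependency on the previous carry, at the cost of product terms that grow with k.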

Carry Lookahead Trees

$C_{o,0} = G_0 + P_0 C_{i,0}$

$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$

$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$

$\quad\;\; = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$

Can continue building the tree hierarchically.
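The grouping step can be written as a combine operation on (G, P) pairs and applied recursively. The sketch below builds group signals by splitting a bit range in half; the function names and the halving strategy are assumptions for illustration, and real lookahead trees differ in how they arrange the groups.

```python
def pg_bits(a, b, n):
    """Per-bit generate and propagate lists (LSB first)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p

def combine(hi, lo):
    """Merge (G, P) of a high group with (G, P) of the adjacent low group."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

def group_pg(g, p, hi, lo):
    """Group signals (G_hi:lo, P_hi:lo), built by recursive halving."""
    if hi == lo:
        return (g[hi], p[hi])
    mid = (hi + lo + 1) // 2
    return combine(group_pg(g, p, hi, mid), group_pg(g, p, mid - 1, lo))

def carry_out(g, p, ci, k):
    """Co,k = G_k:0 + P_k:0 Ci,0."""
    big_g, big_p = group_pg(g, p, k, 0)
    return big_g | (big_p & ci)

if __name__ == "__main__":
    g, p = pg_bits(0b101, 0b011, 3)
    print(group_pg(g, p, 2, 1), carry_out(g, p, 0, 2))
```

With balanced splitting, the depth of the combine tree grows logarithmically with the number of bits in the group.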

Power Reduction Techniques

• Stop the clock
  – Dynamic power reduction
• Power gating
  – Reduce the leakage
• How fast can you turn something on/off?
  – Nothing to do: sleep
• How can you save power while in operation?
  – Near-threshold design

Power Gating

Kevin Nowka, IBM

Gate Leakage

Digital Parallelization

Y[n] = X[n] + X[n-1]

[Figure: analog signal crossing the ANALOG / DIGITAL boundary; input 5 bits @ 5 GS/s; registers holding X[n] and X[n-1], both clocked at Clk = 5 GHz; additional label: 8 bits @ 100 MHz]

DSP Parallelization

Y[n] = X[n] + X[n-1]
Y[n-1] = X[n-1] + X[n-2]

[Figure: X[n], X[n-1], and X[n-2] feed two adders that produce Y[n] and Y[n-1]; clock labels CLK = 5 GHz and CLK = 2.5 GHz]

DSP Parallelization

• Clock speed reduced by ½
  – Can parallelize further
  – Increase number of MACs (multiply/accumulates) by 2
• Intuition?
  – Area goes up by 2
  – Power decreases (clock rate down by 2, computations up by 2, but easier timing constraints)
  – What about clock power?
• Save a little power, but double the area?
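The two-way parallel form can be checked against the serial filter in a few lines. In this sketch one loop iteration stands for one cycle of the half-rate clock, during which two adders produce Y[n-1] and Y[n] together. The function names and the zero initial condition for X[-1] are assumptions.

```python
def serial_filter(x):
    """Y[n] = X[n] + X[n-1]; one output per full-rate clock, X[-1] taken as 0."""
    prev = 0
    y = []
    for sample in x:
        y.append(sample + prev)
        prev = sample
    return y

def parallel_filter(x):
    """Same filter with two adders; each loop iteration is one half-rate clock cycle."""
    if len(x) % 2:
        raise ValueError("this sketch expects an even number of samples")
    y = []
    held = 0                              # X[n-2], kept from the previous slow cycle
    for i in range(0, len(x), 2):
        x_nm1, x_n = x[i], x[i + 1]
        y.append(x_nm1 + held)            # Y[n-1] = X[n-1] + X[n-2]
        y.append(x_n + x_nm1)             # Y[n]   = X[n]   + X[n-1]
        held = x_n
    return y

if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    print(serial_filter(data) == parallel_filter(data))
```

The outputs are identical, so throughput is preserved while each adder gets twice as long to settle. That slack is what the "easier timing constraints" bullet refers to.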

Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation

• http://www.eecs.umich.edu/~taustin/papers/MICRO36-Razor.pdf

Project Description

• Minimal: 4b Adder, Implemented with Razor
  – Simulations into near-threshold domain
• Grad. Student: requires more advanced design
  – Analog: Opamps built using inverters
  – Digital: Adiabatic Near-Threshold
  – Power Gating: add power gating to your design
• Undergrad: extra credit for doing any of the above
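As a starting point for thinking about the minimal project, here is a purely behavioral sketch of Razor-style detection on a 4-bit adder. It follows the idea in the Razor paper linked above: a main flip-flop samples at the clock edge, a shadow latch samples later, and a mismatch flags an error that is repaired from the shadow value. Everything numeric here is an assumption: the delay model (one unit plus the longest carry run), the clock period, the shadow delay, and the one-cycle recovery penalty. It is not a circuit model.

```python
def carry_run(a, b, width=4):
    """Longest stretch of bits a carry travels through (crude delay proxy)."""
    longest = run = 0
    carry = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p, g = ai ^ bi, ai & bi
        if g:
            run = 1                # a carry starts here
        elif p and carry:
            run += 1               # an incoming carry keeps rippling
        else:
            run = 0
        carry = g | (p & carry)
        longest = max(longest, run)
    return longest

def razor_add(a, b, previous_output, clock_period=3, shadow_delay=2):
    """Returns (value, error_flag, cycles_used) for one 4-bit addition."""
    correct = a + b                            # 5-bit result including carry-out
    delay = 1 + carry_run(a, b)                # assumed delay in arbitrary units
    if delay > clock_period + shadow_delay:
        raise ValueError("too slow even for the shadow latch")
    # Main flip-flop: sees the old value if the logic has not settled yet.
    main = correct if delay <= clock_period else previous_output
    shadow = correct                           # shadow latch samples after settling
    error = main != shadow                     # comparator raises the error signal
    return shadow, error, (2 if error else 1)  # recovery assumed to cost one cycle

if __name__ == "__main__":
    print(razor_add(1, 2, previous_output=0))
    print(razor_add(15, 1, previous_output=3))
```

Lowering `clock_period` in this model plays the role of reducing the supply voltage in the real design: more additions miss the main flip-flop and pay the recovery penalty.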

Problem 1: On-Chip Wires Consume Energy

• On-chip wire power does not scale
  – Dominated by interconnect capacitance (C·VDD²)

ON-CHIP (Status Quo): 100 - 300 fJ/bit/mm

NOTE: Sub/Near-Threshold doesn’t help this problem!

OUR GOAL: < 5 fJ/bit/mm

[DOE, Exascale Workshop]
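To get a feel for the units, the C·VDD² figure can be evaluated per millimeter of wire. The capacitance of 0.2 pF/mm and the 1.0 V supply used below are illustrative assumptions, not values from the slides; they are chosen only to show how a number in the quoted 100 - 300 fJ/bit/mm range can arise.

```python
def wire_energy_fj_per_bit_mm(cap_pf_per_mm, vdd_volts):
    """Energy C * VDD^2 for one millimeter of wire, in femtojoules.

    1 pF * 1 V^2 = 1 pJ = 1000 fJ.
    """
    return cap_pf_per_mm * 1000.0 * vdd_volts ** 2

if __name__ == "__main__":
    # Assumed example values: 0.2 pF/mm wire at a 1.0 V supply.
    print(wire_energy_fj_per_bit_mm(0.2, 1.0), "fJ/bit/mm")
```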

Data Center Design

• http://www.spectrum.ieee.org/feb09/7327
