ADSD Fall 2011, Lecture 06: Optimizing Area

Upload: rehan-hafiz

Post on 06-Apr-2018


  • 8/3/2019 ADSD Fall2011 06 Optimizing Area

    1/17

    Dr. Rehan Hafiz Lecture # 06


    Course Website for ADSD Fall 2011

    http://lms.nust.edu.pk/


    Lectures: Tuesday @ 5:30-6:20 pm, Friday @ 6:30-7:20 pm

    Contact: By appointment/Email
    Office: VISpro Lab above SEECS Library

    Acknowledgement: Material from the following sources has been consulted/used in these slides:
    1. [CIL] Advanced Digital Design with the Verilog HDL, M. D. Ciletti
    2. [SHO] Digital Design of Signal Processing System by Dr Shoab A Khan
    3. [STV] Advanced FPGA Design, Steve Kilts
    4. Some slides from: [ECEN 248 Dr Shi]

    http://creativecommons.org/licenses/by-nc-sa/3.0/

    1 Introduction: Outline & Introduction, Initial Assessment of Students, Digital Design Methodology & Design Flow
    2 Verilog + Combinational Logic: Combinational Logic Review + Verilog Introduction, Combinational Building Blocks in Verilog
    3 Verilog + Sequential Logic: Sequential Common Structures in Verilog (LFSR/CRC + Counters + RAMs), Sequential Logic in Verilog
    4 Synthesis in Verilog: Synthesis of Blocking/Non-Blocking Statements
    5 Micro-Architecture: Design Partitioning + RISC Microprocessor + Micro-architecture Document
    6 Optimizing Speed: Architecting Speed in Digital System Design [Throughput, Latency, Timing]
    7 Optimizing Area: Architecting Area in Digital System Design [Area Optimization]
    8 FIR Implementation: FIR Implementations + Pipelining & Parallelism in Non-Recursive DFGs
    10 CDC Issues: Cross-Clock Domain Issues & RESET Circuits
    11 Fixed-Point Arithmetic: Arithmetic Operations, Review of Fixed-Point Representation
    12 Adders: Adders & Fast Adders, Multi-Operand Addition
    13 Multipliers: Multiplication, Multiplication by Constants + BOOTH Multipliers
    13 CORDIC: CORDIC (sine, cosine, magnitude, division, etc.), CORDIC in HW
    14 Algorithmic Transformations for System Design: DFG Representation of DSP Algorithms, Iteration Bound & Retiming
    15 Algorithmic Transformations: Unfolding, Look-ahead Transformations
    16 Project: Course Review & Project Presentations
    17 Project: Project Presentations


    Optimizing Logic Area

    Basic idea: REUSE the logic resources

    May come at the cost of speed/throughput

    Requires additional control circuitry to implement hardware reuse

    Reuse Factor

    Let Tsample = n * TCLK, n being an integer

    n is the reuse factor

    Rule

    If n = 1, one-to-one mapping is the only option.

    If n > 1, there are opportunities to save hardware.
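
    The reuse-factor rule can be sketched numerically. The snippet below is only an illustration: the function names, the example frequencies, and the idealised "one unit does n operations per sample" estimate are assumptions, not part of the slides.

```python
import math

def reuse_factor(f_clk_hz: int, f_sample_hz: int) -> int:
    """Largest integer n with n * TCLK <= Tsample, i.e. floor(f_clk / f_sample)."""
    if f_clk_hz <= 0 or f_sample_hz <= 0:
        raise ValueError("frequencies must be positive")
    if f_clk_hz < f_sample_hz:
        raise ValueError("clock must be at least as fast as the sample rate")
    return f_clk_hz // f_sample_hz

def units_needed(operations_per_sample: int, n: int) -> int:
    """Idealised estimate (assumption): each physical unit is reused n times per sample."""
    return math.ceil(operations_per_sample / n)

# n = 1: one-to-one mapping, every operation needs its own hardware unit.
n1 = reuse_factor(50_000_000, 50_000_000)
# n > 1: the same 8 operations fit on fewer units.
n4 = reuse_factor(100_000_000, 25_000_000)
print(n1, units_needed(8, n1))
print(n4, units_needed(8, n4))
```

    The estimate ignores the extra control logic that reuse requires, which is exactly the cost the slide warns about.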


    Algorithm Mapping

    [Figure: algorithm-to-hardware mapping, panels labelled "n = 1", "n > 1" and "recursive"]


    Techniques for Optimizing Logic Area

    Time Folding: rolling up the pipeline

    Function Multiplexing: control-based logic reuse

    Resource Sharing: intelligently minimizing the required components


    Time Folding

    Sharing logic resources that are repeated across pipeline stages

    Useful for recursive dataflow

    So how can we roll up the pipeline?

    [STV]


    Unfolding (Loop Unrolling) vs. Folding (Rolling-Up)

    XPower = 1;
    for (i = 0; i < 3; i++)
        XPower = X * XPower;

    LOOP UNROLLING

    XPower  = 1;
    XPower1 = X * XPower;
    XPower2 = X * XPower1;
    XPower3 = X * XPower2;

    ROLLING-UP

    XPower = 1;
    XPower = X * XPower;   // result fed back 3 times

    [STV]
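
    The contrast between the two forms can be modelled behaviourally. This is a minimal sketch, assuming a software model is an acceptable stand-in for the hardware; the function names and the cycle counter are illustrative and do not come from the slides.

```python
def power_unrolled(x: int) -> int:
    """Unrolled form: each multiply is a separate statement (separate hardware)."""
    p0 = 1
    p1 = x * p0
    p2 = x * p1
    p3 = x * p2
    return p3

def power_rolled(x: int, iterations: int = 3):
    """Rolled-up form: one multiply whose result is fed back each cycle.

    Returns (result, cycles) so the time cost of reuse is visible.
    """
    acc = 1          # models the feedback register
    cycles = 0
    for _ in range(iterations):
        acc = x * acc    # the single shared multiplier
        cycles += 1
    return acc, cycles

print(power_unrolled(5))
print(power_rolled(5))
```

    Both forms compute the same value; the rolled-up form trades multipliers for clock cycles.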


    Rolling-Up Pipeline

    Example: 8-bit Multiplier

    A multiplier may be architected with an accumulator that adds a shifted version of A depending on the bits of B

    No special control signals are needed

    A counter tells when to stop the shift-and-add

    Very compact multiplier, but it requires 8 clocks to complete a multiplication.
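
    The shift-and-add scheme described above can be sketched as a cycle-by-cycle software model. This is an illustration under assumptions: the function name, the LSB-first bit order, and the register names are hypothetical, and real hardware would update these registers on a clock edge.

```python
def shift_add_multiply(a: int, b: int, width: int = 8):
    """Behavioural model of a rolled-up shift-and-add multiplier.

    Returns (product, clocks). One bit of b is examined per clock.
    """
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in the given width")
    acc = 0            # accumulator register
    shifted_a = a      # A, shifted left by one position per clock
    b_reg = b          # B, shifted right so bit 0 is the current bit
    count = 0          # counter that tells when to stop
    while count < width:
        if b_reg & 1:
            acc += shifted_a   # add the shifted A only when the B bit is 1
        shifted_a <<= 1
        b_reg >>= 1
        count += 1
    return acc, count

print(shift_add_multiply(13, 11))
```

    The only "control" is the counter: the B bit itself decides whether the adder contributes, and the loop always runs for 8 clocks with 8-bit operands.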


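    Partial products when the 8-bit word {A, B} is multiplied by itself, with A = a3 a2 a1 a0 and B = b3 b2 b1 b0:

    a3 a2 a1 a0 b3 b2 b1 b0
    a3 a2 a1 a0 b3 b2 b1 b0

    a3b0 a2b0 a1b0 a0b0 b3b0 b2b0 b1b0 b0b0
    a3b1 a2b1 a1b1 a0b1 b3b1 b2b1 b1b1 b0b1
    a3b2 a2b2 a1b2 a0b2 b3b2 b2b2 b1b2 b0b2
    a3b3 a2b3 a1b3 a0b3 b3b3 b2b3 b1b3 b0b3
    a0a3 a0a2 a0a1 a0a0 a0b3 a0b2 a0b1 a0b0
    a1a3 a1a2 a1a1 a1a0 a1b3 a1b2 a1b1 a1b0
    a2a3 a2a2 a2a1 a2a0 a2b3 a2b2 a2b1 a2b0
    a3a3 a3a2 a3a1 a3a0 a3b3 a3b2 a3b1 a3b0

    (Each row is shifted one bit position to the left of the row above it.)

    The partial products group into three terms:

    B*B

    2*A*B

    A*A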
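
    Reading the B*B, 2*A*B and A*A labels as the expansion of an 8-bit square into 4-bit halves is an interpretation of the slide, not something it states outright. Under that assumption, the identity (16A + B)^2 = 256·A*A + 16·(2*A*B) + B*B can be checked exhaustively; the function name below is hypothetical.

```python
def square_by_parts(x: int) -> int:
    """Square an 8-bit value from its 4-bit halves A (high) and B (low)."""
    if not 0 <= x <= 0xFF:
        raise ValueError("x must be an 8-bit value")
    a = (x >> 4) & 0xF       # upper nibble, A
    b = x & 0xF              # lower nibble, B
    aa = a * a               # A*A, weight 2^8
    ab2 = 2 * a * b          # 2*A*B, weight 2^4 (A*B computed once, then doubled)
    bb = b * b               # B*B, weight 2^0
    return (aa << 8) + (ab2 << 4) + bb

# Exhaustive check over all 8-bit inputs.
print(all(square_by_parts(x) == x * x for x in range(256)))
```

    The point for area is that A*B appears twice in the array, so it only needs to be computed once.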


    Function Multiplexing: CONTROL-BASED LOGIC REUSE

    Used when there is no natural flow/sequence

    Needs special control signals to determine which elements are input to the particular structure

    An ALU is a good example as well

    [STV]
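
    As a rough illustration of control-based reuse, the sketch below models a tiny ALU in which a control signal selects what is fed into one shared adder. The operation names, the 8-bit width, and the function name are assumptions made for the example; the slides do not specify an ALU design.

```python
def shared_adder_alu(op: str, a: int, b: int = 0, width: int = 8) -> int:
    """Model of an ALU where add, sub and inc all reuse a single adder.

    The control signal `op` only selects the adder's second operand and carry-in.
    """
    mask = (1 << width) - 1
    if op == "add":
        operand, carry_in = b & mask, 0
    elif op == "sub":
        operand, carry_in = ~b & mask, 1   # two's complement: a + (~b) + 1
    elif op == "inc":
        operand, carry_in = 0, 1
    else:
        raise ValueError("unknown operation: " + op)
    # The one shared adder; the result wraps to the register width.
    return ((a & mask) + operand + carry_in) & mask

print(shared_adder_alu("add", 200, 100))
print(shared_adder_alu("sub", 100, 58))
print(shared_adder_alu("inc", 255))
```

    Three functions share one adder; the price is the multiplexing in front of it and the control signal that drives it.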

    [Figure only; source: STV]


    Area Optimized FIR

    Can afford this design only when To = Ts/3

    This is confusing

    [STV]


    Time Multiplexed Single MAC FIR

    Shift the sample memory ONLY on arrival of a new sample

    During every cycle, compute 1 product

    [REF-Required-to-be-added]
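
    The two rules above (shift the sample memory only when a new sample arrives, compute one product per cycle) can be sketched as a behavioural model. The class name, the integer coefficients, and the cycle counter are hypothetical; this is not the circuit from the slide's figure.

```python
class SingleMacFir:
    """Behavioural model of an FIR filter that reuses one multiply-accumulate unit."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.samples = [0] * len(self.coeffs)   # sample memory (delay line)
        self.mac_cycles = 0                     # total clocks spent in the MAC

    def push(self, sample):
        """Accept one new input sample and return one output sample."""
        # Shift the sample memory once, only because a new sample arrived.
        self.samples = [sample] + self.samples[:-1]
        acc = 0
        # One product per clock cycle on the single shared MAC.
        for k in range(len(self.coeffs)):
            acc += self.coeffs[k] * self.samples[k]
            self.mac_cycles += 1
        return acc

fir = SingleMacFir([1, 2, 3])
outputs = [fir.push(x) for x in [1, 0, 0, 0]]
print(outputs, fir.mac_cycles)
```

    With N taps, each output costs N clocks on the shared MAC, so the design only keeps up when the clock is at least N times the sample rate.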


    RESOURCE SHARING

    Higher-level architectural resource sharing

    Can be used whenever there are functional blocks that can be used in other areas of the design or even in different modules


    RESOURCE SHARING

    100 MHz clock (10 ns period)

    6FFh = 1791 (decimal)

    55.8 kHz

    55.8 kHz

    Any idea?


    RESOURCE SHARING

    System Timer?

    55.8 kHz
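
    The numbers on these slides fit together if a counter driven by the 100 MHz clock wraps after reaching 6FFh, that is, after 1792 clocks. That wrap behaviour is an assumption, as are the class and function names below; the sketch shows the arithmetic and one shared timer producing the tick for several consumers.

```python
CLOCK_HZ = 100_000_000      # 100 MHz clock, 10 ns period
TERMINAL_COUNT = 0x6FF      # 1791 decimal

def tick_frequency_hz(clock_hz: int, terminal_count: int) -> float:
    """Tick rate of a counter that counts 0..terminal_count and then wraps."""
    return clock_hz / (terminal_count + 1)

class SystemTimer:
    """One shared counter; every module that needs the tick reads this one."""

    def __init__(self, terminal_count: int):
        self.terminal_count = terminal_count
        self.count = 0

    def clock(self) -> bool:
        """Advance one clock cycle; return True on the cycle the counter wraps."""
        tick = self.count == self.terminal_count
        self.count = 0 if tick else self.count + 1
        return tick

timer = SystemTimer(TERMINAL_COUNT)
ticks = sum(timer.clock() for _ in range(2 * (TERMINAL_COUNT + 1)))
print(round(tick_frequency_hz(CLOCK_HZ, TERMINAL_COUNT) / 1000, 1), "kHz")
print(ticks)
```

    Sharing the timer means the 11-bit counter is built once rather than once per module that needs the 55.8 kHz tick.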