datapath-1

Upload: anurag-sahu

Post on 05-Apr-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/31/2019 Datapath-1

    1/18

    ECE468 datapath.1 Adapted from VC and UCB

    ECE468Computer Organization and Architecture

    Designing a Single Cycle Datapath

    ECE468 datapath.2 Adapted from VC and UCB

    The Big Picture: Where are We Now?

    The Five Classic Components of a Computer

    Todays Topic: Datapath Design

    Control

    Datapath

    Memory

    Processor

    Input

    Output

  • 7/31/2019 Datapath-1

    2/18

  • 7/31/2019 Datapath-1

    3/18

    ECE468 datapath.5 Adapted from VC and UCB

    The MIPS Subset

    ADD and subtract

    add rd, rs, rt

    sub rd, rs, rt

    OR Immediate:

    ori rt, rs, imm16

    LOAD and STORE

    lw rt, rs, imm16

    sw rt, rs, imm16

    BRANCH:

    beq rs, rt, imm16

    JUMP:

    j target op target address

    02631

    6 bits 26 bits

    op rs rt rd shamt funct061116212631

    6 bits 6 bits5 bits5 bits5 bits5 bits

    op rs rt immediate

    016212631

    6 bits 16 bits5 bits5 bits

    ECE468 datapath.6 Adapted from VC and UCB

    An Abstract View of the Implementation

    Clk

    5

    Rw Ra Rb

    32 32-bit

    Registers

    Rd

    AL

    U

    Clk

    DataIn

    DataOut

    Data

    AddressIdeal

    Data

    Memory

    Instruction

    Instruction Address

    Ideal

    Instruction

    Memory

    ClkPC

    5Rs

    5Rt

    16Imm

    32

    323232

    Operational element vs State element

  • 7/31/2019 Datapath-1

    4/18

    ECE468 datapath.7 Adapted from VC and UCB

    Clocking Methodology

    All storage elements are clocked by the same clock edge

    Edge-trigged: all stored values are updated on a clock edge

    Cycle Time = Latch Prop + Longest Delay Path + Setup + Clock Skew

    (Latch Prop + Shortest Delay Path - Clock Skew) > Hold Time

    Clk

    Dont Care

    Setup Hold

    .

    .

    .

    .

    .

    .

    .

    .

    .

    .

    .

    .

    Setup Hold

    ECE468 datapath.8 Adapted from VC and UCB

    An Abstract View of the Critical Path

    Register file and ideal memory:

    The CLK input is a factor ONLY during write operation

    During read operation, behave as combinational logic:

    - Address valid => Output valid after access time.

    Clk

    5

    Rw Ra Rb32 32-bit

    Registers

    Rd

    ALU

    Clk

    Data

    In

    DataOut

    Data

    Address Ideal

    Data

    Memory

    Instruction

    Instruction Address

    Ideal

    Instruction

    Memory

    ClkPC

    5Rs

    5Rt

    16Imm

    32

    323232

    Critical Path (Load Operation) =

    PCs prop time +

    Instruction Memorys Access Time +

    Register Files Access Time +

    ALU to Perform a 32-bit Add +

    Data Memory Access Time +

    Setup Time for Register File Write +

    Clock Skew

  • 7/31/2019 Datapath-1

    5/18

    ECE468 datapath.9 Adapted from VC and UCB

    The Steps of Designing a Processor

    Instruction Set Architecture => Register Transfer Language

    Register Transfer Language =>

    Datapath components

    Datapath interconnect

    Datapath components => Control signals

    Control signals => Control logic

    ECE468 datapath.10 Adapted from VC and UCB

    RTL: The ADD Instruction

    add rd, rs, rt

    mem[PC] Fetch the instruction from memory

    R[rd]

  • 7/31/2019 Datapath-1

    6/18

    ECE468 datapath.11 Adapted from VC and UCB

    RTL: The Load Instruction

    lw rt, rs, imm16

    mem[PC] Fetch the instruction from memory

    Addr

  • 7/31/2019 Datapath-1

    7/18

    ECE468 datapath.13 Adapted from VC and UCB

    Storage Element: Register

    Register

    Similar to the D Flip Flop except

    - N-bit input and output

    - Write Enable input

    Write Enable:

    - 0: Data Out will not change

    - 1: Data Out will become Data In Clk

    Data In

    Write Enable

    N N

    Data Out

    ECE468 datapath.14 Adapted from VC and UCB

    Storage Element: Register File

    Register File consists of 32 registers:

    Two 32-bit output busses:

    busA and busB

    One 32-bit input bus: busW

    Register is selected by:

    RA selects the register to put on busA

    RB selects the register to put on busB

    RW selects the register to be writtenvia busW when Write Enable is 1

    Clock input (CLK)

    The CLK input is a factor ONLY during write operation During read operation, behaves as a combinational logic block:

    - RA or RB valid => busA or busB valid after access time.

    Clk

    busW

    Write Enable

    32

    32

    busA

    32

    busB

    5 5 5RW RA RB

    32 32-bit

    Registers

  • 7/31/2019 Datapath-1

    8/18

    ECE468 datapath.15 Adapted from VC and UCB

    Storage Element: Idealized Memory

    Memory (idealized)

    One input bus: Data In

    One output bus: Data Out

    Memory word is selected by:

    Address selects the word to put on Data Out

    Write Enable = 1: address selects the memorymemory word to be written via the Data In bus

    Clock input (CLK)

    The CLK input is a factor ONLY during write operation

    During read operation, behaves as a combinational logic block:

    - Address valid => Data Out valid after access time.

    Clk

    Data In

    Write Enable

    32 32

    DataOut

    Address

    ECE468 datapath.16 Adapted from VC and UCB

    Overview of the Instruction Fetch Unit

    The common RTL operations

    Fetch the Instruction: mem[PC]

    Update the program counter:

    - Sequential Code: PC

  • 7/31/2019 Datapath-1

    9/18

    ECE468 datapath.17 Adapted from VC and UCB

    RTL: The ADD Instruction

    add rd, rs, rt

    mem[PC] Fetch the instruction from memory

    R[rd]

  • 7/31/2019 Datapath-1

    10/18

    ECE468 datapath.19 Adapted from VC and UCB

    Datapath for Register-Register Operations

    R[rd]

  • 7/31/2019 Datapath-1

    11/18

    ECE468 datapath.21 Adapted from VC and UCB

    RTL: The OR Immediate Instruction

    ori rt, rs, imm16

    mem[PC] Fetch the instruction from memory

    R[rt]

  • 7/31/2019 Datapath-1

    12/18

    ECE468 datapath.23 Adapted from VC and UCB

    RTL: The Load Instruction

    lw rt, rs, imm16

    mem[PC] Fetch the instruction from memory

    Addr

  • 7/31/2019 Datapath-1

    13/18

    ECE468 datapath.25 Adapted from VC and UCB

    RTL: The Store Instruction

    sw rt, rs, imm16

    mem[PC] Fetch the instruction from memory

    Addr

  • 7/31/2019 Datapath-1

    14/18

    ECE468 datapath.27 Adapted from VC and UCB

    RTL: The Branch Instruction

    beq rs, rt, imm16

    mem[PC] Fetch the instruction from memory

    Cond

  • 7/31/2019 Datapath-1

    15/18

    ECE468 datapath.29 Adapted from VC and UCB

    Binary Arithmetics for the Next Address

    In theory, the PC is a 32-bit byte address into the instruction memory:

    Sequential operation: PC = PC + 4

    Branch operation: PC = PC + 4 + SignExt[Imm16] * 4

    The magic number 4 always comes up because:

    The 32-bit PC is a byte address

    And all our instructions are 4 bytes (32 bits) long

    In other words:

    The 2 LSBs of the 32-bit PC are always zeros

    There is no reason to have hardware to keep the 2 LSBs

    In practice, we can simplify the hardware by using a 30-bit PC: Sequential operation: PC = PC + 1

    Branch operation: PC = PC + 1 + SignExt[Imm16]

    In either case: Instruction Memory Address = PC concat 00

    ECE468 datapath.30 Adapted from VC and UCB

    Next Address Logic: Expensive and Fast Solution

    Using a 30-bit PC: Sequential operation: PC = PC + 1

    Branch operation: PC = PC + 1 + SignExt[Imm16]

    In either case: Instruction Memory Address = PC concat 00

    30

    30

    SignExt

    30

    16imm16

    Mux

    0

    1

    Adder

    1

    PC

    Clk

    Adder

    30

    30

    Branch Zero

    Addr

    Instruction

    Memory

    Addr00

    32

    InstructionInstruction

    30

  • 7/31/2019 Datapath-1

    16/18

    ECE468 datapath.31 Adapted from VC and UCB

    Next Address Logic: Cheap and Slow Solution

    Why is this slow?

    Cannot start the address add until Zero (output of ALU) is valid

    Does it matter that this is slow in the overall scheme of things?

    Probably not here. Critical path is the load operation.

    30

    30SignExt 3016imm16

    Mux

    0

    1

    Adder

    0

    PC

    Clk

    30

    Branch Zero

    Addr

    Instruction

    Memory

    Addr00

    32

    Instruction

    30

    1

    Carry In

    Instruction

    ECE468 datapath.32 Adapted from VC and UCB

    RTL: The Jump Instruction

    j target

    mem[PC] Fetch the instruction from memory

    PC

  • 7/31/2019 Datapath-1

    17/18

    ECE468 datapath.33 Adapted from VC and UCB

    Instruction Fetch Unit

    30

    30

    SignExt

    30

    16imm16

    Mux

    0

    1

    Adder

    1

    PC

    Clk

    Adder

    30

    30

    Branch Zero

    00

    Addr

    Instruction

    Memory

    Addr

    32

    Mux

    1

    0

    26

    4

    PC

    Target30

    j target

    PC

    Imm16RdRsRt

  • 7/31/2019 Datapath-1

    18/18

    ECE468 datapath.35 Adapted from VC and UCB

    Where to get more information?

    To be continued ...