register transfer methodology ii · methodology ii. rtl hardware design by p. chu chapter 12 2...
TRANSCRIPT
RTL Hardware Design
by P. Chu
Chapter 12 1
Register Transfer
Methodology II
RTL Hardware Design
by P. Chu
Chapter 12 2
Outline
1. Design example: One–shot pulse
generator
2. Design Example: GCD
3. Design Example: UART
4. Design Example: SRAM Interface
Controller
5. Square root approximation circuit
RTL Hardware Design
by P. Chu
Chapter 12 3
1. One–shot pulse generator
• Sequential circuit divided into
– Regular sequential circuit: w/ regular next-state logic
– FSM: w/ random next-state logic
– FSMD: w/ both
• Division for code development; no formal
definition;
• Some design can be coded in different types
• FSMD is most flexible
• One–shot pulse generator as an example
RTL Hardware Design
by P. Chu
Chapter 12 4
• Basic block diagram
RTL Hardware Design
by P. Chu
Chapter 12 5
• Refined block diagram of FSMD
RTL Hardware Design
by P. Chu
Chapter 12 6
• Regular sequential circuit. E.g., mod-10
counter
RTL Hardware Design
by P. Chu
Chapter 12 7
• FSM. E.g., edge-detection circuit
RTL Hardware Design
by P. Chu
Chapter 12 8
• FSMD.
E.g., multiplier
RTL Hardware Design
by P. Chu
Chapter 12 9
• One-shot pulse generator
– I/O: Input: go, stop; Output: pulse
– go is the trigger signal, usually asserted for
only one clock cycle
– During normal operation, assertion of go
activates pulse for 5 clock cycles
– If go is asserted again during this interval, it
will be ignored
– If stop is asserted during this interval, pulse
will be cut short and return to 0
RTL Hardware Design
by P. Chu
Chapter 12 10
• FSM implementation
RTL Hardware Design
by P. Chu
Chapter 12 11
RTL Hardware Design
by P. Chu
Chapter 12 12
RTL Hardware Design
by P. Chu
Chapter 12 13
RTL Hardware Design
by P. Chu
Chapter 12 14
• Regular sequential circuit implementation
– Based on a mod-5 counter
– Use a flag FF to indicate whether counter should be active
– Code difficult to comprehend
RTL Hardware Design
by P. Chu
Chapter 12 15
RTL Hardware Design
by P. Chu
Chapter 12 16
• FSMD Implementation
RTL Hardware Design
by P. Chu
Chapter 12 17
RTL Hardware Design
by P. Chu
Chapter 12 18
RTL Hardware Design
by P. Chu
Chapter 12 19
• Comparison:
– FSMD is most flexible and easy to
comprehend
• What happens to the following
modifications
– The delay extend from 5 cycles to 100 cycles
– The stop signal is only effective for the first 2
delay cycles and will be ignored otherwise
RTL Hardware Design
by P. Chu
Chapter 12 20
• “Programmable” one-shot generator
– The desired width can be programmed.
– The circuit enters the programming mode when both go and stop are asserted
– The desired width shifted in via go in the next
three clock cycles
RTL Hardware Design
by P. Chu
Chapter 12 21
• Can be easily
extended in
ASMD chart
• How about FSM
and regular
sequential circuit?
RTL Hardware Design
by P. Chu
Chapter 12 22
2. GCD circuit
• GCD: Greatest Common Divisor
– E.g, gcd(1, 10)=1, gcd(12,9)=3
• GCD without division:
RTL Hardware Design
by P. Chu
Chapter 12 23
• Pseudo algorithm
RTL Hardware Design
by P. Chu
Chapter 12 24
• Modified pseudo algorithm w/o while loop
RTL Hardware Design
by P. Chu
Chapter 12 25
• ASMD chart
RTL Hardware Design
by P. Chu
Chapter 12 26
• VHDL code
RTL Hardware Design
by P. Chu
Chapter 12 27
RTL Hardware Design
by P. Chu
Chapter 12 28
RTL Hardware Design
by P. Chu
Chapter 12 29
• What is the problem of this code?
• Another observation
RTL Hardware Design
by P. Chu
Chapter 12 30
RTL Hardware Design
by P. Chu
Chapter 12 31
• What is the performance now?
• Can we do better with more hardware
resources
RTL Hardware Design
by P. Chu
Chapter 12 32
3. SRAM Controller
• SRAM = Static Random Access Memory
• To improve density, clock is not used
• Basic Signals
– Oe = Output enable
– Cs = chip select
– We write enable
• Read cycle (when data becomes valid)
– Address or oe-controlled
RTL Hardware Design
by P. Chu
Chapter 12 33
• Basic block diagram
RTL Hardware Design
by P. Chu
Chapter 12 34
• Taa = Address access time– Roughly used as “speed” e.g. “20-ns SRAM”
• Toh = Output hold time– Time data valid after address change
• Trc = Total (min) time of read cycle– Close to Taa for SRAM
READ CYCLE
RTL Hardware Design
by P. Chu
Chapter 12 35
• Tolz = Output enable to low Z time– Time to leave high-Z state after OE enabled
• Toe = Output enable to output valid time– Time data valid after OE
• Tohz = Output to Z time– Time to enter Hi-Z state after OE disabled
RTL Hardware Design
by P. Chu
Chapter 12 36
• Twp = Time that we asserted– Data is latched during this period
• Tas = Address setup time– Time address valid before we
• Tah = Address hold time– Time address valid after we
WRITE CYCLE
RTL Hardware Design
by P. Chu
Chapter 12 37
• Tds = Data setup time– Time data valid before latching edge (we 0 to 1)
• Tdh = Data hold time– Time data valid after latching edge
• Twr = Write cycle time– Time (min) between write operations
WRITE CYCLE
RTL Hardware Design
by P. Chu
Chapter 12 38
• Example Timing Parameters
RTL Hardware Design
by P. Chu
Chapter 12 39
• Takes commands from system
• Executes read/write cycles
• Role of SRAM Controller
RTL Hardware Design
by P. Chu
Chapter 12 40
• Block Diagram
RTL Hardware Design
by P. Chu
Chapter 12 41
• Sketch FSM
– Activities of read/write cycles
• Refine
– Consider actual timings
– Identify overlapping operations
• Control Path of SRAM Controller
RTL Hardware Design
by P. Chu
Chapter 12 42
• 5 State write cycle
RTL Hardware Design
by P. Chu
Chapter 12 43
• Revised 3 State write cycle
RTL Hardware Design
by P. Chu
Chapter 12 44
• Control Path
for Slow
SRAM
RTL Hardware Design
by P. Chu
Chapter 12 45
• Write Timing (if Tclk=25ns)
> 5ns> 20ns
> 70ns
> 35ns > 5 ns
RTL Hardware Design
by P. Chu
Chapter 12 46
• Revised States (Write)
• 5 Cycles
+ idle
= 150 ns
• Margin of at
least 5 ns
RTL Hardware Design
by P. Chu
Chapter 12 47
• Repeat Procedure for Read Cycle> 5ns < 120ns
> 120 ns
TohN/A
RTL Hardware Design
by P. Chu
Chapter 12 48
• Assumed ideal FSMD
– Ignoring: output delay, hold/setup time on Rs2m
• More detailed read timing
Tctrl = OE output
delay or Addr
output delay
Toz = Time to High-
Z after OE goes
high
RTL Hardware Design
by P. Chu
Chapter 12 49
• Setup Time Check
– Minimize Tctrloutput look-ahead
Tctrl = Tcq
– 120ns SRAM
25ns Clock
– Met by most device technologies
RTL Hardware Design
by P. Chu
Chapter 12 50
• Hold Time Check
– Since Tctrl = Tcq ,
also easily met
• Other considerations
– Data bus conflict
– Tds, Tdh
– Because of large margin
Ideal, initial analysis probably valid
RTL Hardware Design
by P. Chu
Chapter 12 51
• Final SRAM
Controller
RTL Hardware Design
by P. Chu
Chapter 12 52
• Slow SRAM controller
Only suitable for sporadic access
• What about
Fast 20-ns
SRAM?
RTL Hardware Design
by P. Chu
Chapter 12 53
– Meets timing, but very narrow margin
– Manual fine-tuning
– If back-to-back, requires cell-level (not RTL) design
• One Clock Cycle Access (+idle)
RTL Hardware Design
by P. Chu
Chapter 12 54
Square root approximation circuit
• A example of data-oriented (computation-intensive) application
• Equation:
• 0.125x and 0.5y corresponds to shift right 3 bits and 1 bit
RTL Hardware Design
by P. Chu
Chapter 12 55
• Pseudo code:
RTL Hardware Design
by P. Chu
Chapter 12 56
• Direct “data-flow” implementation
RTL Hardware Design
by P. Chu
Chapter 12 57
RTL Hardware Design
by P. Chu
Chapter 12 58
• Requires one adder and six subtractors
• Code contains only concurrent signal
assignment statements
• The order is not important.
• Sequence of execution is embedded in the
flow of data
RTL Hardware Design
by P. Chu
Chapter 12 59
• Data flow graph
– Shows data dependency
– Node (circle): an operation
– Arcs: input and output
variables
• Note that there is limited
degree of parallelism
– At most two operations can
be perform simultaneously
RTL Hardware Design
by P. Chu
Chapter 12 60
• RT methodology can be used to share the operators
• Tasks in converting a dataflow graph to an ASMD chart
– Scheduling: when a function (circle) can start execution
– Binding: which functional unit is assigned to perform the operation
• In square root algorithm,
– all operations can be performed by a modified addition unit
– No function unit is needed for shifting
RTL Hardware Design
by P. Chu
Chapter 12 61
• Scheduling with two functional units
RTL Hardware Design
by P. Chu
Chapter 12 62
• Scheduling with
one functional unit
RTL Hardware Design
by P. Chu
Chapter 12 63
• ASMD
chart
RTL Hardware Design
by P. Chu
Chapter 12 64
• Registers can be shared as well
– reduce the number of unique variables
– A variable can be reused if its value is no longer
needed
• E.g.,
RTL Hardware Design
by P. Chu
Chapter 12 65
RTL Hardware Design
by P. Chu
Chapter 12 66
• VHDL code
– Needs to manually code the data path to
ensure sharing of functional units
– One unit for abs and min
– One unit for abs, min, - and +
– Can be implemented by using an
adder/subtractor with special input and output
routing circuits
RTL Hardware Design
by P. Chu
Chapter 12 67
RTL Hardware Design
by P. Chu
Chapter 12 68
RTL Hardware Design
by P. Chu
Chapter 12 69
RTL Hardware Design
by P. Chu
Chapter 12 70
RTL Hardware Design
by P. Chu
Chapter 12 71
High-level synthesis
• Convert a “dataflow code” into ASMD based code (RTL code).
– RTL code can be optimized for performance (min # clock cycles), area (min # functional units) etc.
– Perform scheduling, binding
– Minimize # registers and muxes
• Mainly for computation intensive applications (e.g., DSP)
RTL Hardware Design
by P. Chu
Chapter 12 72
Summary
Design examples:
Pulse Generator, GCD, SRAM, Sqrt. Approx.
RT Design Methodology
Flexible, often clearer than pure FSM
Important for initial algorithm to be efficient
Mapping Dataflow to FSMD
Scheduling/binding
Tradeoff of speed/area
High-level Synth for complex algorithms