Post on 20-Jan-2016
1
Advanced Digital Design
Asynchronous Design: FSL
by A. Steininger and M. Delvai
Vienna University of Technology
2
Outline
Introduction
  Principles
  Basic gates
  Design flow and tools
Circuit design with FSL
  Pipeline
  Data paths
Current status
Conclusion
3
Fundamental Design Problem
Ensure a lossless data flow
[Figure: data flow from SRC through f(x) to SNK]
Issue condition: new data can be issued only when the previous data has already been consumed
Capture condition: only valid and consistent data may be consumed
4
Synchronous Approach
[Figure: SRC, f(x) and SNK governed by the clock period TClk]
Global time reference
=> indirect conclusion
5
Asynchronous Circuits
[Figure: SRC, f(x) and SNK connected by Request and Acknowledge signals]
Local handshake protocol
6
Four State Logic
[Figure: SRC, f(x) and SNK with delay ∆t; signal vectors labelled 1x "1" | 2x "0", 3x "1" | 2x "0", 2x "1" | 1x "0"]
∆t => delay insensitive
=> additional information required
=> SNK must be able to recognize when data is valid and consistent
7
FSL encoding
Use 2 codes per logic value
Need two-rail coding: signal X is carried on rails X.a and X.b
"Low":  L (0,1)   l (0,0)
"High": H (1,0)   h (1,1)
[Figure: assignment of the phases φ0 and φ1 to the codes L, H, l, h]
Transition L => H, (0,0) => (1,1): intermediate state (0,0) ? (1,1) is (1,0) or (0,1)
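The encoding table above can be explored with a minimal Python sketch. It assumes that rail a carries the logic value and that the upper-case and lower-case codes form the two phases; the names `ENCODING`, `logic_value`, `phase` and `rail_changes` are illustrative and do not come from the slides. The sketch checks one property of this table: moving to any code of the other phase changes exactly one rail, whereas a change within the same phase (such as L => H) would need both rails to switch.

```python
from itertools import product

# Rail pairs (a, b) as listed in the encoding table.
ENCODING = {"L": (0, 1), "H": (1, 0), "l": (0, 0), "h": (1, 1)}

def logic_value(sym):
    """Logic value of a code; in this table it equals rail a."""
    return ENCODING[sym][0]

def phase(sym):
    """Phase bit of a code: XOR of the two rails (assumed convention)."""
    a, b = ENCODING[sym]
    return a ^ b

def rail_changes(src, dst):
    """Number of rails that differ between two codes."""
    return sum(x != y for x, y in zip(ENCODING[src], ENCODING[dst]))

# List every transition between codes of opposite phase.
for src, dst in product(ENCODING, repeat=2):
    if phase(src) != phase(dst):
        print(f"{src} -> {dst}: {rail_changes(src, dst)} rail(s) change")
```

Because only one rail switches per signal between consecutive data words, a receiver never observes a transient code that could be mistaken for a different valid word of the new phase.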
8
Completion Detection
[Figure: SRC sends signal vectors (H l L h, H h L l) to SNK]
consistent data
9
FSL Gates
Combinational Gates
  AND, OR, INV, …
10
Phase Transition
[Figure: inputs and outputs of f(x) during the transition between φ0 and φ1]
We have to ensure that:
• inconsistent input vectors are not processed
• f(x) is a monotonic function
11
Combinational Functions
Variant 1: hazard-free implementation of f(x) with a consistency detector
• Processing of inconsistent inputs is inevitable due to internal skew
Variant 2: local intelligence
• each basic gate processes only consistent inputs
• the function of each basic gate is monotonic
=> hardware overhead
12
Combinational Gates
And, Or, Inv, (MUX), (XOR), …
Truth Table FSL-AND (inputs E1 and E2, output Y)

E2 \ E1   H   L   h   l
H         H   L   *   *
L         L   L   *   *
h         *   *   h   l
l         *   *   l   l

* keep old value
Entries without * are the input combinations that are consistent in φ0 or consistent in φ1.
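A small behavioral model makes the "keep old value" entries concrete. This is only a sketch in Python: the class name `FslAnd` and the initial output value are assumptions, and the model works on the symbols H, L, h, l rather than on rails.

```python
class FslAnd:
    """Behavioral model of the FSL-AND truth table (hypothetical class name)."""

    def __init__(self, initial="l"):
        # Stored output; the initial value is an assumption of this sketch.
        self.y = initial

    def update(self, e1, e2):
        """Apply one input vector and return the (possibly held) output."""
        same_phase = e1.isupper() == e2.isupper()
        if same_phase:
            # Consistent inputs: compute AND on the logic values ...
            high = e1 in "Hh" and e2 in "Hh"
            sym = "H" if high else "L"
            # ... and emit the result in the phase of the inputs.
            self.y = sym if e1.isupper() else sym.lower()
        # Mixed phases: inconsistent input vector, keep the old value.
        return self.y


gate = FslAnd()
print(gate.update("H", "H"))  # both inputs in the upper-case phase
print(gate.update("h", "H"))  # inconsistent vector, output is held
print(gate.update("h", "l"))  # both inputs in the lower-case phase
```

The stored output is what makes the gate insensitive to input skew: while only one input has switched to the new phase, the gate keeps presenting the result of the previous phase.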
13
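Gate Template
[Figure: inputs E1 (E1.a, E1.b) and E2 (E2.a, E2.b) feed the functions fa(x) and fb(x); each function drives a memory element (Mem) that produces rail Y.a or Y.b of output Y]
Challenge: preserve delay insensitivity in the implementation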
14
Inverter
        1 (a,b)   0 (a,b)
HIGH    H (1,0)   h (1,1)
LOW     L (0,1)   l (0,0)
[Figure: inverter circuit operating on rail a and rail b]
Is the inverter delay insensitive?
15
FSL Gates
√ Combinational Gates
  √ AND, OR, INV, …
Register
17
Completion Detection
[Figure: SRC sends signal vectors (H l L h, H h L l) to a register at SNK; a completion detector (CMPD) generates the enable signal of the latch]
consistent data
18
FSL Register
[Figure: pipeline LATCH, f(x), LATCH, f(x), LATCH, f(x), LATCH; each register consists of a latch and completion detectors (CMPD)]
additional handshake signals are required
19
FSL Register
[Figure: register with latch, completion detectors (CMPD) and control block (CTRL) between input data and output data, embedded in the pipeline LATCH, f(x), LATCH, …]
Is the output data already consumed?
=> handshake signal from the next register required
20
FSL Register
[Figure: register with latch, completion detectors (CMPD) and control block (CTRL) between input data and output data]
When do we close the latch again?
=> when the input data was taken over
21
FSL Register
[Figure: register with latch, completion detectors (CMPD) and control block (CTRL) between input data and output data]
• Input data is ready to be consumed when all input signals carry the same phase => phase detector
• Input data is consumed when input and output carry the same phase => phase detector
• Output data was consumed when the output of the next register carries the same phase as the current output data => phase detector
Only phase detectors are required to generate all handshake signals
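The three phase comparisons can be combined into a single enable decision. The following Python sketch is one possible reading of the conditions above, not the control circuit of the slides; the function name `latch_pass` and its parameter names are assumptions.

```python
def latch_pass(phi_in, phi_out, phi_next):
    """Decide whether the latches of a register may open.

    phi_in   -- common phase of the input word, or None while the
                input signals do not yet carry the same phase
    phi_out  -- phase of this register's current output data
    phi_next -- phase of the next register's output data
    """
    input_ready = phi_in is not None        # all inputs carry the same phase
    input_consumed = phi_in == phi_out      # input was already taken over
    output_consumed = phi_next == phi_out   # next register holds our data
    return input_ready and not input_consumed and output_consumed


print(latch_pass(1, 0, 0))     # new data waiting, old data consumed
print(latch_pass(None, 0, 0))  # input word still inconsistent
print(latch_pass(1, 0, 1))     # next register has not consumed yet
```

Once the register has captured the input, `phi_out` equals `phi_in`, so the same expression becomes false again and the latches close without any additional signal.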
22
φ-Detector
        1 (a,b)   0 (a,b)
HIGH    H (1,1)   h (1,0)
LOW     L (0,0)   l (0,1)
[Figure: an XOR of rail a and rail b yields '0' for one phase and '1' for the other]
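As a sketch of what the XOR computes, the Python fragment below derives a phase bit per signal and a common phase for a whole word. The function names `phase_bit` and `word_phase` are illustrative, and which value of the bit corresponds to φ0 or φ1 is left open here.

```python
def phase_bit(rails):
    """XOR of rail a and rail b: the phase bit of one FSL signal."""
    a, b = rails
    return a ^ b

def word_phase(word):
    """Common phase bit of a word, or None while the word is inconsistent."""
    phases = {phase_bit(r) for r in word}
    return phases.pop() if len(phases) == 1 else None


# Rail pairs from the table on this slide.
H, h, L, l = (1, 1), (1, 0), (0, 0), (0, 1)

print(word_phase([H, L, H]))  # all signals in the upper-case phase
print(word_phase([h, l, l]))  # all signals in the lower-case phase
print(word_phase([H, l, L]))  # mixed phases: not yet consistent
```

A word-level detector of this kind is what a register needs to decide that all input signals carry the same phase.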
23
FSL Register
[Figure: control block (Ctrl) with inputs φin and φout and the signals c-done and pass; pass drives the latches between data in and data out]
• input data consistent and valid
• output data already consumed
• input data consumed
• freeze the latches again
24
FSL Gates
√ Combinational Gates
  √ AND, OR, INV, …
√ Register
  √ Latch
  √ φ-Detector
Memory
25
Memory
Two options:
• Store FSL signals directly
  • 4 bits per logical value
  • huge overhead but delay insensitive (in theory)
• Store only logical information
  • 1 bit per logical value
  • low overhead but not delay insensitive
26
Memory
[Figure: standard RAM with a φ-detector; converters (CONV) translate between FSL_Logic and STD_Logic on both sides of the RAM]
27
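φ-Converter
Std → FSL
Std   FSL
1     HIGH
0     LOW
[Figure: Std → FSL converter; an XOR with the requested φ generates rail a and rail b from the Std signal]
FSL → Std
Sig.    1 (a,b)   0 (a,b)
HIGH    H (1,1)   h (1,0)
LOW     L (0,0)   l (0,1)
[Figure: FSL → Std converter; the Std signal is derived from rail a and rail b]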
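Both conversions are small enough to sketch in Python. The fragment assumes that rail a carries the logic value and that a requested phase bit of 0 selects the codes H and L of the table above; the function names and the mapping of that bit to φ0 or φ1 are assumptions of this sketch, not statements from the slides.

```python
def std_to_fsl(bit, requested_phase):
    """Std -> FSL: rail a carries the bit, rail b is bit XOR requested phase."""
    return (bit, bit ^ requested_phase)

def fsl_to_std(rails):
    """FSL -> Std: drop the phase information and keep rail a."""
    return rails[0]


for bit in (0, 1):
    for requested_phase in (0, 1):
        rails = std_to_fsl(bit, requested_phase)
        print(bit, requested_phase, "->", rails, "->", fsl_to_std(rails))
```

With converters like these, a standard RAM can store one bit per logical value while the phase is regenerated on read-out, which corresponds to the low-overhead storage option.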
28
FSL Gates
√ Combinational Gates
  √ AND, OR, INV, …
√ Register
  √ Latch
  √ φ-Detector
√ Memory
  √ φ-Converter (FSL→Std, Std→FSL)
φ-Inverter
29
φ-Inverter
        1 (a,b)   0 (a,b)
HIGH    H (1,0)   h (1,1)
LOW     L (0,1)   l (0,0)
[Figure: rail a passes through unchanged, rail b is inverted]
=> simple inversion of rail b
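The effect of inverting rail b can be checked against the table on this slide with a short Python sketch; `phase_invert`, `CODES` and `NAMES` are illustrative names.

```python
def phase_invert(rails):
    """φ-inverter: keep rail a, invert rail b."""
    a, b = rails
    return (a, b ^ 1)


# Rail pairs from the table on this slide.
CODES = {"H": (1, 0), "h": (1, 1), "L": (0, 1), "l": (0, 0)}
NAMES = {rails: name for name, rails in CODES.items()}

for name, rails in CODES.items():
    print(name, "->", NAMES[phase_invert(rails)])
```

Every code maps to the code of the same logic value in the other phase, and applying the inverter twice restores the original code.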
30
FSL Gates
√ Combinational Gates
  √ AND, OR, INV, …
√ Register
  √ Latch
  √ φ-Detector
√ Memory
  √ φ-Converter (FSL→Std, Std→FSL)
√ φ-Inverter
31
Design Flow and Tools
Requirements:
• Standard tools (Synopsys/Quartus)
• Modelling on RTL level
• Support for simulation and synthesis
• Target platform FPGA
32
Adaptation: VHDL
• Definition of an FSL_logic type
• Redefinition of the std_logic_1164 package
• Additional functions
  • φ_det, φ_inv, conversion functions
  • stable
=> Modelling FSL circuits on RTL level
33
Example: Program Counter
stable_signals <= AddrInc & JmpExe & JmpAddr;

pc_next: process
begin
  stable(stable_signals);
  -- take the jump target when JmpExe is high in either phase
  if JmpExe = 'H' or JmpExe = 'h' then
    AddrNxt <= JmpAddr;
  else
    AddrNxt <= AddrInc;
  end if;
end process pc_next;
[Figure: JmpExe selects AddrNxt from JmpAddr and AddrInc; f(x) blocks before and after the selection]
34
Adaptation: Synthesis (1)
FSL Target Library
• FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector, …
Synthesis with FSL Target Library => Netlist
Package FSL_Rail
• Definition FSL_rail_logic: record (a,b)
• FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector, …
Netlist: replace FSL with FSL_rail
Synthesis with FPGA Target Library
[Figure: gate symbol "&" annotated with the FSL signals L, L, H and, after replacement, with the rail pairs (0,0), (0,0), (1,1)]
35
Adaptation: Synthesis (2)
[Figure: conventional design flow compared with the FSL design flow]
36
Adaptation: Simulation
Same testbench for FSL_logic and FSL_rail_logic circuits
=> Verification of FSL circuits
[Figure: a testbench with FSL stimuli and FSL response drives the FSL Logic circuit directly and the FSL Rail Logic circuit through conversion blocks]
37
Outline
√ Introduction
  √ Principles
  √ Basic gates
  √ Design flow and tools
Circuit design
  Pipeline
  Data paths
Current status
Conclusion
38
(Linear) Pipeline
[Figure: LATCH, f(x), LATCH, f(x), LATCH, f(x), LATCH with registers K1 to K4]
Full initialized: 0 1 0 1 0
Empty initialized: 0 0 0 0 0
39
Bubble Concept (1)
Progress is possible when a circuit contains at least one bubble
[Figure: registers K1 to K4 with the values 0 1 0 1 0 (no bubble) and 1 1 0 1 0 (bubble: two neighbouring stages hold identical values)]
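The bubble rule can be illustrated with a small phase-level model in Python. It is only a sketch under stated assumptions: each list entry is the phase value of one stage, the first and last entries stand for the neighbours on the SRC and SNK side and are held fixed, and the names `bubbles`, `can_fire` and `fire` as well as the example values are not taken from the slides.

```python
def bubbles(phases):
    """Indices i where stage i and stage i+1 hold identical values."""
    return [i for i in range(len(phases) - 1) if phases[i] == phases[i + 1]]

def can_fire(phases, i):
    """An inner stage may capture when new data waits behind it and its
    own data has already been copied by the stage ahead of it."""
    if not 0 < i < len(phases) - 1:
        return False
    return phases[i - 1] != phases[i] and phases[i] == phases[i + 1]

def fire(phases, i):
    """Stage i takes over the value of its predecessor."""
    assert can_fire(phases, i)
    result = list(phases)
    result[i] = result[i - 1]
    return result


full = [0, 1, 0, 1, 0]        # alternating values: no bubble
print(bubbles(full), [i for i in range(len(full)) if can_fire(full, i)])

one_bubble = [0, 1, 1, 0, 1]  # stages 1 and 2 hold identical values
print(bubbles(one_bubble), fire(one_bubble, 1))
```

In the second configuration only stage 1 can fire, and after it does the pair of identical values sits between stages 0 and 1: data moves forward while the bubble moves backward.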
40
Bubble Concept (2)
• Initialization => ensure that the circuit contains at least one bubble
• More than one bubble => higher processing speed
[Figure: registers K1 to K7 with the values 1 0 1 0 0 1 1 0]
41
Bubble Concept (3)
Bubbles can be consumed:
• Slow SRC → empty pipeline
• Slow SNK → full pipeline
[Figure: pipeline K2 to K6 between SRC and SNK containing one bubble]
42
Non-linear Pipeline
Definition: A non-linear pipeline is a pipeline which contains at least one feedback or forward path
[Figure: registers K1 to K6 with a forward path and a feedback path]
Consequences:
• Internal regulation; bubble cannot be consumed
• Potential sources of deadlocks
43
Non-linear Pipeline: Forward Path (1)
[Figure: full initialized pipeline K1 to K4 between SRC and SNK with a forward path through a φ-inv; the request signal is marked]
44
Non-linear Pipeline: Forward Path (2)
[Figure: operation of the pipeline K1 to K4 with a forward path through a φ-inv; switching sequence, steps 1 to 4]
45
Non-linear Pipeline: Forward Path (2)
[Figure: operation of the pipeline K1 to K4 with a forward path through a φ-inv; switching sequence, step 5]
46
Empty Non-linear Pipeline: Forward Path (1)
[Figure: empty initialized pipeline K1 to K4 with a forward path; position of the φ-inv shown for comparison]
empty initialized => no phase inverter is required
47
Empty Non-linear Pipeline: Forward Path (2)
[Figure: operation of the empty initialized pipeline K1 to K4 with a forward path; switching sequence, steps 1 to 4]
=> different phase inverter placement for full and empty initialized circuits
48
Conclusion: Feedforward Paths
• Ensure consistent inputs
• Phase inverter placement depends on initialisation
[Figure: full initialized and empty initialized pipelines K1 to K4 with a forward path, each with its switching sequence]
49
Non-linear Pipeline: Feedback Path (1)
[Figure: operation of the pipeline K1 to K4 with a feedback path through a φ-inv]
50
Non-linear Pipeline: Feedback Path (2)
[Figure: operation of the pipeline K1 to K4 with a feedback path through a φ-inv; switching sequence, steps 1 to 3]
51
Non-linear Pipeline: Feedback Path (2)
[Figure: operation of the pipeline K1 to K4 with a feedback path, without φ-inv]
52
Non-linear Pipeline: Feedback Path (2)
[Figure: operation of the pipeline K1 to K4 with a feedback path through a φ-inv, different initial values]
53
Conclusion: Feedback Paths
• Ensure inconsistent inputs
• Phase inverter placement depends on initialisation
[Figure: full initialized and empty initialized pipelines K1 to K4 with a feedback path, each with its switching sequence; a φ-inv appears in one of them]
54
Conceptual Difference: Feedback and Forward
Feedback path
• Init: ensure inconsistent inputs
• Only one input switches before K1 can fire
Forward path
• Init: ensure consistent inputs
• Either both or no input switch before K4 can fire
=> a well defined event sequence!
[Figure: pipelines K1 to K4 with a feedback path and with a forward path, each shown full and empty initialized with its switching sequence (s. seq.)]
56
"Invalid" Feedback Path
[Figure: register K1 with a feedback path onto itself through a φ-Inv]
φ-Inv:
• always required
• no inversion of the request signal
Definition: A valid feedback path must contain at least two register nodes
57
Phase Inverter Placement
• Phase inverters are required to avoid deadlocks
• Placement of phase inverters depends on:
  • Topology of the circuit
  • Type and number of components inside valid feedbacks
  • Dynamic behavior
  • Initialization
• Handshake signals have to be considered
• Processing speed depends on initialization
• More configurations are possible
59
Data Paths
[Figure: registers (Reg) and f(x) blocks connected through a deselecting node (DESEL. NODE) and a selecting node (SEL. NODE)]
Deselecting node:
• DEMUX
• FORK
Selecting node:
• MUX
• MERGE
60
Example: Merging Data Paths
[Figure: two data paths with registers and f(x) blocks feed the inputs In1 and In2 of a MUX followed by a register; Ack is returned to the data paths; data words DW1 (0), DW2 (1), DW3 (0), DW4 (1) travel with the path delays ∆1 and ∆3]
Assumption:
• Acknowledge is activated when the selected data is available
• Different delay for both data paths
Step 1: In1 selected
Step 2: In1 selected
Step 3: In2 selected
61
Example: Merging Data Paths (2)
Depending on the circuit functionality:
a) both inputs have to be consumed in each processing step
  • Ensure that the difference in processing speed in all data paths is small enough
  • Wait until all inputs are available (even the unused ones)
b) all inputs have to be processed and consumed
  • Insert synchronizer circuits to adjust the phase encoding of the input signals
62
DEMUX Example
[Figure: a register feeds a DEMUX that distributes the data words DW1 (0), DW2 (1), DW3 (0), DW4 (1) to two data paths with f(x) blocks and registers]
Avoid loss of synchronization
• Dummy data approach
• Synchronizer circuits
Performance considerations
• Wait on ack. of data paths
• Wait only on required ack.
63
Outline
√ Introduction
  √ Principles
  √ Basic gates
  √ Design flow and tools
√ Circuit design
  √ Pipeline
  √ Data paths
Current status
Conclusion
64
ASPEAR: Asynchronous SPEAR
65
Our Current Status
• semi-automated design flow (based on Synopsys)
66
Our Current Status
• semi-automated design flow (based on Synopsys)
• theoretical investigations
67
Our Current Status
• semi-automated design flow (based on Synopsys)
• theoretical investigations
• working 16-bit processor (on FPGA platform)
[Figure: APEX 20KC]
68
Our Current Status
• semi-automated design flow (based on Synopsys)
• working 16-bit processor (on FPGA platform)
• investigation of DI
• experimental robustness assessment (fault injection: synchronous design versus asynchronous)
69
Outline
√ Introduction
  √ Principles
  √ Basic gates
  √ Design flow and tools
√ Circuit design
  √ Pipeline
  √ Data paths
√ Current status
Conclusion
70
Conclusion FSL
• Four State Logic (FSL)
  • Delay insensitive logic
  • Two representations of Low/High => dual-rail encoding
  • Even combinational gates require memory elements
• Circuit design with FSL
  • Pipelines: linear and non-linear
  • Data paths: split and merge
• FSL based processor (SPEAR) available