Description: CMOS inverter problems for digital design
Metastability
Cairo University, Faculty of Engineering
Department of Electronics and Electrical Communications
Dr. Karim Ossama Abbas, 2012
Contents
Definition of metastability
Entering metastability
Control signal synchronizers
Reliability of synchronization
Mesochronous and multisynchronous systems
Main reference: Metastability and Synchronizers: A Tutorial, Ran Ginosar, IEEE CASS and CS
Metastable point
A metastable point is, strictly speaking, an equilibrium point.
However, it is different from the truly stable states 1 and 2.
The metastable point is extremely sensitive to disturbance.
If left alone, it will inevitably settle into stable state 1 or stable state 2.
Noise alone is enough to cause this.
Metastability in logic
As far as inverters are concerned:
Metastability is being in the high gain region, particularly close to the logic threshold.
The stable states are the low gain regions: the logic “0” region below VIL and the logic “1” region above VIH.
How do we enter metastability?
We enter metastability if data and clock switch simultaneously or too close together.
This is equivalent to a violation of setup or hold time.
This is almost inevitable if we cross clock domains without additional measures.
Metastability inside a flip flop
CLK is 0 and D changes from 0 to 1.
Node A is initially at 0.
Node A is supposed to charge to 1.
We have to give enough time for this to happen (setup time).
Metastability inside a flip flop
CLK changes from 0 to 1 too soon.
If this happens while A is midway through charging, then B is also midway through discharging.
A and B drive the inverters into a metastable state.
Metastability inside a flip flop
The two inverters WILL exit this metastable state through their positive feedback
However, it will take them too much time to exit
Metastability inside a flip flop
Metastability: When it takes longer than Tcq for data to appear at the output of a latch
Note that many people mistake metastability for wrong or invalid data appearing on output
Waveforms of a metastable latch
The input waveforms are moved closer to the clock by steps of 100ps
The closer the waveform, the longer it takes the output to settle
All must eventually settle
Waveforms of a metastable latch
The input waveforms are moved closer to the clock in steps of 1 ps.
Again, the closer the input transition is to the clock edge, the longer the output takes to settle.
Which direction the output settles in depends on noise and on how close we are to the inverter threshold.
Practical situations with metastability
Metastability may seem like an easy situation to avoid.
However, it is very common.
It is most commonly seen when crossing clock domains.
For example, consider an FFT circuit passing data to a baseband demodulator.
The two blocks use different and unrelated clocks.
Because the frequencies are independent, there will be periodic instances where data and clock edges are too close together.
Setup and hold times are periodically violated at the receiver.
Probability of metastability
First we need to know how often the latch enters metastability.
Assume that for metastability to occur, a change in the signal must happen within a certain window Tw around the clock edge (e.g. setup time + hold time).
Now assume the change is uniformly distributed within the clock period Tc (as with, e.g., two independent clock domains).
Thus the probability of entering metastability is Tw/Tc = Tw·Fc.
Rate of metastability
D does not change every cycle.
Assume D changes at a rate Fd < Fc.
Then the rate of metastability is Fd·Tw·Fc.
Consider the following example:
Fc = 1 GHz
Fd = 100 MHz (D changes on every tenth cycle)
Tw = 20 ps (typical value)
Rate = 2,000,000 times/sec = TWICE per MICROSECOND
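The arithmetic in the example above can be checked with a short sketch. The function and parameter names are illustrative only; the numbers are the ones quoted on the slide.

```python
def metastability_rate(f_data_hz: float, t_window_s: float, f_clock_hz: float) -> float:
    """Expected number of times per second the flip-flop enters metastability."""
    # Probability that one data transition lands inside the window: Tw / Tc = Tw * Fc
    p_per_transition = t_window_s * f_clock_hz
    # Data transitions arrive Fd times per second
    return f_data_hz * p_per_transition


# Slide values: Fc = 1 GHz, Fd = 100 MHz, Tw = 20 ps
rate = metastability_rate(f_data_hz=100e6, t_window_s=20e-12, f_clock_hz=1e9)
print(f"{rate:.3g} events per second")
print(f"{rate * 1e-6:.3g} events per microsecond")
```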
What does the metastability rate mean?
Metastability is a failure mode.
We get logic values that we don’t expect.
We may get the logic values we expect, but after a longer than expected delay.
Thus the following registers store the wrong values.
This error propagates and the circuit completely fails.
If this happens twice per microsecond, then the circuit is completely useless.
Circuit model to exit metastability
In metastability both inverters in the master latch are in the high gain region.
The inverters are modelled as amplifiers.
Their gain is -A and their small-signal output impedance is R.
C is the parasitic capacitance from gates and drains.
Behaviour to exit metastability
A small difference exists between Va and Vb (due to initialization or noise).
This difference is quickly amplified by the positive feedback.
The master latch quickly resolves to a stable state.
However, this is not as quick as normal operation, where the input is itself DRIVEN and is DRIVING the inverter pair.
Notes on factors affecting metastability exit
C, A, and R affect the time to exit metastability by setting the time constant.
C is the parasitic capacitance and, to first order, scales with technology.
Three-dimensional effects and wiring capacitance are disrupting this trend.
R and A combine into the analog transconductance gm = A/R.
The time constant was generally expected to drop with technology, but it has not scaled well lately.
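A sketch of where the time constant comes from, assuming the two-amplifier model described in the preceding slides (each inverter is an amplifier of gain -A with output resistance R driving a capacitance C), with Va and Vb measured relative to the metastable point. The symbols τ (time constant) and V_0 (initial difference between Va and Vb) are introduced here for illustration:

```latex
\begin{align*}
C\,\frac{dV_a}{dt} &= \frac{-A\,V_b - V_a}{R}, &
C\,\frac{dV_b}{dt} &= \frac{-A\,V_a - V_b}{R} \\
C\,\frac{d(V_a - V_b)}{dt} &= \frac{(A-1)\,(V_a - V_b)}{R} \\
V_a(t) - V_b(t) &= V_0\,e^{t/\tau}, &
\tau &= \frac{RC}{A-1} \approx \frac{RC}{A} = \frac{C}{g_m}
\end{align*}
```

Under these assumptions the difference voltage grows exponentially, and a smaller C or a larger gm gives a shorter time constant.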
Graph of exiting metastability
For a long time after the edge the voltage appears to remain constant!
Where is the exponential growth?
The linear scale is fooling us.
The visible part of the growth is not the exponential part.
Log scale graph of exiting metastability
Exponential growth starts at the clock edge.
We exit the exponential regime once the inverters enter their low gain regions (passing VIH or VIL).
Thus we trace the time between the initial voltage difference Vo and the entry into the stable regions at voltage V1.
Time to exit metastability
The time to exit metastability depends logarithmically on the entering voltage Vo
V1 is a constant.
The larger Vo, the shorter the time.
If Vo = V1 then time = 0 (there is no metastability).
If Vo = 0 then time = infinity (impossible due to noise).
Thus metastability is NOT being exactly in the middle and exiting randomly; it is entering the high gain region and exiting in a known but long time.
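The formula itself did not survive in this transcript. A relation consistent with every statement above (logarithmic in Vo, zero when Vo = V1, infinite when Vo = 0) is t = τ·ln(V1/Vo), which follows if the difference voltage grows as Vo·e^(t/τ). The sketch below assumes that relation; the values of τ and V1 are illustrative assumptions, not figures from the slides.

```python
import math


def time_to_exit(v0: float, v1: float, tau: float) -> float:
    """Time for the difference voltage to grow from v0 to v1, assuming V(t) = v0 * exp(t / tau)."""
    if v0 <= 0:
        return math.inf  # a zero initial difference never grows in this noiseless model
    if v0 >= v1:
        return 0.0       # already outside the high gain region: no metastability
    return tau * math.log(v1 / v0)


TAU = 10e-12  # assumed time constant (10 ps)
V1 = 0.5      # assumed exit voltage difference in volts (hypothetical value)

for v0 in (0.5, 0.05, 5e-3, 5e-6, 5e-9):
    t_ps = time_to_exit(v0, V1, TAU) * 1e12
    print(f"Vo = {v0:8.1e} V -> exit after {t_ps:6.1f} ps")
```

Each factor of ten reduction in Vo adds the same fixed amount of time (τ·ln 10), which is what the logarithmic dependence means in practice.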
Probability of remaining in metastability
In general we have no idea how large Vo is.
It is affected by the exact conditions when the edge occurred.
It also depends on noise and coupled signals!
We do know, though, that the voltage difference grows exponentially.
So we can obtain a value for the PROBABILITY that the latch or flip flop is still metastable at a time S.
If a flip-flop enters metastability, the probability that it is still metastable at a time t > 0 later is:
$P(t) = e^{-t/\tau}$
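One way to see why the probability decays as e^(-t/τ) is a small Monte Carlo experiment. It rests on two assumptions that are not stated in the slides: the initial difference Vo is uniformly distributed between 0 and V1, and the exit time is τ·ln(V1/Vo). Under those assumptions the flip-flop is still metastable at time S exactly when Vo < V1·e^(-S/τ), which happens with probability e^(-S/τ).

```python
import math
import random


def fraction_still_metastable(s_over_tau: float, trials: int = 100_000, seed: int = 1) -> float:
    """Fraction of simulated metastable events that have not resolved after S = s_over_tau * tau."""
    rng = random.Random(seed)
    still_metastable = 0
    for _ in range(trials):
        # Assumption: Vo / V1 is uniform on (0, 1]
        ratio = 1.0 - rng.random()
        # Assumed exit time in units of tau: t / tau = ln(V1 / Vo)
        t_exit_over_tau = math.log(1.0 / ratio)
        if t_exit_over_tau > s_over_tau:
            still_metastable += 1
    return still_metastable / trials


for s in (1.0, 2.0, 3.0):
    simulated = fraction_still_metastable(s)
    print(f"S = {s:.0f} tau: simulated {simulated:.4f}, exp(-S/tau) = {math.exp(-s):.4f}")
```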
Synchronization
We can NEVER guarantee that asynchronous signals will not cause metastability.
In fact we can GUARANTEE they will cause metastability at a certain rate.
The best we can do is design a scheme that makes the probability of metastability extremely low.
Condition for failure
Now assume we sample a metastable circuit at a time S after it enters metastability, i.e. at a time S after the clock edge that caused metastability.
We define a failure as the signal still being metastable at that later sampling time.
This is a two-component event:
We enter metastability.
At a time S later we are still in metastability.
Probability and rate of entering metastability
Remember that the probability of metastability is Tw·Fc (Tw is the critical time window around the clock edge and Fc is the clock frequency).
And the rate of metastability is Fd·Tw·Fc (Fd is the rate of data change).
MTBF
The inverse of a rate is a time.
The inverse of the rate of failure is the Mean Time Between Failures (MTBF).
Our aim is thus to increase the MTBF as much as we can.
Solution to metastability?
Let’s assume we give the circuit one full cycle to resolve from metastability before sampling.
This means S = one complete cycle time.
Using typical values for 28nm technology (33nm modified):
Time constant = 10 ps, Tw = 20 ps, Fc = 1 GHz, Fd = 100 MHz, and S = one complete cycle.
MTBF = 4×10^29 years
If the circuit fails due to metastability, the next failure probably happens after the end of the world!
That’s safe enough.
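The MTBF figure can be reproduced by combining the two components of a failure: the rate of entering metastability (Fd·Tw·Fc) and the probability of still being metastable after S (e^(-S/τ)). The MTBF is the inverse of their product, MTBF = e^(S/τ) / (Tw·Fc·Fd). The sketch below uses the slide values; the function name and the year length are illustrative choices.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(s: float, tau: float, t_window: float, f_clock: float, f_data: float) -> float:
    """MTBF = exp(S / tau) / (Tw * Fc * Fd), all quantities in SI units."""
    failure_rate = f_data * t_window * f_clock * math.exp(-s / tau)
    return 1.0 / failure_rate


# Slide values: tau = 10 ps, Tw = 20 ps, Fc = 1 GHz, Fd = 100 MHz, S = one 1 GHz cycle = 1 ns
mtbf = mtbf_seconds(s=1e-9, tau=10e-12, t_window=20e-12, f_clock=1e9, f_data=100e6)
print(f"MTBF = {mtbf / SECONDS_PER_YEAR:.1e} years")

# Sensitivity: allowing only half a cycle for resolution shrinks the MTBF dramatically
half = mtbf_seconds(s=0.5e-9, tau=10e-12, t_window=20e-12, f_clock=1e9, f_data=100e6)
print(f"MTBF with S = 0.5 ns: {half / SECONDS_PER_YEAR:.1e} years")
```

Because S appears in the exponent, any time removed from the resolution budget reduces the MTBF exponentially.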
Observation of metastability at output
If node A is metastable, what do we see at the output?
Q is three inverters later.
We will not see an undefined value; we will see either ‘1’ or ‘0’.
Observation of metastability at output
However the ‘1’ or ‘0’ may switch before settling to the correct value
It will take longer than Tcq for the correct value to appear on Q
This is the true definition of metastability at output port
Two flip flop synchronization circuit
There is a finite and fairly high chance that Q1 goes metastable.
However, FF2 samples this output a time S later (one cycle later).
The probability that Q2 will go metastable is very low.
We must subtract the following from S: wire delay and the setup time of FF2.
Scenarios in the 2 FF synchronization circuit
Scenario a:
FF1 catches the correct value of D1 in cycle 1.
FF2 samples this value in cycle 2.
The value appears at the output of FF2 in cycle 2.
Q2 is correct in cycle 3.
Scenarios in the 2 FF synchronization circuit
Scenario b:
FF1 completely misses the 1 on D1.
Q1 remains at 0.
FF1 samples the correct D1 in cycle 2.
Q2 is correct in cycle 3.
Scenarios in the 2 FF synchronization circuit
Scenario c:
FF1 goes metastable.
Q1 resolves to the correct value, but very slowly.
However, there is negligible probability that it will fail to resolve before the next edge (a time S later).
FF2 samples the correct value in cycle 2.
Q2 is correct in cycle 3 (except once every MTBF years).
Scenarios in the 2 FF synchronization circuit
Scenario d:
FF1 goes metastable.
Q1 resolves to the incorrect value (logic low).
What is the difference between cases b and d? (Missing the value versus resolving metastability to the wrong value.)
However, in cycle 2 the correct logic value is registered.
FF2 samples the correct value from Q1 a cycle later.
Q2 is correct in cycle 3.
Scenarios in the 2 FF synchronization circuit
Scenario e:
FF1 goes metastable and its output goes high.
Q1 then glitches and resolves metastability at 0.
FF1 samples the correct value comfortably in cycle 2.
Q2 is correct in cycle 3.
Scenarios in the 2 FF synchronization circuit
Scenario f:
FF1 goes metastable and its output shoots high.
Q1 resolves to the correct value (staying high).
FF2 samples the correct value in cycle 2.
Q2 is correct in cycle 3.
Scenarios
Data can be:
Missed
Caught
Metastably sampled
Resolves to the correct value
Resolves to the wrong value
Shoots to either value, then resolves to either value
The only common factor between all scenarios is that one cycle later the second flip flop CERTAINLY (or at least within an MTBF) samples correctly.
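The common outcome of the scenarios can be illustrated with a minimal behavioural sketch. It is not a circuit simulation: it assumes D1 rises around the first clock edge and then stays at 1, and that whatever happens inside FF1 on that edge, Q1 has settled to a clean 0 or 1 before the next edge (the MTBF argument). The function name and the single `first_edge_outcome` parameter are hypothetical simplifications that fold all scenarios into the value Q1 ends up with after edge 1.

```python
def run_synchronizer(first_edge_outcome: int, edges: int = 3):
    """Return the list of (Q1, Q2) values after each clock edge.

    first_edge_outcome is the level Q1 settles to after the first edge:
    0 models "missed" or "resolved low", 1 models "caught" or "resolved high".
    """
    q1, q2 = 0, 0
    history = []
    for edge in range(1, edges + 1):
        # Both flip-flops clock on the same edge, so FF2 captures the OLD value of Q1
        next_q2 = q1
        # Edge 1: D1 is changing, so the result is uncertain. Later edges: D1 is stable at 1
        next_q1 = first_edge_outcome if edge == 1 else 1
        q1, q2 = next_q1, next_q2
        history.append((q1, q2))
    return history


for outcome in (0, 1):
    print(f"Q1 settles to {outcome} after edge 1: {run_synchronizer(outcome)}")
```

In both cases Q2 is 1 after the third edge; when FF1 catches or resolves high, Q2 is already correct one edge earlier. This is the one-cycle uncertainty of the synchronizer.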
Turning the synchronization circuit into a synchronizer
If data is to be sampled correctly, it must be held on the input for up to two cycles (sometimes one!).
How do we know how long to hold it?
Turning the synchronization circuit into a synchronizer
The sender should put its data on a bus and raise the request (req) control signal.
The receiver sees the request and reads the data, then raises acknowledge (ack).
The sender can now change the data and raise req again.
Turning the synchronization circuit into a synchronizer
But req is raised on the transmitter clock and ack is raised on the receiver clock!
Sampling them in the other clock domain could cause metastability.
We synchronize the two controls through 2 FF synchronizers.
The worst case is that a control is read a cycle late.
No problem! Conservative!
Turning the synchronization circuit into a synchronizer
Note that data is NEVER passed through synchronizers.
FF synchronizers are ONLY used with control signals.
Data is on busses.
Synchronizers have a one cycle uncertainty on when they manage to synchronize.
Turning the synchronization circuit into a synchronizer
If a data bus were passed through synchronizers, it is certain that at some point some bits of the bus would synchronize in the first cycle and others in the second cycle.
One wrong bit on the bus is a failure of the whole data bus.
Overhead of synchronization
It takes 2 cycles (always consider the worst case) to synchronize req.
2 cycles to synchronize ack.
1 cycle at each side (at least) to read data.
Now, if another transfer requires a similar change in controls, then req and ack must be lowered to get ready for the next transfer.
Transmitter reads ack high (2 cycles) and lowers req (1 cycle).
Receiver sees req going low (2 cycles) and lowers ack (1 cycle).
Overhead is 2+2+1+1+2+2+1+1 = 12 cycles to transmit one word!!
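The cycle count above can be tallied step by step. The step labels paraphrase the slide. The throughput figure at the end is only an illustration and assumes, hypothetically, that both sides run at 1 GHz so that all cycles have the same length.

```python
SYNC_CYCLES = 2  # worst-case latency through a two flip-flop synchronizer
STEP_CYCLES = 1  # one cycle for a side to act on what it has seen

handshake_steps = [
    ("synchronize req", SYNC_CYCLES),
    ("synchronize ack", SYNC_CYCLES),
    ("one cycle at the receiver side", STEP_CYCLES),
    ("one cycle at the transmitter side", STEP_CYCLES),
    ("transmitter reads ack high", SYNC_CYCLES),
    ("receiver sees req going low", SYNC_CYCLES),
    ("transmitter lowers req", STEP_CYCLES),
    ("receiver lowers ack", STEP_CYCLES),
]

total_cycles = sum(cycles for _, cycles in handshake_steps)
for label, cycles in handshake_steps:
    print(f"{cycles} cycle(s): {label}")
print(f"total: {total_cycles} cycles per word")

# Illustrative throughput, assuming a common 1 GHz clock on both sides (hypothetical)
ASSUMED_CLOCK_HZ = 1e9
words_per_second = ASSUMED_CLOCK_HZ / total_cycles
print(f"about {words_per_second / 1e6:.0f} million words per second")
```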
Asynchronous FIFO for high payload transfer
In case we have a lot of data words to transfer in bulk, the best approach is the simplest.
We will use a two port, two clock asynchronous FIFO.
FIFO = First In First Out.
Conceptually it is simply a shift register.
However, it has indicators for EMPTY and FULL.
It is two port because one port writes data and the other reads.
If the two ports are controlled by different clocks, it is asynchronous.
Two port FIFO
Simply a two port RAM.
The write port has a write pointer (last written address).
The read port has a read pointer (last read address).
Reads and writes can only be sequential.
Empty and Full, what do they mean?
Empty indicates there is no more data to read.
Empty is significant to the receiver.
Empty tells the receiver to stop reading until Empty goes low.
Full indicates we have run out of space to write.
Full is significant to the transmitter.
Full tells the transmitter to stop writing until Full goes low.
Empty and Full, how to calculate?
Empty happens when the read and write pointers are equal.
But Full also occurs when the read and write pointers are equal!!
INSERT FIGURE OF QUEUE BEING READ AND WRITTEN
Empty and Full, how to calculate?
However:
Empty occurs when we have JUST read and the read and write pointers are equal.
Full occurs when we have JUST written and the read and write pointers are equal.
So if the read pointer is incremented and we get equal pointers: EMPTY.
If we write and we get equal pointers: FULL.
INSERT FIGURE OF QUEUE BEING READ AND WRITTEN
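The rule above (equal pointers after a read means Empty, equal pointers after a write means Full) can be sketched in a few lines. This is a single-clock software model with a hypothetical class name: it follows the slide's "last written / last read address" pointer convention and deliberately ignores clock domains and synchronizers.

```python
class TwoPortFifo:
    """Ring buffer whose flags follow the 'who moved last' rule."""

    def __init__(self, depth: int):
        self.depth = depth
        self.mem = [None] * depth
        self.write_ptr = 0  # last written address
        self.read_ptr = 0   # last read address
        self.empty = True
        self.full = False

    def write(self, word):
        if self.full:
            raise RuntimeError("write attempted while Full")
        self.write_ptr = (self.write_ptr + 1) % self.depth
        self.mem[self.write_ptr] = word
        self.empty = False
        # Pointers equal right after a write: the writer has caught up with the reader
        self.full = self.write_ptr == self.read_ptr

    def read(self):
        if self.empty:
            raise RuntimeError("read attempted while Empty")
        self.read_ptr = (self.read_ptr + 1) % self.depth
        word = self.mem[self.read_ptr]
        self.full = False
        # Pointers equal right after a read: the reader has caught up with the writer
        self.empty = self.read_ptr == self.write_ptr
        return word


fifo = TwoPortFifo(depth=4)
for word in ("w0", "w1", "w2", "w3"):
    fifo.write(word)
print("after 4 writes: full =", fifo.full, "empty =", fifo.empty)
print("read order:", [fifo.read() for _ in range(4)])
print("after 4 reads: full =", fifo.full, "empty =", fifo.empty)
```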
Who increments the pointers?
Read is incremented by the receiver whenever it reads a new word.
Write is incremented by the transmitter whenever it writes a new word.
Full and Empty = NOT(Read - Write), i.e. both are detected when the two pointers are equal.
Thus calculating both Full and Empty requires reading pointers from two clock domains!
Synchronizing pointers
The two pointers must be passed through synchronization FF’s before going to the other side for comparison
The read pointer is generated by the receiver.
The write pointer is generated by the transmitter.
Going EMPTY
The receiver is reading, but the transmitter is slower or has stopped writing
The receiver increments the read pointer.
It reads the write pointer through the synchronizer.
If the pointers are equal, it raises the Empty flag and stops reading.
Going EMPTY – Synch one cycle late
What happens if the synchronizer is one cycle late?
The write pointer may have been incremented, but we miss the increment until one cycle later.
This means we raise Empty when the FIFO is not empty.
This is safe: we unnecessarily stop reading for one cycle, no problem.
There is no scenario where we fail to raise Empty when the FIFO is empty.
Going FULL
The transmitter is writing, but the receiver is slower or has stopped reading
The transmitter increments the write pointer.
It reads the read pointer through the synchronizer.
If the pointers are equal, it raises the Full flag and stops writing.
Going FULL – Synch one cycle late
What happens if the synchronizer is one cycle late?
The read pointer may have been incremented, but we miss the increment until one cycle later.
This means we raise Full when the receiver has already cleared one address.
This is safe: we unnecessarily stop writing for one cycle, no problem.
There is no scenario where we fail to raise Full when the FIFO is full.
Advantages of FIFO
Much lower latency than normal synchronizers.
We are held up for at most 2 cycles if the Empty or Full flags are asserted.
Once reading or writing has resumed, it is synchronous with the port clock.
In other words, there is no handshaking unless we hit the Empty or Full conditions; while the read and write pointers are far apart, each port behaves like a synchronous circuit.
We are cheating ... a little
We previously said we shouldn’t use synchronizers with data words.
Pointers are not single bit controls, they are data busses.
The pointer counters are Gray encoded; they do not simply increment in binary.
With Gray encoding only one bit changes on every increment.
Only the bit that has changed can be sampled metastably, so even if the synchronizer resolves it the wrong way, the pointer read is either the old value or the new value.
Usually works well.
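A short sketch of the Gray encoding property relied on here. The conversion used is the standard reflected binary Gray code (n XOR n shifted right by one); the function names and the 3-bit pointer width are illustrative choices.

```python
def binary_to_gray(n: int) -> int:
    """Reflected binary Gray code of n."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse conversion: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 3  # illustrative pointer width: 8 addresses
DEPTH = 1 << WIDTH

for value in range(DEPTH):
    nxt = (value + 1) % DEPTH  # includes the wrap-around from 7 back to 0
    print(
        f"{value} -> {nxt}: "
        f"binary changes {bits_changed(value, nxt)} bit(s), "
        f"Gray {binary_to_gray(value):0{WIDTH}b} -> {binary_to_gray(nxt):0{WIDTH}b} "
        f"changes {bits_changed(binary_to_gray(value), binary_to_gray(nxt))} bit(s)"
    )
```

A plain binary counter can change several bits at once (for example 3 to 4 flips three bits), so a synchronizer could capture a mixture of old and new bits. With Gray code only one bit is ever in transition.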