Metastability Cairo University Faculty of Engineering Department of Electronics and Electrical Communications Dr. Karim Ossama Abbas 2012

Post on 26-Jan-2016

DESCRIPTION

CMOS inverter problems for digital design

TRANSCRIPT

Metastability

Cairo University, Faculty of Engineering, Department of Electronics and Electrical Communications, Dr. Karim Ossama Abbas, 2012

Contents

Definition of metastability. Entering metastability. Control signal synchronizers. Reliability of synchronization. Mesochronous and multisynchronous systems.

Main reference: Metastability and Synchronizers: A Tutorial, Ran Ginosar, IEEE CASS and CS

Metastable point

A metastable point is, strictly speaking, a stable (equilibrium) point. However, it is different from the truly stable states 1 and 2. The metastable point is extremely sensitive: if left alone, it will inevitably settle on either stable point 1 or 2. Noise alone is enough to do this.

Metastability in logic

As far as inverters are concerned, metastability is being in the high gain region, particularly close to the logic threshold. The stable states are the low gain regions: the logic “0” region below VIL and the logic “1” region above VIH.

How do we enter metastability?

We enter metastability if data and clock are switched simultaneously or too close together. This is equivalent to a violation of setup or hold time. It is almost inevitable if we cross clock domains without additional measures.

Metastability inside a flip flop

CLK is 0 and D changes from 0 to 1. Node A is initially at 0 and is supposed to charge to 1. We have to give enough time for this to happen (the setup time).

Metastability inside a flip flop

CLK changes too soon from 0 to 1. If this happens while A is midway through charging, then B is also midway through discharging. A and B drive the inverters into a metastable state.

Metastability inside a flip flop

The two inverters WILL exit this metastable state through their positive feedback

However, it will take them too much time to exit

Metastability inside a flip flop

Metastability: When it takes longer than Tcq for data to appear at the output of a latch

Note that many people mistake metastability for wrong or invalid data appearing on output

Waveforms of a metastable latch

The input waveforms are moved closer to the clock by steps of 100ps

The closer the waveform, the longer it takes the output to settle

All must eventually settle

Waveforms of a metastable latch

The input waveforms are moved closer to the clock by steps of 1ps

Again, the closer we are, the longer it takes to settle. Which direction we settle in depends on noise and on how close to the inverter threshold we are.

Practical situations with metastability

Metastability may seem like an easy situation to avoid. However, it is very common. It is most commonly seen while crossing clock domains. For example, consider an FFT circuit passing data to a baseband demodulator. The two use different and unrelated clocks. Because the frequencies are independent, there will be periodic instances where data and clock edges are too close together.

Setup and hold times are periodically violated for the receiver.

Probability of metastability

First we need to know how often the latch enters metastability. Assume that for metastability to occur, a change in the signal must happen within a certain window Tw around the clock edge (e.g. the setup plus hold times).

Now let’s assume the change is uniformly distributed within the clock period Tc (as is the case for, e.g., two independent clock domains).

Thus the probability of entering metastability is Tw/Tc = Tw·Fc.

Rate of metastability

D does not change every cycle. Assume D changes at a rate Fd < Fc. Then the rate of metastability is Fd·Tw·Fc. Consider the following example: Fc = 1 GHz, Fd = 100 MHz (D changes on every tenth cycle), Tw = 20 ps (typical value).

Rate = 2,000,000 times/sec = TWICE per MICROSECOND
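The arithmetic of this example can be checked with a short Python sketch. The function and constant names are illustrative only; the numbers are the ones quoted on the slide.

```python
def entry_probability(t_window_s, f_clock_hz):
    """Probability that a single data change falls inside the window Tw.

    Assumes the change is uniformly distributed over the clock period,
    so the probability is Tw / Tc = Tw * Fc.
    """
    return t_window_s * f_clock_hz


def metastability_rate(f_data_hz, t_window_s, f_clock_hz):
    """Expected number of metastability events per second: Fd * Tw * Fc."""
    return f_data_hz * entry_probability(t_window_s, f_clock_hz)


F_C = 1e9      # clock frequency, 1 GHz
F_D = 100e6    # data change rate, 100 MHz
T_W = 20e-12   # vulnerable window, 20 ps

probability = entry_probability(T_W, F_C)
rate = metastability_rate(F_D, T_W, F_C)

print(f"Probability per data change: {probability:.3f}")
print(f"Metastability events per second: {rate:.3g}")
print(f"Events per microsecond: {rate * 1e-6:.3g}")
```

With these values the probability per data change is about 0.02 and the rate is about two events per microsecond.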

What does the metastability rate mean?

Metastability is a failure mode. We get logic values that we don’t expect, or we may get the logic values we expect but after a longer than expected delay. Thus the following registers store the wrong values. This error propagates and the circuit completely fails. If this happens twice per microsecond, then the circuit is completely useless.

Circuit model to exit metastability

In metastability both inverters in the master latch are in the high gain region

The inverters are modelled as amplifiers

Their gain is –A, the output impedance is R (small signal)

C is the parasitic capacitance from gates and drains

Behaviour to exit metastability

A small difference exists between Va and Vb (due to initialization or noise)

This difference is quickly amplified by the positive feedback

The master latch quickly resolves to a stable state

However, this is not as quick as normal operation where input is DRIVEN and DRIVING the inverter pair

Mathematical behaviour of inverter pair
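The equations for this slide did not survive in the transcript. What follows is a reconstruction under the small-signal model described above (each inverter an amplifier of gain −A with output resistance R driving a capacitance C), not necessarily the exact form on the original slide. Va and Vb are taken relative to the metastable point, and the symbols V, V_0, and τ are introduced here for the derivation.

```latex
% Node equations for the cross-coupled pair (small signal, around the metastable point)
C \frac{dV_a}{dt} = \frac{-A V_b - V_a}{R}, \qquad
C \frac{dV_b}{dt} = \frac{-A V_a - V_b}{R}

% Subtracting the two, with V = V_a - V_b:
\frac{dV}{dt} = \frac{A - 1}{RC}\, V
\quad\Longrightarrow\quad
V(t) = V_0\, e^{t/\tau}, \qquad
\tau = \frac{RC}{A - 1} \approx \frac{RC}{A} = \frac{C}{g_m} \quad (A \gg 1)
```

The difference voltage therefore grows exponentially from its initial value V_0 with time constant τ, which is why C, A, and R together set the time needed to exit metastability.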

Notes on factors affecting metastability exit

C, A, and R affect the time to exit metastability by affecting the time constant. C is the parasitic capacitance and, to first order, scales with technology. Three-dimensional effects and wiring capacitances are disrupting this trend.

R and A combine into the analog transconductance gm = A/R. The time constant was generally expected to drop with each technology generation, but it has not scaled well lately.

Graph of exiting metastability

For a long time after the edge the voltage remains constant!

Where is the exponential growth?

The linear scale is fooling us

The visible part of the growth is not the exponential part

Log scale graph of exiting metastability

Exponential growth starts right at the edge. We exit the exponential regime once the inverters enter their low gain regions (passing VIH or VIL). Thus we measure the time between the initial voltage difference Vo and entering the stable regions at voltage V1.

Time to exit metastability

The time to exit metastability depends logarithmically on the entering voltage Vo

V1 is a constant. The larger the Vo, the shorter the time. If Vo = V1 then time = 0 (there is no metastability). If Vo = 0 then time = infinity (impossible due to noise). Thus metastability is NOT being exactly in the middle and exiting randomly; it is entering the high gain region and exiting in a known but long time.

Time to exit is a log function of starting voltage difference
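The logarithmic dependence can be illustrated numerically. The sketch below assumes exponential growth of the voltage difference, V(t) = Vo·exp(t/τ), so the exit time is τ·ln(V1/Vo). The values τ = 10 ps and V1 = 0.5 V are illustrative assumptions, not figures taken from this slide.

```python
import math


def exit_time(tau_s, v0, v1):
    """Time for the difference voltage to grow from v0 to v1.

    Assumes V(t) = v0 * exp(t / tau), hence t = tau * ln(v1 / v0).
    """
    if not 0 < v0 <= v1:
        raise ValueError("need 0 < v0 <= v1; v0 = 0 would never resolve")
    return tau_s * math.log(v1 / v0)


TAU = 10e-12   # assumed time constant, 10 ps
V1 = 0.5       # assumed voltage at which the stable region is entered

# Each tenfold reduction of the starting difference adds the same fixed delay.
for v0 in (0.5, 0.05, 0.005, 0.0005):
    print(f"Vo = {v0:g} V -> exit time = {exit_time(TAU, v0, V1) * 1e12:.1f} ps")
```

Shrinking Vo by a factor of ten adds τ·ln(10) to the exit time each time, and Vo = V1 gives zero delay.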

Probability of remaining in metastability

In general we have no idea how large Vo is. It is affected by the exact conditions when the edge occurred. It also depends on noise and coupled signals!

We do know though that voltage difference grows exponentially

So we can obtain a value for the PROBABILITY that the latch or flip flop is still metastable at a time S.

If a flip-flop enters metastability, the probability that it is still metastable at a time t > 0 later is:

$$P(\text{still metastable at } t) = e^{-t/\tau}$$

Synchronization

We can NEVER guarantee that asynchronous signals will not cause metastability. In fact we can GUARANTEE they will cause metastability at a certain rate. The best we can do is design a scheme that makes the probability of metastability extremely low.

Condition for failure

Now assume we are sampling a metastable circuit at a time S after it enters metastability, i.e. at a time S after the clock edge that caused metastability.

We define a failure as the signal remaining metastable at that later time we are sampling at.

This is a two component event: we enter metastability, and at a time S later we are still in metastability.

Probability and rate of entering metastability

Remember that the probability of metastability is Tw·Fc (Tw is the critical time around the clock edge and Fc is the clock frequency), and the rate of metastability is Fd·Tw·Fc (Fd is the rate of data change).

MTBF

The inverse of rate is time. The inverse of the rate of failure is the Mean Time Between Failures (MTBF). Our aim is thus to increase MTBF as much as we can.

Solution to metastability?

Let’s assume we give the circuit one full cycle to resolve from metastability before sampling.

This means S = one complete cycle time. Using typical values for 28nm technology (33nm modified): time constant = 10 ps, Tw = 20 ps, Fc = 1 GHz, Fd = 100 MHz, and S = one complete cycle, we get MTBF = 4×10^29 years. If the circuit fails due to metastability, the next failure probably happens after the end of the world! That’s safe enough.
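The MTBF figure can be reproduced with a few lines of Python. The sketch combines the two components of a failure described earlier: the rate of entering metastability (Fd·Tw·Fc) and the probability exp(−S/τ) of still being metastable after a time S. That gives MTBF = exp(S/τ) / (Tw·Fc·Fd). The function name and the seconds-per-year constant are choices made for this example.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(resolve_time_s, tau_s, t_window_s, f_clock_hz, f_data_hz):
    """Mean time between synchronization failures, in seconds.

    failure rate = (Fd * Tw * Fc) * exp(-S / tau); MTBF is its inverse.
    """
    entry_rate = f_data_hz * t_window_s * f_clock_hz
    p_still_metastable = math.exp(-resolve_time_s / tau_s)
    return 1.0 / (entry_rate * p_still_metastable)


F_C = 1e9        # clock frequency, 1 GHz
F_D = 100e6      # data change rate, 100 MHz
T_W = 20e-12     # vulnerable window, 20 ps
TAU = 10e-12     # resolution time constant, 10 ps
S = 1.0 / F_C    # one complete clock cycle allowed for resolution

mtbf_years = mtbf_seconds(S, TAU, T_W, F_C, F_D) / SECONDS_PER_YEAR
print(f"MTBF = {mtbf_years:.2e} years")
```

With S equal to one full cycle this comes out at roughly 4×10^29 years, matching the value quoted above. Because S sits in the exponent, wire delay and setup time taken out of S reduce the MTBF exponentially, not linearly.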

Observation of metastability at output

If node A is metastable, what do we see at the output? Q is three inverters later. We will not see an undefined value; we will see either ‘1’ or ‘0’.

Observation of metastability at output

However the ‘1’ or ‘0’ may switch before settling to the correct value

It will take longer than Tcq for the correct value to appear on Q

This is the true definition of metastability at output port

Two flip flop synchronization circuit

There is a finite, and fairly high, chance that Q1 goes metastable. However, FF2 samples this output a time S later (one cycle later). The probability that Q2 will go metastable is very low. We must subtract the following from S: wire delay and the setup time of FF2.

Scenarios in the 2 FF synchronization circuit

Scenario a: FF1 catches the correct value of D1 in cycle 1. FF2 samples this value in cycle 2. The value appears at the output of FF2 in cycle 2. Q2 is correct in cycle 3.

Scenarios in the 2 FF synchronization circuit

Scenario b: FF1 completely misses the 1 on D1. Q1 remains at 0. FF1 samples the correct D1 in cycle 2. Q2 is correct in cycle 3.

Scenarios in the 2 FF synchronization circuit

Scenario c: FF1 goes metastable. Q1 resolves to the correct value, but very slowly. However, there is negligible probability that it will fail to resolve before the next edge (S). FF2 samples the correct value in cycle 2. Q2 is correct in cycle 3 (except once every MTBF years).

Scenarios in the 2 FF synchronization circuit

Scenario d: FF1 goes metastable. Q1 resolves to the incorrect value (logic low). What is the difference between cases b and d? (Missing the value versus resolving metastability wrongly.) However, on cycle 2 the correct logic value is registered. FF2 samples the correct value from Q1 a cycle later. Q2 is correct in cycle 3.

Scenarios in the 2 FF synchronization circuit

Scenario e: FF1 goes metastable and its output goes high. FF1 then glitches and resolves metastability at 0. FF1 samples the correct value comfortably in cycle 2. Q2 is correct in cycle 3.

Scenarios in the 2 FF synchronization circuit

Scenario f: FF1 goes metastable and its output shoots high. Q1 resolves to the correct value (maintaining high). FF2 samples the correct value in cycle 2. Q2 is correct in cycle 3.

Scenarios

Data can be: missed; caught; or metastably sampled, in which case the output resolves to the correct value, resolves to the wrong value, or shoots to either value and then resolves to either value.

The only common factor between all scenarios is that one cycle later the second flip flop CERTAINLY (or at least within an MTBF) samples correctly
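A cycle-level Python sketch makes the common factor visible. It is a simplified behavioural model, not a circuit simulation. It assumes that D1 rises just before edge 1 and is then held, that FF1's marginal sample at edge 1 resolves to either 0 or 1 before edge 2, and it ignores the rare MTBF failure where Q1 is still metastable at edge 2. The function name `two_ff_synchronizer` is hypothetical.

```python
def two_ff_synchronizer(ff1_outcome_at_edge1, n_edges=3):
    """Return (edge, Q1, Q2) after each clock edge.

    ff1_outcome_at_edge1 is the value FF1 settles to after the marginal
    sample at edge 1: 1 if the new data is caught or resolves correctly,
    0 if it is missed or resolves wrongly.
    """
    d1 = 1          # new data value, held stable from edge 1 onwards
    q1, q2 = 0, 0   # both flip-flops start at the old value
    history = []
    for edge in range(1, n_edges + 1):
        next_q2 = q1  # FF2 samples the value Q1 held before this edge
        next_q1 = ff1_outcome_at_edge1 if edge == 1 else d1
        q1, q2 = next_q1, next_q2
        history.append((edge, q1, q2))
    return history


for outcome in (1, 0):
    print(f"FF1 settles to {outcome} at edge 1:", two_ff_synchronizer(outcome))
```

In this model Q2 carries the new value one edge earlier when FF1 resolves correctly than when it misses or resolves wrongly, but in both cases it is correct after the third edge. This is the one cycle uncertainty of the synchronizer.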

Turning the synchronization circuit into a synchronizer

If data is to be sampled correctly, it must be maintained on the input for up to two cycles (sometimes one!). How do we know how long to maintain it?

Turning the synchronization circuit into a synchronizer

The sender should put its data on a bus and raise the request (req) control signal. The receiver sees the request and reads the data, raising acknowledge (ack). The sender can now change the data and re-raise req.

Turning the synchronization circuit into a synchronizer

But req is raised at the transmitter clock, and ack is raised at the receiver clock! They could become metastable. We synchronize the two controls through 2 FF synchronizers. The worst case is that a control is read a cycle late. No problem! Conservative!

Turning the synchronization circuit into a synchronizer

Note that data is NEVER passed through synchronizers. FF synchronizers are ONLY used with control signals. Data is on busses. Synchronizers have a one cycle uncertainty on when they manage to synchronize.

Turning the synchronization circuit into a synchronizer

If a bus were passed through synchronizers, it is certain that at some point some bits of the bus would synchronize in the first cycle and others in the second cycle. One bit wrong on the whole bus is a failure of the whole data bus.

Overhead of synchronization

It takes 2 cycles (always consider the worst case) to synchronize req, 2 cycles to synchronize ack, and 1 cycle at each side (at least) to read data. Now if another transfer requires a similar change in controls, then req and ack must be lowered to get ready for the next cycle.

The transmitter reads ack high (2 cycles) and lowers req (1 cycle). The receiver sees req going low (2 cycles) and lowers ack (1 cycle). The overhead is 2+2+1+1+2+2+1+1 = 12 cycles to transmit one word!!
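The cycle budget can be tallied explicitly. The sketch below lists the steps in the order given above with their worst-case costs. The step labels are paraphrases written for this example, and the throughput line assumes, purely for illustration, a 1 GHz clock on both sides.

```python
# Worst-case cycle cost of each step of one req/ack handshake transfer.
HANDSHAKE_STEPS = [
    ("synchronize req", 2),
    ("synchronize ack", 2),
    ("read data, one side", 1),
    ("read data, other side", 1),
    ("transmitter reads ack high", 2),
    ("transmitter lowers req", 1),
    ("receiver sees req going low", 2),
    ("receiver lowers ack", 1),
]

total_cycles = sum(cycles for _, cycles in HANDSHAKE_STEPS)

for step, cycles in HANDSHAKE_STEPS:
    print(f"{step:<30} {cycles} cycle(s)")
print(f"{'total per word':<30} {total_cycles} cycles")

# Illustrative assumption: both sides clocked at 1 GHz.
ASSUMED_CLOCK_HZ = 1e9
words_per_second = ASSUMED_CLOCK_HZ / total_cycles
print(f"At most {words_per_second:.3g} words per second at the assumed clock")
```

Under that assumption fewer than one word in ten cycles gets across, which is why bulk transfers call for a different scheme.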

Asynchronous FIFO for high payload transfer

In case we have a lot of data words to transfer in bulk, the best approach is the simplest. We will use a two port, two clock asynchronous FIFO. FIFO = First In First Out. It is simply a shift register. However, it has indicators for EMPTY and FULL. It is two port because one port writes data and the other reads. If the two ports are controlled by different clocks, it is asynchronous.

Two port FIFO

Simply a two port RAM. The write port has a write pointer (last written address). The read port has a read pointer (last read address). Reads and writes can only be sequential.

Empty and Full, what do they mean?

Empty indicates there is no more data to read. Empty is significant to the receiver. Empty tells the receiver to stop reading until Empty goes low.

Full indicates we have run out of space to write. Full is significant to the transmitter. Full tells the transmitter to stop writing until Full goes low.

Empty and Full, how to calculate?

Empty happens when the read and write pointers are equal. But Full also occurs when the read and write pointers are equal!!

INSERT FIGURE OF QUEUE BEING READ AND WRITTEN

Empty and Full, how to calculate?

However, Empty occurs when we have JUST read and the read and write pointers are equal. Full occurs when we have JUST written and the read and write pointers are equal. So if read is incremented and we get equal pointers: EMPTY.

If we write and we get equal pointers: FULL.

INSERT FIGURE OF QUEUE BEING READ AND WRITTEN
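The rule "equal pointers after a read means Empty, equal pointers after a write means Full" can be tried out in a single-clock software model. The class below is a sketch only: it ignores the two clock domains, and it uses next-address pointers that wrap around the memory depth, which is an assumption of this example and not the last-address convention mentioned earlier. The name `PointerFifo` is hypothetical.

```python
class PointerFifo:
    """Circular-buffer FIFO whose flags follow the last operation performed."""

    def __init__(self, depth):
        self.depth = depth
        self.mem = [None] * depth
        self.rd = 0          # next address to read
        self.wr = 0          # next address to write
        self.empty = True
        self.full = False

    def write(self, word):
        """Store a word; return False if the FIFO is full."""
        if self.full:
            return False
        self.mem[self.wr] = word
        self.wr = (self.wr + 1) % self.depth
        self.empty = False
        # Pointers equal right after a WRITE: the writer caught up, so FULL.
        self.full = self.wr == self.rd
        return True

    def read(self):
        """Return the oldest word, or None if the FIFO is empty."""
        if self.empty:
            return None
        word = self.mem[self.rd]
        self.rd = (self.rd + 1) % self.depth
        self.full = False
        # Pointers equal right after a READ: the reader caught up, so EMPTY.
        self.empty = self.rd == self.wr
        return word


fifo = PointerFifo(depth=4)
for word in range(4):
    fifo.write(word)
print("after 4 writes: full =", fifo.full, "empty =", fifo.empty)
print("drained:", [fifo.read() for _ in range(4)])
print("after 4 reads: full =", fifo.full, "empty =", fifo.empty)
```

In both end states the two pointers are equal; only the operation that made them equal tells Full from Empty.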

Who increments the pointers?

Read is incremented by the receiver whenever it reads a new word. Write is incremented by the transmitter whenever it writes a new word. Full and Empty = NOT(Read-Write), i.e. both are raised when the pointers are equal. Thus calculating both Full and Empty requires reading pointers from two clock domains!

Synchronizing pointers

The two pointers must be passed through synchronization FF’s before going to the other side for comparison

Read pointer is generated by the receiver. Write pointer is generated by the transmitter.

Going EMPTY

The receiver is reading, but the transmitter is slower or has stopped writing

The receiver increments the read pointer. It reads the write pointer through synch. It raises the Empty flag and stops reading.

Going EMPTY – Synch one cycle late

What happens if synch is one cycle late? The write pointer may have been incremented, but we miss the increment until one cycle later. This means we raise Empty when the FIFO is not empty. This is safe: we unnecessarily stop reading for one cycle, no problem. There is no scenario where we fail to raise Empty when the FIFO is empty.

Going FULL

The transmitter is writing, but the receiver is slower or has stopped reading

The transmitter increments the write pointer. It reads the read pointer through synch. It raises the Full flag and stops writing.

Going FULL – Synch one cycle late

What happens if synch is one cycle late? The read pointer may have been incremented, but we miss the increment until one cycle later. This means we raise Full when the receiver has already cleared one address. This is safe: we unnecessarily stop writing for one cycle, no problem. There is no scenario where we fail to raise Full when the FIFO is full.
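The same argument covers both flags: a late synchronizer can only make each side pessimistic. The sketch below shows this with plain word counts. For simplicity it assumes the pointers are unwrapped counters (total words written and total words read) instead of wrapping addresses, and the function names are hypothetical.

```python
def receiver_sees_empty(words_read, words_written_synced):
    """Empty as seen in the read domain, using a possibly stale write count."""
    return words_read == words_written_synced


def transmitter_sees_full(words_written, words_read_synced, depth):
    """Full as seen in the write domain, using a possibly stale read count."""
    return words_written - words_read_synced == depth


DEPTH = 4

# Receiver side: 5 words were really written, but the synchronized copy
# still shows 4. The receiver has read 4 and stops, although 1 word waits.
print("receiver sees empty:", receiver_sees_empty(4, 4))
print("words really available:", 5 - 4)

# Transmitter side: 5 words were really read, but the synchronized copy
# still shows 4. The transmitter has written 8 and stops, although one
# slot is free.
print("transmitter sees full:", transmitter_sees_full(8, 4, DEPTH))
print("slots really free:", DEPTH - (8 - 5))
```

A stale count from the other side is always less than or equal to the true one, so the receiver never reads a word that has not been written and the transmitter never overwrites a word that has not been read.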

Advantages of FIFO

Much lower latency than normal synchronizers. We have at most 2 cycles of being held up if the empty or full flags are asserted. Once reading or writing has resumed, it is synchronous with the port clock. In other words, there is no handshaking unless we hit the empty or full conditions; while the read and write pointers are far apart, each port behaves like a synchronous circuit.

We are cheating ... a little

We previously said we shouldn’t use synchronizers with data words. Pointers are not single bit controls; they are data busses. The pointer counters are Gray encoded; they do not simply increment. With Gray encoding only one bit changes every cycle. Even if the synchronizer fails to synchronize one bit, it is highly unlikely that the failed bit is the bit that has changed. Usually works well.
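The single-bit-change property is easy to check in Python. The sketch uses the standard binary-reflected Gray code, g = n XOR (n >> 1). The 4-bit pointer width is an arbitrary choice for this example.

```python
def bin_to_gray(n):
    """Convert a binary count to its binary-reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert a Gray code back to the binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


WIDTH = 4                 # assumed pointer width in bits
COUNT = 1 << WIDTH

for value in range(COUNT):
    nxt = (value + 1) % COUNT          # pointer increments and wraps around
    changed = bin_to_gray(value) ^ bin_to_gray(nxt)
    print(f"{value:2d} -> {bin_to_gray(value):0{WIDTH}b}  "
          f"(bits changing on increment: {bin(changed).count('1')})")
```

Every increment, including the wrap from the last count back to zero, flips exactly one bit. A binary counter, by contrast, flips all four bits when it goes from 7 to 8, and each of those bits could be captured in a different cycle.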