Connecting High Level Models and RTL: An Ongoing Battle
Jesse Bingham, Intel
Feb 25, 2009
Big Picture
[Figure: design abstraction levels (Architecture, RTL, Netlists, Layout/Backend) annotated with the verification steps FPV, FEV, CDC, and TOV]
Diagram unapologetically stolen from Erik
This red arrow is the problem du jour
Formal Verification (ideal case): full coverage of design space
Simulation: spot coverage of design space
Motivation for Formal Verification
Formal Verification (real life): full coverage in some areas
Also stolen from Erik
Another Dimension…
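[Figure: verification techniques placed along two axes, state/behavior coverage and property coverage. Techniques shown: FEV; Arithmetic FV @ Intel; traditional simulation-based testing; bounded model checking; protocol model checking; type-checking; formal specification; theorem proving. Two points are marked “Today’s Topic”: one labeled (formal), one labeled (checker).]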
Overview
• Protocols are naturally & succinctly specified by high level models (HLM)
  – In a sense, all RTL safety properties are captured by the HLM
• Actual HW design (RTL) is hand-written by engineers
• How do we establish that RTL adheres to its HLM?
  – What does adherence even mean mathematically?
• Two approaches
  – Checker: HDL code that “watches” the design during simulation, raises alarms if it detects non-adherence
    • Most of this talk is about checkers
  – Formal Proof: prove that the checker can never ever ring the alarm
    • having the checker is obviously a prerequisite for formal proof
• Notoriously hard problem in FV
  – but getting more and more important in HW design
HW Protocols
• Distributed components exchanging messages
• Control oriented
• Cannot be specified by input/output relations
• State is king
• Typically message latency insensitive (though message ordering often matters)
• Naturally specified at high level using guarded command languages (Murphi, TLA, Unity, etc)
  – we’ll call this the high level model (HLM)
  – we use Murphi, but this work is independent of the particular modeling language
HLM: Guarded Commands[Dijkstra 1975]
• Guard: predicate on states
• Command: function mapping states to states
• Guarded Command (GC): a guard & a command
  – Command is only allowed to fire if guard is true
• Called rules or rulesets in Murphi…

Rule "go to park"
  NOT raining
==>
  location := nearest_park();
end

Ruleset food : FOOD "have picnic"
  hungry AND NOT raining
==>
  location := nearest_park();
  eat(food);
end

[Figure: state graph of an HLM behavior. Legend: initial state; enabled GC fires]
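The guard/command split can be made concrete with a small executable model. The following is a minimal Python sketch, not Murphi: the `GuardedCommand` class, the dictionary state, the food values, and the effect of `eat(food)` (clearing `hungry`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass(frozen=True)
class GuardedCommand:
    name: str
    guard: Callable[[State], bool]     # predicate on states
    command: Callable[[State], State]  # maps a state to a new state


def enabled(gcs: List[GuardedCommand], state: State) -> List[GuardedCommand]:
    """Return the GCs whose guard holds in `state`."""
    return [gc for gc in gcs if gc.guard(state)]


def fire(gc: GuardedCommand, state: State) -> State:
    """Fire `gc` from `state`; only allowed when the guard is true."""
    if not gc.guard(state):
        raise ValueError(f"{gc.name} is not enabled")
    return gc.command(state)


go_to_park = GuardedCommand(
    "go to park",
    guard=lambda s: not s["raining"],
    command=lambda s: {**s, "location": "park"},
)


def have_picnic(food: str) -> GuardedCommand:
    # A ruleset is a family of GCs, one per parameter value.
    # Assumption: eat(food) clears `hungry` and records what was eaten.
    return GuardedCommand(
        f"have picnic ({food})",
        guard=lambda s: s["hungry"] and not s["raining"],
        command=lambda s: {**s, "location": "park", "hungry": False, "eaten": food},
    )


FOODS = ["sandwich", "apple"]  # hypothetical values of the FOOD type
gcs = [go_to_park] + [have_picnic(f) for f in FOODS]
init = {"raining": False, "hungry": True, "location": "home", "eaten": None}
```

When several GCs are enabled, as in `init` here, the model may fire any one of them; that choice is where the nondeterminism of an HLM comes from.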
HLM Behaviors & Properties
• State invariants: all reachable states are “okay”
  – Cache always has at most one entry for each address
• More general safety properties
  – Cache returns most recently written data to a read request
• Liveness (typically assuming fairness)
  – If you send a read request, cache will eventually return data
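A state invariant such as "at most one entry for each address" can be checked on a small model by enumerating reachable states. Below is a minimal Python sketch under stated assumptions: the two-entry cache, the three transition kinds (fill, store hit, evict), and the `check_duplicates` switch are a toy invented for illustration, not the HLM from the talk.

```python
from collections import deque

N_ENTRIES, ADDRS = 2, (0, 1)
# A state is a tuple of (entry_state, addr) pairs, one per cache entry.
INIT = (("Invalid", 0),) * N_ENTRIES


def successors(state, check_duplicates=True):
    """Yield every state reachable by firing one toy GC."""
    for i, (st, addr) in enumerate(state):
        if st == "Invalid":
            for a in ADDRS:
                cached = any(s != "Invalid" and ad == a for s, ad in state)
                # Guard of the fill GC: address must not be cached already.
                if not (check_duplicates and cached):
                    yield state[:i] + (("Clean", a),) + state[i + 1:]
        else:
            yield state[:i] + (("Dirty", addr),) + state[i + 1:]    # store hit
            yield state[:i] + (("Invalid", addr),) + state[i + 1:]  # evict


def invariant(state):
    """At most one valid entry per address."""
    valid = [ad for st, ad in state if st != "Invalid"]
    return len(valid) == len(set(valid))


def check_invariant(init, succ, inv):
    """Breadth-first search; return a violating state, or None if inv holds."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not inv(s):
            return s
        for t in succ(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None
```

Dropping the duplicate check from the fill guard (`check_duplicates=False`) makes the search return a counterexample state, which is the kind of bug an invariant is meant to catch. Liveness properties need more machinery than this reachability loop.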
Register Transfer Level (RTL)
• Clock/state accurate (or at least close)
• Pipelines
• Schedulers
• Special logic
  – Design-for-test
  – Clock gating
  – Reset
• Written in a hardware description language like System Verilog or VHDL (we use SV)
• Can be formalized as finite state automata or Kripke structures; we won’t do that today
FV methods and CAD tools below RTL have advanced to the point where one can (if one chooses to) safely think of RTL as the real silicon
Refinement Map
• A function RM taking RTL states to HLM states is called a refinement map
  – Intuitively, RM(r) is the HLM state that summarizes RTL state r
  – Many-to-one in general
  – Human writes this in our methodology
• Generalization: RM may depend on RTL signals at fixed offsets from the current cycle
  – Useful for dealing with RTL pipelines
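To illustrate why a refinement map is many-to-one in general, here is a minimal Python sketch. The `RtlState` fields `clock_gated` and `lru_ptr` are hypothetical implementation details with no HLM counterpart; the map simply drops them.

```python
from typing import NamedTuple, Tuple


class RtlState(NamedTuple):
    tag_state: Tuple[str, ...]  # per-entry Invalid/Dirty/Clean
    tag_addr: Tuple[int, ...]
    data: Tuple[int, ...]
    clock_gated: bool           # hypothetical detail, invisible to the HLM
    lru_ptr: int                # hypothetical detail, invisible to the HLM


class HlmEntry(NamedTuple):
    state: str
    addr: int
    data: int


def rm(r: RtlState) -> Tuple[HlmEntry, ...]:
    """Refinement map: summarize an RTL state as an HLM cache array."""
    return tuple(
        HlmEntry(s, a, d) for s, a, d in zip(r.tag_state, r.tag_addr, r.data)
    )


r1 = RtlState(("Dirty", "Invalid"), (5, 0), (42, 0), clock_gated=False, lru_ptr=0)
r2 = r1._replace(clock_gated=True, lru_ptr=1)
```

Here `r1` and `r2` are distinct RTL states that map to the same HLM state.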
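[Figure: an RTL behavior, starting from the reset state and advancing one RTL clock cycle per step, shown alongside an HLM behavior, starting from the initial state and advancing each time a guarded command fires; the refinement map connects RTL states to HLM states]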
Behavioral Refinement
Each RTL clock cycle corresponds to zero or more guarded commands firing
[Figure: HLM and RTL behaviors aligned over one RTL clock cycle]
Checking Refinement
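[Figure: over one RTL clock cycle the RTL moves from state r to state r′. GC_prediction(r) = (gc1, gc2, …, gck) names the guarded commands to fire. The check: RM(r′) =? Next(RM(r)), the HLM state obtained by firing gc1, …, gck in order from RM(r)]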
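The per-cycle check can be sketched in a few lines of Python. This is a toy under stated assumptions, not the checker from the talk: the RTL is a counter with two registered enables, the HLM is a counter with a single "increment" GC, and the names `rtl_next`, `rm`, `gc_prediction`, and `check_cycle` are invented for illustration.

```python
# HLM: a counter with one guarded command (guard, command).
INC = (lambda h: True, lambda h: h + 1)


def rtl_next(r, en_a, en_b):
    """One clock cycle of the toy RTL: add both registered enables to count."""
    return {"count": r["count"] + r["en_a"] + r["en_b"], "en_a": en_a, "en_b": en_b}


def rm(r):
    """Refinement map: the HLM state is just the count."""
    return r["count"]


def gc_prediction(r):
    """Predict which GCs this cycle corresponds to: zero, one, or two INCs."""
    return [INC] * (r["en_a"] + r["en_b"])


def check_cycle(r, r_next, rm, gc_prediction):
    """True iff firing the predicted GCs from RM(r) lands exactly on RM(r_next)."""
    h = rm(r)
    for guard, command in gc_prediction(r):
        if not guard(h):
            return False  # a predicted GC is not enabled: raise the alarm
        h = command(h)
    return h == rm(r_next)
```

In simulation such a check would run once per clock cycle on consecutive RTL states; a `False` result corresponds to the checker firing.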
Cache Controller HLM
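[Figure: CacheArray, whose entries each hold Addr, Data, and State ∈ {Invalid, Dirty, Clean}, with message channels Cpu2Cache, Cache2Cpu, Cache2Mem, and Mem2Cache. A callout on part of the diagram reads: “Let’s pretend these don’t exist”]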
Cache Controller HLM GC Recv_Store
Ruleset i : CacheIndex "Recv Store"
  Cpu2Cache.opcode = Store &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end
Cache Controller HLM GC Evict
Ruleset i : CacheIndex "Evict"
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) begin
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  end;
  CacheArray[i].State := Invalid;
end
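Cache Controller RTL
[Figure: two-stage pipeline. Blocks shown: Cpu2Cache input, Cache State & Addr Array, Hit? check, Eviction Logic, Cache Data Array, Cache2Mem output; the pipeline is divided into pipe stage 1 and pipe stage 2]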
Store with Eviction
[Figure: the same pipeline processing Store(A0,D0) while the target entry holds Dirty,A1 with data D1. Store(A0,D0) moves through pipe stages 1 and 2, WriteBack(A1,D1) is sent on Cache2Mem, and the entry ends up as Dirty,A0 with data D0]
Store with Eviction Revisited
[Figure: the Store(A0,D0) with WriteBack(A1,D1) sequence again, through pipe stages 1 and 2]
When do the HLM GCs “happen” in the RTL?
Store
Evict
Key Point #1
Pipelining causes GCs that are atomic in the HLM to be non-atomic in the RTL.
This non-atomicity must be handled by the refinement map.
Key Point #2
In the HLM, GCs are interleaved, while the RTL can exhibit true GC concurrency.
This must be resolved by the GC prediction.
Cache Controller Refinement Map (conceptual)
function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  HLM.CacheArray[].State = RTL.AddrArray[].State;
  HLM.CacheArray[].Addr = RTL.AddrArray[].Addr;
  HLM.CacheArray[].Data = RTL.DataArray[]@+1;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu@+1;
  return(HLM);
end;
<signal>@k denotes the value <signal> will have k clock cycles in the future (k can be negative too, to refer to the past)
Cache Controller Refinement Map (with only non-positive temporal offsets)
function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  HLM.CacheArray[].State = RTL.AddrArray[].State@-1;
  HLM.CacheArray[].Addr = RTL.AddrArray[].Addr@-1;
  HLM.CacheArray[].Data = RTL.DataArray[];
  HLM.Cpu2Cache = RTL.Cpu2Cache@-2;
  HLM.Cache2Cpu = RTL.Cache2Cpu;
  return(HLM);
end;
<signal>@-k can be constructed using System Verilog’s $past function
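A refinement map that only looks into the past can be evaluated during simulation by remembering a few earlier samples of each signal. The following Python sketch models that idea with a small history buffer; the `Past` class and the three-signal `rm` function are assumptions for illustration, not the SV code used in the talk.

```python
from collections import deque


class Past:
    """Remember the last depth+1 samples of one signal."""

    def __init__(self, depth, reset_value=None):
        self._buf = deque([reset_value] * (depth + 1), maxlen=depth + 1)

    def sample(self, value):
        """Call once per clock cycle with the signal's current value."""
        self._buf.appendleft(value)

    def past(self, k=0):
        """Value the signal had k cycles ago (k=0 is the latest sample)."""
        return self._buf[k]


def rm(tag_state, data, cpu2cache):
    """Toy refinement map using only non-positive offsets, as in the slide."""
    return {
        "State": tag_state.past(1),      # AddrArray state @-1
        "Data": data.past(0),            # DataArray, current cycle
        "Cpu2Cache": cpu2cache.past(2),  # Cpu2Cache @-2
    }


sig = Past(depth=2)
for v in (10, 11, 12):
    sig.sample(v)
```

After the three samples above, `sig.past(0)`, `sig.past(1)`, and `sig.past(2)` are 12, 11, and 10. Asking for an offset deeper than `depth` raises `IndexError`, which mirrors the need to fix the window size up front.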
Store with Eviction Re-Revisited
[Figure: the Store(A0,D0) with WriteBack(A1,D1) sequence in the RTL, aligned with the HLM behavior, in which the GCs Evict and RecvStore fire]
Cache Controller GC Prediction
function HLM_STATE Next_HLM_STATE(HLM_STATE hs);
  if (RTL.Cpu2Cache.Valid@-2) begin
    i = get_target_cache_index()@-2;
    if (will_need_eviction()@-2)
      hs = Evict(hs,i);
    if (RTL.Cpu2Cache.Op@-2 == STORE)
      hs = Recv_Store(hs,i);
    else if (RTL.Cpu2Cache.Op@-2 == LOAD)
      hs = Recv_Load(hs,i);
  end;
  ... // figure out when to fire Send_Memory_Request
      // and Recv_Memory_Response
end;
The Cpu2Cache block can result in 0, 1, or 2 GCs fired
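The decision structure of the prediction can be sketched in Python. This covers only the Cpu2Cache branch and returns the predicted GC names rather than applying them; the `Request` record and its fields `index` and `needs_eviction` are hypothetical stand-ins for the delayed RTL signals and for `get_target_cache_index()` and `will_need_eviction()`.

```python
from typing import List, NamedTuple, Tuple


class Request(NamedTuple):
    """What the RTL presented on Cpu2Cache two cycles ago (hypothetical record)."""
    valid: bool
    op: str               # "STORE" or "LOAD"
    index: int            # stand-in for get_target_cache_index()
    needs_eviction: bool  # stand-in for will_need_eviction()


def predict_gcs(req: Request) -> List[Tuple[str, int]]:
    """Return the (GC name, cache index) pairs to fire, in order."""
    gcs = []
    if req.valid:
        if req.needs_eviction:
            gcs.append(("Evict", req.index))  # the eviction is ordered first
        if req.op == "STORE":
            gcs.append(("Recv_Store", req.index))
        elif req.op == "LOAD":
            gcs.append(("Recv_Load", req.index))
    return gcs
```

The order of the returned list is what turns concurrent RTL activity within one cycle into the interleaved sequence the HLM expects.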
Back-to-back Stores with Eviction
[Figure: Store(A0,D0) followed by Store(A2,D2) in the RTL pipeline; WriteBack(A1,D1) is sent to memory, and the arrays end up holding Dirty,A0 with D0 and Dirty,A2 with D2. The aligned HLM behavior is labeled Evict, RecvStore(A0), RecvStore(A1)]
FYI, we do everything in System Verilog
• Actual design under verification
  – written by HW designers
• Test bench
  – written by HW validators
• HLM
  – written in Murphi by FV team in consultation with Architects
  – compiled into SV by a tool we wrote
• Refinement Map
  – hand-written in SV by FV team
• GC Prediction
  – hand-written in SV by FV team
Formal Proof of Refinement version 1.0: looks like FEV
[Figure: HLM and RTL behaviors over one RTL clock cycle, starting from a totally symbolic RTL state r (represents all possible RTL states) with successor r′]
• Proof obligation: RM(r′) = Next(RM(r)) ?
• Can be decided by a SAT- or BDD-based solver engine
• This will most certainly fail for some unreachable RTL states! Rats!
• Also might blow up
Formal Proof of Refinement version 2.0: write an invariant
[Figure: HLM and RTL behaviors over one RTL clock cycle, starting from a totally symbolic RTL state r (represents all possible RTL states) with successor r′]
• Proof obligation: Inv(r) ⇒ RM(r′) = Next(RM(r))
• Can be decided by a SAT- or BDD-based solver engine
• But concocting Inv is difficult, not to mention you also need to prove Inv is invariant
• Also might blow up
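The difference between version 1.0 and version 2.0 shows up even on a tiny example. The Python sketch below replaces the SAT or BDD engine with brute-force enumeration over a 2-bit toy RTL; the mod-3 counter, its refinement map, and the invariant are all assumptions chosen so that one bit pattern (3) is unreachable.

```python
RTL_STATES = range(4)  # every 2-bit pattern, reachable or not
RESET = 0


def rtl_next(r):
    """Toy RTL: counts 0, 1, 2, 0, ...; pattern 3 is never reached from reset."""
    return 0 if r == 2 else (r + 1) & 3


def rm(r):
    """Refinement map into the HLM counter; the junk pattern 3 maps to 0."""
    return r if r < 3 else 0


def hlm_next(h):
    """HLM: a mod-3 counter whose single GC fires every cycle."""
    return (h + 1) % 3


def inv(r):
    """Candidate invariant: the unreachable pattern never occurs."""
    return r != 3


def refinement_failures(assume=lambda r: True):
    """RTL states satisfying `assume` where RM(r') != Next(RM(r))."""
    return [r for r in RTL_STATES if assume(r) and rm(rtl_next(r)) != hlm_next(rm(r))]


def is_inductive(inv):
    """Inv holds at reset and is preserved by every RTL step."""
    return inv(RESET) and all(inv(rtl_next(r)) for r in RTL_STATES if inv(r))
```

Without an assumption the check fails only at the unreachable state 3; assuming `inv` removes that failure, and `is_inductive(inv)` discharges the extra obligation that `inv` really is an invariant. On real designs finding such an `inv` is the hard part.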
Formal Proof of Refinement version 3.0: Model Checking
[Figure: RTL & checker composed with an HLM of the environment]
• start from initial state of env-HLM & RTL
• compute forward reachability via symbolic model checking
• verify that checker never fires
Will likely blow up; probably need to restrict behaviors, e.g. use 4 addresses rather than 2^32
Open Problems
• Refinement map is part of spec… or is it?
• Formal proof: best approach?
  – I spent 1.5 years banging my head on the formal side; the fact that I’ve retreated to checkers says something
• Tool issues: pain in the butt
  – Generated System Verilog has hit 4 bugs so far in expensive third-party simulator
• HLM/RTL discrepancies: can we weaken our notion of refinement to allow for reasonable mismatches?
  – E.g. HLM transmits message instantaneously, while RTL scheduling causes arbitrary delay before transmission
Partial Bibliography
• Using formal HLM as a checker:
  – Linking simulation with Formal Verification at a Higher Level, Tasiran, Batson, & Yu, 2004
  – Runtime Refinement Checking of Concurrent Data Structures, Tasiran & Qadeer, 2004
• Original Murphi paper:
  – Protocol Verification as a Hardware Design Aid, Dill, Drexler, Hu, & Yang, 1992
• Formal verification of refinement maps for hardware
  – Automatic Verification of Pipelined Microprocessor Control, Burch & Dill, 1994
  – Protocol Verification by Aggregation of Distributed Transactions, Park & Dill, 1996
  – A Methodology for Hardware Verification using Compositional Model Checking, McMillan, 2000
  – The Formal Design of 1M-gate ASICs, Eiriksson, 2000
• Theory involving refinement in the face of fairness
  – On the Existence of Refinement Maps, Abadi & Lamport, 1991
• Commercial Tools
  – BlueSpec (BlueSpec Inc.)
  – Pico (Synfora)
  – SLEC (Calypto)
Cache Controller HLM (typedefs & var decls in Murphi)
type ---- Type declarations ----
<snip>
  CACHE_ENTRY : record
    State : enum {Invalid, Dirty, Clean};
    Addr : ADDR;
    Data : DATA;
  end;
<snip>
var ---- State variables ----
  CacheArray : array [0..CACHE_SIZE-1] of CACHE_ENTRY;
  Cpu2Cache : CPU2CACHE_MSG;
  Cache2Cpu : CACHE2CPU_MSG;
  Mem2Cache : MEM2CACHE_MSG;
  Cache2Mem : CACHE2MEM_MSG;
Guarded Commands Formalized
• State space S = type consistent assignments to variables
• Init: subset of state space specifying initial states
• A guarded command (GC) is a pair (g,c), where
  – g : S → {True,False} is called the guard; GC is enabled in state s if g(s) = True
  – c : S → S is called the command; GC fires from s to c(s)
• Semantics: HLM can transition from s to s′ iff there exists a GC that
  – is enabled in s
  – fires from s to s′
• Nondeterminism arises when multiple GCs are enabled
• In practice GCs are often parameterized
• We assume that the stuttering GC (λs.True, λs.s) is implicit
Refinement Formalized
• Let H and R be the respective state spaces of HLM and RTL
• A function RM: R → H is called a refinement map
  – Intuitively, RM(r) is the HLM state that summarizes RTL state r
  – Many-to-one in general
  – Human writes this in our methodology
• We generalize this so that RM: R^w → H, for some fixed w
  – Hence RM maps a fixed length sequence of RTL states to H
  – Useful for dealing with RTL pipelines
Cache Controller HLM GCs (1/2)
Ruleset i : CacheIndex "Recv Store"
  Cpu2Cache.opcode = Store &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex "Recv Load"
  Cpu2Cache.opcode = Load &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  Cache2Cpu.Data := CacheArray[i].Data;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex "Evict"
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) begin
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  end;
  CacheArray[i].State := Invalid;
end
Cache Controller HLM GCs (2/2)
Ruleset i : CacheIndex ; a : Addr "Send Memory Request"
  CacheArray[i].State = Invalid
==>
  Cache2Mem.opcode := Get;
  Cache2Mem.Index := i;
  Cache2Mem.Addr := a;
end

Ruleset i : CacheIndex "Recv Memory Response"
  Mem2Cache.opcode = Response
==>
  CacheArray[Mem2Cache.Index].Data := Mem2Cache.Data;
  CacheArray[Mem2Cache.Index].Addr := Mem2Cache.Addr;
  CacheArray[Mem2Cache.Index].State := Clean;
  Absorb(Mem2Cache);
end
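Load Miss (moot)
[Figure: the cache controller pipeline (Cpu2Cache, Cache2Cpu, Cache2Mem, Mem2Cache, Cache State & Addr Array, Eviction Logic, Hit? check, Cache Data Array, pipe stages 1 and 2) processing Load(A0): Get(A0) is sent to memory, Response(A0,D0) comes back, the entry becomes Clean,A0 with data D0, and Response(D0) is returned]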
Cache Controller Refinement Map (conceptual)
function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  for (int i=0; i < CACHE_SIZE; i++) begin
    HLM.CacheArray[i].State = RTL.AddrArray[i].State;
    HLM.CacheArray[i].Addr = RTL.AddrArray[i].Addr;
    HLM.CacheArray[i].Data = RTL.DataArray[i]@+1;
  end;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu;
  return(HLM);
end;
<signal>@k denotes the value <signal> will have k clock cycles in the future (k can be negative too)