connecting high level models and rtl: an ongoing battle jesse bingham intel feb 25 2009

Connecting High Level Models and RTL: an Ongoing Battle

Jesse Bingham

Intel

Feb 25 2009

Big Picture

[Diagram: design flow Architecture → RTL → Netlists → Layout/Backend, annotated with the verification techniques FPV, FEV, CDC, and TOV.]

Diagram unapologetically stolen from Erik

This red arrow is the problem du jour

Motivation for Formal Verification

Simulation: spot coverage of design space

Formal Verification (ideal case): full coverage of design space

Formal Verification (real life): full coverage in some areas

Also stolen from Erik

Another Dimension…

[Diagram: verification techniques plotted on two axes, state/behavior coverage and property coverage. Labels shown: FEV; Arithmetic FV @Intel; traditional simulation-based testing; bounded model checking; protocol model checking; type-checking; formal specification; theorem proving; today’s topic (formal); today’s topic (checker).]

Overview

• Protocols are naturally & succinctly specified by high level models (HLM)
– In a sense, all RTL safety properties are captured by the HLM
• Actual HW design (RTL) is hand-written by engineers
• How do we establish that RTL adheres to its HLM?
– What does adherence even mean mathematically?
• Two approaches
– Checker: HDL code that “watches” the design during simulation, raises alarms if it detects non-adherence
  • Most of this talk is about checkers
– Formal Proof: prove that the checker can never ever ring the alarm
  • having the checker is obviously a prerequisite for formal proof
• Notoriously hard problem in FV
– but getting more and more important in HW design

HW Protocols

• Distributed components exchanging messages
• Control oriented
• Cannot be specified by input/output relations
• State is king
• Typically message latency insensitive (though message ordering often matters)
• Naturally specified at high level using guarded command languages (Murphi, TLA, Unity, etc)
– we’ll call this the high level model (HLM)
– we use Murphi, but this work is independent of the particular modeling language

HLM: Guarded Commands [Dijkstra 1975]

• Guard: predicate on states
• Command: function mapping states to states
• Guarded Command (GC): a guard & a command
– Command is only allowed to fire if guard is true
• Called rules or rulesets in Murphi…

Rule “go to park”
  NOT raining
==>
  location := nearest_park();
end

Ruleset food : FOOD “have picnic”
  hungry AND NOT raining
==>
  location := nearest_park();
  eat(food);
end

…

[Diagram legend: initial state; enabled GC fires]
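The guard/command split can be mimicked in a few lines of Python. This is only a sketch of the idea, not Murphi semantics in full: the `GuardedCommand` class, the state variables, and the `FOODS` list are made up for illustration, `nearest_park()` is replaced by the constant `"park"`, and `eat(food)` is modeled as clearing `hungry` and recording the food.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]  # an HLM state: variable name -> value


@dataclass(frozen=True)
class GuardedCommand:
    name: str
    guard: Callable[[State], bool]     # predicate on states
    command: Callable[[State], State]  # function mapping states to states

    def enabled(self, s: State) -> bool:
        return self.guard(s)

    def fire(self, s: State) -> State:
        # A command is only allowed to fire if its guard is true.
        if not self.guard(s):
            raise ValueError(f"{self.name!r} is not enabled")
        return self.command(s)


def go_to_park() -> GuardedCommand:
    return GuardedCommand(
        "go to park",
        guard=lambda s: not s["raining"],
        command=lambda s: {**s, "location": "park"},  # stand-in for nearest_park()
    )


def have_picnic(food: str) -> GuardedCommand:
    # A ruleset yields one GC per value of its parameter.
    return GuardedCommand(
        f"have picnic ({food})",
        guard=lambda s: s["hungry"] and not s["raining"],
        command=lambda s: {**s, "location": "park", "hungry": False, "eaten": food},
    )


FOODS = ["sandwich", "apple"]  # hypothetical contents of the FOOD type
RULES: List[GuardedCommand] = [go_to_park()] + [have_picnic(f) for f in FOODS]


def enabled_rules(s: State) -> List[GuardedCommand]:
    return [gc for gc in RULES if gc.enabled(s)]


s0 = {"raining": False, "hungry": True, "location": "home", "eaten": None}
s1 = RULES[1].fire(s0)  # fire "have picnic (sandwich)"
```

When several GCs are enabled, as in `s0`, the model may fire any one of them; that choice is where the nondeterminism of the HLM comes from.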

HLM Behaviors & Properties

• State invariants: all reachable states are “okay”
– Cache always has at most one entry for each address
• More general safety properties
– Cache returns most recently written data to a read request
• Liveness (typically assuming fairness)
– If you send a read request, cache will eventually return data
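A state invariant such as "at most one entry for each address" can be checked on a small HLM by enumerating reachable states, which is roughly what an explicit-state tool like Murphi does. The sketch below assumes a made-up two-entry, two-address model with hypothetical `Fill` and `Evict` GCs; it is not the cache HLM from this talk.

```python
from collections import deque

CACHE_SIZE, ADDRS = 2, (0, 1)
INVALID = None  # an entry is either INVALID or holds an address


def successors(state, check_duplicates=True):
    """Yield (gc_name, next_state) for every enabled GC instance."""
    for i in range(CACHE_SIZE):
        if state[i] is INVALID:
            for a in ADDRS:
                if check_duplicates and a in state:
                    continue  # guard: address is already cached elsewhere
                yield f"Fill[{i},{a}]", state[:i] + (a,) + state[i + 1:]
        else:
            yield f"Evict[{i}]", state[:i] + (INVALID,) + state[i + 1:]


def at_most_one_entry_per_address(state):
    held = [a for a in state if a is not INVALID]
    return len(held) == len(set(held))


def check_invariant(init, inv, check_duplicates=True):
    """Breadth-first reachability; returns (True, None) or (False, trace)."""
    parent = {init: None}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        if not inv(s):
            trace = []
            while parent[s] is not None:
                s, name = parent[s]
                trace.append(name)
            return False, trace[::-1]
        for name, t in successors(s, check_duplicates):
            if t not in parent:
                parent[t] = (s, name)
                queue.append(t)
    return True, None


init = (INVALID, INVALID)
ok_result = check_invariant(init, at_most_one_entry_per_address)
bug_result = check_invariant(init, at_most_one_entry_per_address,
                             check_duplicates=False)
```

Dropping the duplicate check from the `Fill` guard makes the search return a shortest sequence of GC firings that breaks the invariant. Liveness properties need more machinery (fairness, cycle detection) and are not covered by this sketch.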

Register Transfer Level (RTL)

• Clock/state accurate (or at least close)
• Pipelines
• Schedulers
• Special logic
– Design-for-test
– Clock gating
– Reset
• Written in hardware description language like System Verilog or VHDL (we use SV)
• Can be formalized as finite state automata or Kripke structures; we won’t do that today

FV methods and CAD tools below RTL have advanced to the point where one can (if they choose to) safely think of RTL as the real silicon

Refinement Map

• A function RM taking RTL states to HLM states is called a refinement map
– Intuitively, RM(r) is the HLM state that summarizes RTL state r
– Many-to-one in general
– Human writes this in our methodology
• Generalization: RM can depend on RTL signals at fixed offsets from the current cycle
– Useful for dealing with RTL pipelines

[Diagram: an HLM behavior, starting from the initial state with one guarded command firing per step, drawn above an RTL behavior, starting from the reset state with one RTL clock cycle per step; the refinement map links RTL states to HLM states.]

Behavioral Refinement

Each RTL clock cycle corresponds to zero or more guarded commands firing

[Diagram: HLM behavior drawn above RTL behavior; each step of the RTL behavior is one RTL clock cycle.]

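Checking Refinement

[Diagram: the RTL moves from state $r$ to $r'$ in one clock cycle; the refinement map gives the HLM states $\mathit{RM}(r)$ and $\mathit{RM}(r')$.]

$\mathit{GC\_prediction}(r) = (gc_1, gc_2, \ldots, gc_k)$

Check: $\mathit{RM}(r') \stackrel{?}{=} \mathit{Next}(\mathit{RM}(r))$, where $\mathit{Next}$ fires $gc_1, \ldots, gc_k$ in order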
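As a rough illustration of the per-cycle check, here is a minimal Python checker over a recorded RTL trace. The toy design is hypothetical: an RTL with a `valid` input and a `count` register, and an HLM whose state is just a counter with one GC `recv`. The real checker in this talk is SV code running alongside simulation; this sketch only mirrors its structure.

```python
from typing import Callable, List, Optional, Sequence


def check_refinement(trace: Sequence[dict],
                     rm: Callable[[dict], object],
                     gc_prediction: Callable[[dict], List[Callable]]) -> Optional[int]:
    """Return the first cycle whose step is not explained by the predicted
    GCs, or None if every step of the trace adheres to the HLM."""
    for cycle in range(len(trace) - 1):
        r, r_next = trace[cycle], trace[cycle + 1]
        expected = rm(r)
        for command in gc_prediction(r):  # fire predicted GCs in order
            expected = command(expected)
        if rm(r_next) != expected:
            return cycle                  # the checker raises its alarm here
    return None


# Hypothetical toy design: the HLM state is the number of items received.
def recv(h):
    return h + 1


def rm(r):
    return r["count"]


def gc_prediction(r):
    return [recv] if r["valid"] else []   # zero or one GC per clock cycle


good_trace = [
    {"valid": True, "count": 0},
    {"valid": False, "count": 1},
    {"valid": True, "count": 1},
    {"valid": False, "count": 2},
]
bad_trace = [
    {"valid": True, "count": 0},
    {"valid": True, "count": 1},
    {"valid": False, "count": 1},  # the second item was dropped
]
```

Note that this sketch compares states only; it does not also confirm that each predicted GC's guard holds before firing it, which a fuller checker would presumably do.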

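Running Example: Toy Cache Controller

[Diagram: the CPU talks to the Cache Controller, which talks to Main Memory.]

Cache Controller HLM

[Diagram: CacheArray, whose entries each hold Addr, Data, and State {Invalid, Dirty, Clean}, plus four message channels: Cpu2Cache, Cache2Cpu, Cache2Mem, Mem2Cache. An annotation on part of the diagram reads: “Let’s pretend these don’t exist”.]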

Cache Controller HLM GC Recv_Store

Ruleset i : CacheIndex "Recv Store"
  Cpu2Cache.opcode = Store &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end

Cache Controller HLM GC Evict

Ruleset i : CacheIndex "Evict"
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) then
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  endif;
  CacheArray[i].State := Invalid;
end

Cache Controller RTL

[Diagram: two-stage pipeline with input Cpu2Cache and output Cache2Mem. Blocks shown: Cache State & Addr Array, Hit?, Eviction Logic, Cache Data Array; pipe stage 1 and pipe stage 2.]

Store with Eviction

[Animation on the same pipeline: Store(A0,D0) arrives on Cpu2Cache and advances through pipe stages 1 and 2. The arrays initially show (Dirty, A1) with data D1 and finally (Dirty, A0) with data D0; WriteBack(A1,D1) leaves on Cache2Mem.]

Store with Eviction Revisited

[Animation of Store(A0,D0) through the pipeline, repeated: the arrays end at (Dirty, A0) with data D0 and WriteBack(A1,D1) is sent.]

When do the HLM GCs “happen” in the RTL?

[Labels on the diagram: Store, Evict]

Key Point #1

Pipelining causes GCs that are atomic in the HLM to be non-atomic in the RTL.

This non-atomicity must be handled by the refinement map.

Key Point #2

In the HLM, GCs are interleaved, while the RTL can exhibit true GC concurrency.

This must be resolved by the GC prediction.

Cache Controller Refinement Map (conceptual)

function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  HLM.CacheArray[].State = RTL.AddrArray[].State;
  HLM.CacheArray[].Addr = RTL.AddrArray[].Addr;
  HLM.CacheArray[].Data = RTL.DataArray[]@+1;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu@+1;
  return(HLM);
end;

<signal>@k denotes the value <signal> will have k clock cycles in the future (k can be negative too, to refer to the past)

Cache Controller Refinement Map (with only non-positive temporal offsets)

function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  HLM.CacheArray[].State = RTL.AddrArray[].State@-1;
  HLM.CacheArray[].Addr = RTL.AddrArray[].Addr@-1;
  HLM.CacheArray[].Data = RTL.DataArray[];
  HLM.Cpu2Cache = RTL.Cpu2Cache@-2;
  HLM.Cache2Cpu = RTL.Cache2Cpu@0;
  return(HLM);
end;

<signal>@-k can be constructed using System Verilog’s $past operator
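The move from the conceptual map to one with only non-positive offsets amounts to shifting every offset by the largest look-ahead, then reading delayed samples. The Python sketch below shows both steps; `make_nonpositive` and `SignalHistory` are hypothetical helper names, and returning `None` when there is not yet enough history is an assumption of this sketch rather than the behavior of `$past`.

```python
from collections import deque


def make_nonpositive(offsets):
    """Shift every offset by the largest future offset so none looks ahead."""
    shift = max(0, max(offsets.values()))
    return {signal: k - shift for signal, k in offsets.items()}


class SignalHistory:
    """Keeps the last depth+1 samples of all signals, one sample per cycle."""

    def __init__(self, depth):
        self._samples = deque(maxlen=depth + 1)

    def sample(self, signals):
        self._samples.append(dict(signals))

    def past(self, name, k):
        """Value of `name` k cycles ago (k >= 0); None if history is too short."""
        if k >= len(self._samples):
            return None
        return self._samples[-1 - k][name]


# Offsets used by the conceptual refinement map.
conceptual = {
    "AddrArray.State": 0,
    "AddrArray.Addr": 0,
    "DataArray": +1,
    "Cpu2Cache": -1,
    "Cache2Cpu": +1,
}
shifted = make_nonpositive(conceptual)

history = SignalHistory(depth=2)
for value in ("msg0", "msg1", "msg2"):
    history.sample({"Cpu2Cache": value})
two_cycles_ago = history.past("Cpu2Cache", 2)  # plays the role of Cpu2Cache@-2
```

The price of the shift is that the checker observes the design one cycle late, which is harmless in simulation.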

Store with Eviction Re-Revisited

[Animation of Store(A0,D0) through the pipeline, now with an HLM timeline drawn against the RTL timeline: the HLM fires Evict and then RecvStore.]

Cache Controller GC Prediction

function HLM_STATE Next_HLM_STATE(HLM_STATE hs);
  if (RTL.Cpu2Cache.Valid@-2) begin
    i = get_target_cache_index()@-2;
    if (will_need_eviction()@-2)
      hs = Evict(hs,i);
    if (RTL.Cpu2Cache.Op@-2 == STORE)
      hs = Recv_Store(hs,i);
    else if (RTL.Cpu2Cache.Op@-2 == LOAD)
      hs = Recv_Load(hs,i);
  end;
  ... // figure out when to fire Send_Memory_Request
      // and Recv_Memory_Response
  return(hs);
end;

Can result in 0, 1, or 2 GCs fired
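The control structure of the prediction function can be sketched in Python as follows. The sketch covers only the Cpu2Cache branch; `predict_gcs` and `next_hlm_state` are illustrative names, the request is assumed to have been sampled two cycles earlier already, and the GC bodies are stubs that just log their own firing.

```python
def predict_gcs(req, target_index, needs_eviction):
    """List the (gc_name, index) pairs predicted for this cycle.

    `req` stands for the Cpu2Cache message as seen two cycles ago, or None.
    """
    fired = []
    if req is not None and req["Valid"]:
        if needs_eviction:
            fired.append(("Evict", target_index))
        if req["Op"] == "STORE":
            fired.append(("Recv_Store", target_index))
        elif req["Op"] == "LOAD":
            fired.append(("Recv_Load", target_index))
    return fired


def next_hlm_state(hs, fired, commands):
    """Apply the predicted GCs to HLM state `hs`, in order."""
    for name, i in fired:
        hs = commands[name](hs, i)
    return hs


# Stub commands: the "HLM state" here is just a log of what fired.
commands = {
    name: (lambda hs, i, name=name: hs + (f"{name}[{i}]",))
    for name in ("Evict", "Recv_Store", "Recv_Load")
}

store = {"Valid": True, "Op": "STORE"}
none_fired = predict_gcs(None, 0, False)    # 0 GCs
one_fired = predict_gcs(store, 3, False)    # 1 GC
two_fired = predict_gcs(store, 3, True)     # 2 GCs
log = next_hlm_state((), two_fired, commands)
```

Fixing the order (Evict before the store or load) is how the prediction turns concurrent RTL activity into an interleaving the HLM can express.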

Back-to-back Stores with Eviction

[Animation: Store(A0,D0) followed by Store(A2,D2) through pipe stages 1 and 2. RTL labels: WriteBack(A1,D1); State & Addr Array entries (Dirty, A0) and (Dirty, A2); Data Array values D1, D0, D2. HLM labels: Evict, RecvStore(A0), RecvStore(A1).]

FYI, we do everything in System Verilog

• Actual design under verification
– written by HW designers
• Test bench
– written by HW validators
• HLM
– written in Murphi by FV team in consultation with Architects
– compiled into SV by a tool we wrote
• Refinement Map
– hand-written in SV by FV team
• GC Prediction
– hand-written in SV by FV team

Formal Proof of Refinement

[Diagram: HLM behavior above RTL behavior, one RTL clock cycle per step.]

Formal Proof of Refinement version 1.0: looks like FEV

[Diagram: a totally symbolic RTL state $r$ (represents all possible RTL states) takes one clock cycle to reach $r'$; both are mapped through RM.]

Check: $\mathit{RM}(r') \stackrel{?}{=} \mathit{Next}(\mathit{RM}(r))$

Can be decided by SAT- or BDD-based solver engine

Also might blow up

This will most certainly fail for some unreachable RTL states! Rats!

Formal Proof of Refinement version 2.0: write an invariant

[Diagram: a totally symbolic RTL state $r$ (represents all possible RTL states) takes one clock cycle to reach $r'$; both are mapped through RM.]

Check: $\mathit{Inv}(r) \Rightarrow \mathit{RM}(r') = \mathit{Next}(\mathit{RM}(r))$

Can be decided by SAT- or BDD-based solver engine

But concocting Inv is difficult, not to mention you need to also prove Inv is invariant

Also might blow up
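The difference between versions 1.0 and 2.0 shows up even on a tiny example. The Python sketch below uses a made-up RTL with a `count` register and a redundant `shadow` copy that the next-state logic reads, and it enumerates all RTL states by brute force where a real flow would hand the question to a SAT- or BDD-based engine. Every name in it is hypothetical.

```python
from itertools import product

N = 4
RTL_STATES = list(product(range(N), range(N)))  # all (count, shadow) pairs
RESET = (0, 0)


def rtl_step(r):
    """Next-state logic reads the shadow copy, then updates both registers."""
    _count, shadow = r
    nxt = (shadow + 1) % N
    return (nxt, nxt)


def rm(r):
    return r[0]              # refinement map: the HLM counter is `count`


def hlm_next(h):
    return (h + 1) % N       # the single GC fired every cycle


def inv(r):
    return r[0] == r[1]      # the two copies agree in every reachable state


def refinement_failures(assume=lambda r: True):
    """RTL states satisfying `assume` where RM(r') != Next(RM(r))."""
    return [r for r in RTL_STATES
            if assume(r) and rm(rtl_step(r)) != hlm_next(rm(r))]


def is_inductive(candidate):
    """Holds at reset and is preserved by every step taken from a state
    that satisfies it."""
    return candidate(RESET) and all(
        candidate(rtl_step(r)) for r in RTL_STATES if candidate(r))


v1_failures = refinement_failures()      # version 1.0: no assumption
v2_failures = refinement_failures(inv)   # version 2.0: assume Inv
```

Version 1.0 fails on exactly the states where the two copies disagree, all of them unreachable, while version 2.0 passes once `inv` is assumed; `is_inductive(inv)` is the extra obligation of proving that Inv really is an invariant.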

Formal Proof of Refinement version 3.0: Model Checking

[Diagram: RTL & checker composed with an HLM of the environment]

• start from initial state of env-HLM & RTL
• compute forward reachability via symbolic model checking
• verify that checker never fires

Will likely blow up; probably need to restrict behaviors, e.g. use 4 addresses rather than $2^{32}$

Open Problems

• Refinement map is part of spec… or is it?
• Formal proof: best approach?
– I spent 1.5 years banging my head on the formal side; the fact that I’ve retreated to checkers says something
• Tool issues: pain in the butt
– Generated System Verilog has hit 4 bugs so far in expensive third-party simulator
• HLM/RTL discrepancies: can we weaken our notion of refinement to allow for reasonable mismatches?
– E.g. HLM transmits message instantaneously, while RTL scheduling causes arbitrary delay before transmission

Partial Bibliography

• Using formal HLM as a checker:
– Linking Simulation with Formal Verification at a Higher Level, Tasiran, Batson, & Yu, 2004
– Runtime Refinement Checking of Concurrent Data Structures, Tasiran & Qadeer, 2004
• Original Murphi paper:
– Protocol Verification as a Hardware Design Aid, Dill, Drexler, Hu, & Yang, 1992
• Formal verification of refinement maps for hardware
– Automatic Verification of Pipelined Microprocessor Control, Burch & Dill, 1994
– Protocol Verification by Aggregation of Distributed Transactions, Park & Dill, 1996
– A Methodology for Hardware Verification using Compositional Model Checking, McMillan, 2000
– The Formal Design of 1M-gate ASICs, Eiriksson, 2000
• Theory involving refinement in the face of fairness
– On the Existence of Refinement Maps, Abadi & Lamport, 1991
• Commercial Tools
– BlueSpec (BlueSpec Inc.)
– Pico (Synfora)
– SLEC (Calypto)

Backups

Cache Controller HLM (typedefs & var decls in Murphi)

type ---- Type declarations ----

<snip>

CACHE_ENTRY : record
  State : enum {Invalid, Dirty, Clean};
  Addr : ADDR;
  Data : DATA;
end;

<snip>

var ---- State variables ----

CacheArray : array [0..CACHE_SIZE-1] of CACHE_ENTRY;
Cpu2Cache : CPU2CACHE_MSG;
Cache2Cpu : CACHE2CPU_MSG;
Mem2Cache : MEM2CACHE_MSG;
Cache2Mem : CACHE2MEM_MSG;

Guarded Commands Formalized

• State space $S$ = type consistent assignments to variables
• Init: subset of state space specifying initial states
• A guarded command (GC) is a pair $(g,c)$, where
– $g : S \to \{\mathrm{True},\mathrm{False}\}$ is called the guard; GC is enabled in state $s$ if $g(s) = \mathrm{True}$
– $c : S \to S$ is called the command; GC fires from $s$ to $c(s)$
• Semantics: HLM can transition from $s$ to $s'$ iff there exists a GC that
– is enabled in $s$
– fires from $s$ to $s'$
• Nondeterminism arises when multiple GCs are enabled
• In practice GCs are often parameterized
• We assume that the stuttering GC $(\lambda s.\mathrm{True},\ \lambda s.s)$ is implicit

Refinement Formalized

• Let $H$ and $R$ be the respective state spaces of HLM and RTL
• A function $\mathit{RM} : R \to H$ is called a refinement map
– Intuitively, $\mathit{RM}(r)$ is the HLM state that summarizes RTL state $r$
– Many-to-one in general
– Human writes this in our methodology
• We generalize this so that $\mathit{RM} : R^w \to H$, for some fixed $w$
– Hence RM maps a fixed length sequence of RTL states to $H$
– Useful for dealing with RTL pipelines
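With the windowed map, the per-cycle check can be written out roughly as below. This is one possible formalization rather than a definition taken from the slides: $r_0, r_1, \ldots$ is an RTL run, $t \ge w-1$ is a clock cycle, and $\mathit{pred}$ is a hypothetical name for GC prediction applied to the current window.

```latex
\[
  \mathit{RM}(r_{t-w+2},\ldots,r_{t+1})
  \;=\;
  (c_k \circ \cdots \circ c_1)\bigl(\mathit{RM}(r_{t-w+1},\ldots,r_{t})\bigr),
  \qquad
  \bigl((g_1,c_1),\ldots,(g_k,c_k)\bigr) = \mathit{pred}(r_{t-w+1},\ldots,r_{t}),
  \quad k \ge 0 .
\]
```

For $k = 0$ the composition is the identity, which matches the implicit stuttering GC: a clock cycle in which nothing visible to the HLM happens. A stricter version would also require each $g_j$ to hold in the intermediate HLM state where $c_j$ is applied.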

Cache Controller HLM GCs (1/2)

Ruleset i : CacheIndex "Recv Store"
  Cpu2Cache.opcode = Store &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex "Recv Load"
  Cpu2Cache.opcode = Load &
  CacheArray[i].State != Invalid &
  CacheArray[i].Addr = Cpu2Cache.Addr
==>
  Cache2Cpu.Data := CacheArray[i].Data;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex "Evict"
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) then
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  endif;
  CacheArray[i].State := Invalid;
end

Cache Controller HLM GCs (2/2)

Ruleset i : CacheIndex ; a : Addr "Send Memory Request"
  CacheArray[i].State = Invalid
==>
  Cache2Mem.opcode := Get;
  Cache2Mem.Index := i;
  Cache2Mem.Addr := a;
end

Ruleset i : CacheIndex "Recv Memory Response"
  Mem2Cache.opcode = Response
==>
  CacheArray[Mem2Cache.Index].Data := Mem2Cache.Data;
  CacheArray[Mem2Cache.Index].Addr := Mem2Cache.Addr;
  CacheArray[Mem2Cache.Index].State := Clean;
  Absorb(Mem2Cache);
end

Load Miss (moot)

[Animation on the pipeline with channels Cpu2Cache, Cache2Cpu, Cache2Mem, Mem2Cache: Load(A0) arrives, Get(A0) is sent to memory, Response(A0,D0) comes back, the arrays show (Clean, A0) with data D0, and Response(D0) appears last.]

Cache Controller Refinement Map (conceptual)

function HLM_STATE RM(); // refinement map function
  HLM_STATE HLM;
  for (int i = 0; i < CACHE_SIZE; i++) begin
    HLM.CacheArray[i].State = RTL.AddrArray[i].State;
    HLM.CacheArray[i].Addr = RTL.AddrArray[i].Addr;
    HLM.CacheArray[i].Data = RTL.DataArray[i]@+1;
  end;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu;
  return(HLM);
end;

<signal>@k denotes the value <signal> will have k clock cycles in the future (k can be negative too)
