Bluespec
Lectures 3 & 4, with some slides from Rishiyur Nikhil at Bluespec and Simon Moore at the University of Cambridge
Course Resources
• http://cas.ee.ic.ac.uk/~ssingh
• Lecture notes (PowerPoint, PDF)
• Example Bluespec programs used in lectures
• Complete Photoshop system (Bluespec)
• Links to Bluespec code samples
• User guide, reference guide: doc sub-directory of Bluespec installation
• More information at http://bluespec.com
Rules, not clock edges
• rules are atomic
– they execute within one clock cycle
• structure:
rule name (explicit conditions);
    statements;
endrule
• conditions:
– explicit: conditions (Boolean expression) provided
– implicit: conditions that have to be met to allow the statements to fire, e.g. fifo.enq can fire only if the FIFO is not full
Rules: powerful alternative to always blocks
• rules for state updates instead of always blocks
• Simple concept: think if…then…
• Rule can execute (or “fire”) only when its conditions are TRUE
• Every rule is atomic with respect to other rules
• Powerful ramifications:
– Executable specification: design around operations as described in specs
– Atomicity of rules dramatically reduces concurrency bugs
– Automates management of shared resources, which avoids many complex errors
rule ruleName (<boolean cond>);
    <state update(s)>
endrule
Bits, Bools and conversion
• Bit#(width)
– vector of bits
• Bool
– single bit for Booleans (True, False)
• pack()
– function to convert (pack) most things into a bit representation
• unpack()
– opposite of pack()
• extend()
– extend an integer (signed, unsigned, bits)
• truncate()
– truncate an integer
Reg and Bit/UInt/Int types
• registers (initialised and uninitialised versions):
Reg#(type) name0 <- mkReg(initial_value);
Reg#(type) name1 <- mkRegU;
• some types (unsigned and signed integer, and bits):
UInt#(width), Int#(width), Bit#(width)
• example:
Reg#(UInt#(8)) counter <- mkReg(0);
rule count_up;
    counter <= counter + 1;
endrule
• reading the example:
– Reg#(...) is the interface type
– UInt#(8) is the type parameter, needed since Reg is generic
– mkReg is the name of the module to “make” (i.e. instantiate)
– N.B. modules are typically prefixed “mk”
Registers
interface Reg#(type a);
    method Action _write (a x1);
    method a _read ();
endinterface: Reg
• Polymorphic
• Just library elements
• In one cycle register reads must execute before register writes
• x <= y + 1 is syntactic sugar for x._write (y._read + 1)
Scheduling Annotations
C    Conflict
CF   Conflict free
SB   Sequence before
SBR  Sequence before restricted (cannot be in the same rule)
SA   Sequence after
SAR  Sequence after restricted (cannot be in the same rule)
Scheduling Annotations for a Register
         read   write
read     CF     SB
write    SA     SBR
• Two read methods would be conflict-free (CF), that is, you could have multiple methods that read from the same register in the same rule, sequenced in any order.
• A write is sequenced after (SA) a read.
• A read is sequenced before (SB) a write.
• If you have two write methods, one must be sequenced before the other, and they cannot be in the same rule, as indicated by the annotation SBR.
Updating Registers
Reg#(int) x <- mkReg (0) ;
rule countup (x < 30);
    int y = x + 1;
    x <= x + 1;
    $display ("x = %0d, y = %0d", x, y);
endrule
Rules of Rules (The Three Basics)
1. Rules are atomic
2. Rules fire at most once per cycle (or not at all)
3. Rules don’t conflict with other rules
[Figure: two registers x and y on a common clk; the D input of x is driven by y + 1 and the D input of y is driven by x + 1]
rule r1; x <= y + 1; endrule
rule r2; y <= x + 1; endrule
(* synthesize *)
module rules4 (Empty);
    Reg#(int) x <- mkReg (10);
    Reg#(int) y <- mkReg (100);
    rule r1; x <= y + 1; endrule
    rule r2; y <= x + 1; endrule
    rule monitor; $display ("x, y = %0d, %0d ", x, y); endrule
endmodule
$ ./rules4 -m 5
x, y = 10, 100
x, y = 10, 11
x, y = 10, 11
x, y = 10, 11
(* synthesize *)
module rules5 (Empty);
    Reg#(int) x <- mkReg (10);
    Reg#(int) y <- mkReg (100);
    rule r ; x <= y + 1; y <= x + 1; endrule
    rule monitor; $display ("x, y = %0d, %0d ", x, y); endrule
endmodule
$ ./rules5 -m 5
x, y = 10, 100
x, y = 101, 11
x, y = 12, 102
x, y = 103, 13
(* synthesize *)
module rules6 (Empty);
    Reg#(int) x <- mkReg (10);
    Reg#(int) y <- mkReg (100);
    rule r1; x <= y + 1; endrule
    rule r2; y <= x + 1; endrule
    (* descending_urgency = "r1, r2" *)
    rule monitor; $display ("x, y = %0d, %0d ", x, y); endrule
endmodule
$ ./rules6 -m 5
x, y = 10, 100
x, y = 101, 100
x, y = 101, 100
x, y = 101, 100
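The rules4, rules5 and rules6 runs differ only in which rules are allowed to fire in a cycle. The sketch below is a small Python model of that behaviour, not BSV and not the compiler's real algorithm. It assumes every rule reads start-of-cycle register values, that two rules conflict when neither can be ordered before the other, and that rules are chosen greedily in descending urgency. The names can_precede, conflict and run are made up for this illustration.

```python
def can_precede(a, b):
    """a may be ordered before b in one cycle if b reads nothing that a
    writes and the two rules do not write the same register."""
    return not (a["writes"] & b["reads"]) and not (a["writes"] & b["writes"])


def conflict(a, b):
    """Two rules conflict when neither order is acceptable."""
    return not can_precede(a, b) and not can_precede(b, a)


def run(rules, x, y, cycles):
    """rules are listed in descending urgency; returns what monitor prints."""
    state = {"x": x, "y": y}
    trace = []
    for _ in range(cycles):
        trace.append((state["x"], state["y"]))       # rule monitor
        chosen = []
        for rule in rules:                             # greedy, most urgent first
            if not any(conflict(rule, c) for c in chosen):
                chosen.append(rule)
        snapshot = dict(state)                         # all rules read old values
        for rule in chosen:
            state.update(rule["body"](snapshot))
    return trace


r1 = {"reads": {"y"}, "writes": {"x"}, "body": lambda s: {"x": s["y"] + 1}}
r2 = {"reads": {"x"}, "writes": {"y"}, "body": lambda s: {"y": s["x"] + 1}}
r = {"reads": {"x", "y"}, "writes": {"x", "y"},
     "body": lambda s: {"x": s["y"] + 1, "y": s["x"] + 1}}

print(run([r2, r1], 10, 100, 4))   # rules4, assuming r2 was treated as more urgent
print(run([r], 10, 100, 4))        # rules5, one rule doing both updates
print(run([r1, r2], 10, 100, 4))   # rules6, r1 declared more urgent
```

Under these assumptions the three traces match the simulator output shown for rules4, rules5 and rules6.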
interface Rules7_Interface ;
    method int readValue ;
    method Action setValue (int newXvalue) ;
    method ActionValue#(int) increment ;
endinterface
(* synthesize *)
module rules7 (Rules7_Interface);
    Reg#(int) x <- mkReg (0);
    method readValue ; return x ; endmethod
    method Action setValue (int newXvalue); x <= newXvalue ; endmethod
    method ActionValue#(int) increment ; x <= x + 1 ; return x ; endmethod
endmodule
interface Rules7_Interface ;
    (* always_ready *) method int readResult ;
    (* always_enabled *) method Action setValues (int newX, int newY, int newZ) ;
endinterface
(* synthesize *)
module rules7 (Rules7_Interface) ;
    Reg#(int) x <- mkReg (0) ;
    Reg#(int) y <- mkReg (0) ;
    Reg#(int) z <- mkReg (0) ;
    Reg#(int) result <- mkRegU ;
    Reg#(Bool) b <- mkReg (False) ;
    rule toggle ; b <= !b ; endrule
    rule r1 (b) ; result <= x * y ; endrule
    rule r2 (!b) ; result <= x * z ; endrule
    method readResult = result ;
    method Action setValues (int newX, int newY, int newZ) ;
        x <= newX ; y <= newY ; z <= newZ ;
    endmethod
endmodule
Generated Verilog (excerpt):
// remaining internal signals
assign x_MUL_y___d8 = x * y ;
assign x_MUL_z___d5 = x * z ;
interface Rules8_Interface ;
    (* always_ready *) method int readResult ;
    (* always_enabled *) method Action setValues (int newX, int newY, int newZ) ;
endinterface
(* synthesize *)
module rules8 (Rules8_Interface) ;
    Reg#(int) x <- mkReg (0) ;
    Reg#(int) y <- mkReg (0) ;
    Reg#(int) z <- mkReg (0) ;
    Wire#(int) t <- mkWire ;
    Reg#(int) result <- mkRegU ;
    Reg#(Bool) b <- mkReg (False) ;
    rule toggle ; b <= !b ; endrule
    rule computeT ; if (b) t <= y ; else t <= z ; endrule
    rule r1 (b) ; result <= x * t ; endrule
    method readResult = result ;
    method Action setValues (int newX, int newY, int newZ) ;
        x <= newX ; y <= newY ; z <= newZ ;
    endmethod
endmodule
Generated Verilog (excerpt):
// inlined wires
assign t$wget = b ? y : z ;
…
// remaining internal signals
assign x_MUL_t_wget___d6 = x * t$wget ;
High Level Synthesis
• Most work on high level synthesis focuses on automating scheduling and allocation to achieve resource sharing.
• Perspective: high level synthesis in general applies to many aspects of converting high level descriptions into efficient circuits, but there has been an undue level of effort on resource sharing in an ASIC context.
• Bluespec automates many aspects of scheduling (it makes scheduling composable) but resource usage is under the explicit control of the designer.
• For FPGA-based design this is often a better fit as a programming model.
Simple example with concurrency and shared resources
Process 0: increments register x when cond0
Process 1: transfers a unit from register x to register y when cond1
Process 2: decrements register y when cond2
Each register can only be updated by one process on each clock. Priority: 2 > 1 > 0
Just like real applications, e.g.:
Bank account: 0 = deposit to checking, 1 = transfer from checking to savings, 2 = withdraw from savings
[Figure: processes 0, 1 and 2, guarded by cond0, cond1 and cond2; process 0 adds 1 to x, process 1 subtracts 1 from x and adds 1 to y, process 2 subtracts 1 from y]
Fundamentally, we are scheduling three potentially concurrent atomic transactions that share resources.
What if the priorities changed: cond1 > cond2 > cond0?
What if the processes are in different modules?
Process priority: 2 > 1 > 0
always @(posedge CLK) begin
    if (cond2)
        y <= y - 1;
    else if (cond1) begin
        y <= y + 1;
        x <= x - 1;
    end
    if (cond0 && !cond1)
        x <= x + 1;
end
* There are other ways to write this RTL, but all suffer from the same analysis
The conditions are resource-access scheduling logic, i.e., control logic
Better scheduling:
always @(posedge CLK) begin
    if (cond2)
        y <= y - 1;
    else if (cond1) begin
        y <= y + 1;
        x <= x - 1;
    end
    if (cond0 && (!cond1 || cond2))
        x <= x + 1;
end
With Bluespec, the design is direct
(* descending_urgency = "proc2, proc1, proc0" *)
rule proc0 (cond0);
    x <= x + 1;
endrule
rule proc1 (cond1);
    y <= y + 1;
    x <= x - 1;
endrule
rule proc2 (cond2);
    y <= y - 1;
endrule
Hand-written RTL:
• Explicit scheduling
• Complex clutter, unmaintainable
BSV:
• Functional correctness follows directly from rule semantics (atomicity)
• Executable spec (operation-centric)
• Automatic handling of shared resource control logic
• Same hardware as the RTL
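The claim that the three rules give the same hardware as the hand-written RTL can be checked on a small model. The Python sketch below is an illustration only: it assumes that a rule fires when its condition holds and no more urgent firing rule writes one of its registers, and it transliterates the two always blocks by hand. rules_step and verilog_step are hypothetical names.

```python
from itertools import product

# (name, register deltas), listed in descending urgency: proc2, proc1, proc0.
RULES = [
    ("proc2", {"y": -1}),
    ("proc1", {"x": -1, "y": +1}),
    ("proc0", {"x": +1}),
]


def rules_step(x, y, cond):
    """One clock of greedy rule selection; cond maps rule name to its guard."""
    new, busy = {"x": x, "y": y}, set()
    for name, deltas in RULES:
        if cond[name] and busy.isdisjoint(deltas):
            for reg, delta in deltas.items():
                new[reg] += delta
            busy.update(deltas)
    return new["x"], new["y"]


def verilog_step(x, y, cond0, cond1, cond2, better):
    """Hand transliteration of the always block; the last assignment wins."""
    nx, ny = x, y
    if cond2:
        ny = y - 1
    elif cond1:
        ny = y + 1
        nx = x - 1
    fire0 = cond0 and ((not cond1 or cond2) if better else (not cond1))
    if fire0:
        nx = x + 1
    return nx, ny


def mismatches(better):
    bad = []
    for c in product([False, True], repeat=3):
        cond = {"proc0": c[0], "proc1": c[1], "proc2": c[2]}
        if rules_step(5, 7, cond) != verilog_step(5, 7, *c, better=better):
            bad.append(c)
    return bad


print(mismatches(better=True))    # "better scheduling" RTL versus the rules
print(mismatches(better=False))   # first RTL version versus the rules
```

In this model the better-scheduled RTL agrees with the rules on all eight condition combinations, and the first version differs only when all three conditions hold, where it leaves process 0 idle although x is free.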
Now, let’s make a small change: add a new process and insert its priority
[Figure: the same three processes plus a new process 3, guarded by cond3, which adds 2 to x and subtracts 2 from y]
Process priority: 2 > 3 > 1 > 0
Changing the Bluespec design
Pre-change:
(* descending_urgency = "proc2, proc1, proc0" *)
rule proc0 (cond0);
    x <= x + 1;
endrule
rule proc1 (cond1);
    y <= y + 1;
    x <= x - 1;
endrule
rule proc2 (cond2);
    y <= y - 1;
endrule
Post-change (process priority: 2 > 3 > 1 > 0):
(* descending_urgency = "proc2, proc3, proc1, proc0" *)
rule proc0 (cond0);
    x <= x + 1;
endrule
rule proc1 (cond1);
    y <= y + 1;
    x <= x - 1;
endrule
rule proc2 (cond2);
    y <= y - 1;
endrule
rule proc3 (cond3);
    y <= y - 2;
    x <= x + 2;
endrule
Changing the Verilog design
Pre-change:
always @(posedge CLK) begin
    if (!cond2 && cond1)
        x <= x - 1;
    else if (cond0)
        x <= x + 1;
    if (cond2)
        y <= y - 1;
    else if (cond1)
        y <= y + 1;
end
Post-change (process priority: 2 > 3 > 1 > 0):
always @(posedge CLK) begin
    if ((cond2 && cond0) || (cond0 && !cond1 && !cond3))
        x <= x + 1;
    else if (cond3 && !cond2)
        x <= x + 2;
    else if (cond1 && !cond2)
        x <= x - 1;
    if (cond2)
        y <= y - 1;
    else if (cond3)
        y <= y - 2;
    else if (cond1)
        y <= y + 1;
end
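Whether the reworked RTL still implements the priorities 2 > 3 > 1 > 0 is not obvious by inspection. One way to gain confidence is to compare it with a direct model of the four processes over all sixteen condition combinations. The Python sketch below does that under two assumptions: process 2 only decrements y (as in the process description), and a process fires when its condition holds and no higher-priority firing process touches one of its registers. The function names are made up for this illustration.

```python
from itertools import product

# (name, register deltas), listed in descending priority: 2 > 3 > 1 > 0.
PROCESSES = [
    ("proc2", {"y": -1}),
    ("proc3", {"x": +2, "y": -2}),
    ("proc1", {"x": -1, "y": +1}),
    ("proc0", {"x": +1}),
]


def spec_step(x, y, cond):
    """Priority-ordered selection of non-overlapping processes for one clock."""
    new, busy = {"x": x, "y": y}, set()
    for name, deltas in PROCESSES:
        if cond[name] and busy.isdisjoint(deltas):
            for reg, delta in deltas.items():
                new[reg] += delta
            busy.update(deltas)
    return new["x"], new["y"]


def verilog_step(x, y, cond0, cond1, cond2, cond3):
    """Hand transliteration of the post-change always block."""
    nx, ny = x, y
    if (cond2 and cond0) or (cond0 and not cond1 and not cond3):
        nx = x + 1
    elif cond3 and not cond2:
        nx = x + 2
    elif cond1 and not cond2:
        nx = x - 1
    if cond2:
        ny = y - 1
    elif cond3:
        ny = y - 2
    elif cond1:
        ny = y + 1
    return nx, ny


def mismatches():
    bad = []
    for c in product([False, True], repeat=4):
        cond = {f"proc{i}": c[i] for i in range(4)}
        if spec_step(5, 7, cond) != verilog_step(5, 7, *c):
            bad.append(c)
    return bad


print(mismatches())   # an empty list means the RTL matches the model
```

The check itself is the point: with hand-written scheduling logic, every change to the priorities calls for this kind of exhaustive comparison.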
Alternate RTL style (more common)
• Combinatorial explosion
• Case 3’b111 is subtle
• Many repetitions of update actions (cut-paste errors)
– cf. “WTO Principle” (Write Things Once, Gerard Berry)
• Difficult to maintain/extend
• Difficult to modularize
Process priority: 2 > 1 > 0
always @ (posedge clk)
    case ({cond0, cond1, cond2})
        3'b000: begin // nothing happens
            x <= x;
            y <= y;
        end
        3'b001: begin // proc2 fires
            y <= y-1;
        end
        3'b010: begin // proc1
            x <= x-1;
            y <= y+1;
        end
        3'b011: begin // proc2 fires (2>1)
            y <= y-1;
        end
        3'b100: begin // proc0
            x <= x+1;
        end
        3'b101: begin // proc2 + proc0
            x <= x+1;
            y <= y-1;
        end
        3'b110: begin // proc1 (1>0)
            x <= x-1;
            y <= y+1;
        end
        3'b111: begin // proc2 + proc0
            x <= x+1; // NOTE: subtle!
            y <= y-1;
        end
    endcase
Late Specifications
Late specification changes and feature enhancements are challenging to deal with.
Micro-architectural changes for timing/area/performance, e.g.:
• Adding a pipeline stage to an existing pipeline
• Adding a pipeline stage where pipelining was not anticipated
• Spreading a calculation over more clocks (longer iteration)
• Moving logic across a register stage (rebalancing)
• Restructuring combinational clouds for shallower logic
Fixing bugs
Bluespec makes it easier to try out multiple macro/micro-architectures earlier in the design cycle
Why Rule atomicity improves correctness
Correctness is often couched (formally or informally) as an invariant, e.g.:
“# ingress packets - # egress packets == packet-count register value”
Rule atomicity improves thinking about (and formally proving) invariants, because invariants can be verified one rule at a time.
In contrast, in RTL and thread models, one must think of all possible interleavings (cf. The Problem With Threads, Edward A. Lee, IEEE Computer 39(5), May 2006, pp. 33-42).
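Checking an invariant one rule at a time can be sketched in a few lines of Python. The packet-counter design below is hypothetical (the rule names packet_in and packet_out and the bounded state space are assumptions made for the example): for each rule, start from every small state that satisfies the invariant and the rule's guard, apply the rule atomically, and check that the invariant still holds.

```python
from itertools import product


def invariant(s):
    """# ingress packets - # egress packets == packet-count register value."""
    return s["ingress"] - s["egress"] == s["count"]


# Each rule is a (guard, atomic update) pair over the whole state.
RULES = {
    "packet_in": (
        lambda s: True,
        lambda s: {**s, "ingress": s["ingress"] + 1, "count": s["count"] + 1},
    ),
    "packet_out": (
        lambda s: s["count"] > 0,
        lambda s: {**s, "egress": s["egress"] + 1, "count": s["count"] - 1},
    ),
}


def rule_preserves_invariant(name, bound=6):
    """Check one rule in isolation over all states with fields below bound."""
    guard, update = RULES[name]
    for i, e, c in product(range(bound), repeat=3):
        s = {"ingress": i, "egress": e, "count": c}
        if invariant(s) and guard(s) and not invariant(update(s)):
            return False
    return True


print({name: rule_preserves_invariant(name) for name in RULES})
```

No interleavings need to be enumerated: because each rule is atomic, an invariant that every rule preserves on its own holds after any sequence of rule firings.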
Bank Account: Key Benefits
• Executable specifications
• Rapid changes
• But, with fine-grained control of RTL:
– Define the optimal architecture/micro-architecture
– Debug at the source OR RTL level: designer understands both
– The Quality of Results (QoR) of RTL!
A more complex example, from CPU design
Speculative, out-of-order
Many, many concurrent activities
[Figure: CPU block diagram with Fetch, Decode, Register File, Re-Order Buffer (ROB), Branch, ALU Unit, MEM Unit, Instruction Memory and Data Memory, connected by FIFOs]
Many concurrent actions on common state: nightmare to manage explicitly
[Figure: Re-Order Buffer contents; a circular array of slots between the Head and Tail pointers, each slot marked Empty (E) or Waiting (W), with the waiting slots holding instructions A to D]
Re-Order Buffer
[Figure: the Re-Order Buffer and its neighbours. Each ROB entry holds State, Instruction, Operand 1, Operand 2 and Result. The Decode Unit puts an instr into the ROB; the Register File supplies operands for the instr and receives written-back results; the ALU Unit(s) and MEM Unit(s) get ready instrs and put their results in the ROB; branches are resolved.]
Insert Instr in ROB
• Put instruction in first available slot
• Increment tail pointer
• Get source operands
– RF <or> prev instr
Dispatch Instr
• Mark instruction dispatched
• Forward to appropriate unit
Write Back Results to ROB
• Write back results to instr result
• Write back to all waiting tags
• Set to done
Commit Instr
• Write results to register file (or allow memory write for store)
• Set to Empty
• Increment head pointer
Branch Resolution
• …
• …
• …
In Bluespec…
..you can code each operation in isolation, as a rule
..the tool guarantees that operations are INTERLOCKED (i.e. each runs to completion without external interference)
Some Verilog solutions
Process priority: 2 > 1 > 0
always @(posedge CLK) begin
    if (!cond2 || cond1)
        x <= x - 1;
    else if (cond0)
        x <= x + 1;
    if (cond2)
        y <= y - 1;
    else if (cond1)
        y <= y + 1;
end
always @(posedge CLK) begin
    if (!cond2 && cond1)
        x <= x - 1;
    else if (cond0)
        x <= x + 1;
    if (cond2)
        y <= y - 1;
    else if (cond1)
        y <= y + 1;
end
Which one is correct?
Functional code and scheduling code are deeply (inextricably) intertwined.
What’s required to verify that they’re correct?
What if the priorities changed: cond1 > cond2 > cond0?
What if the processes are in different modules?
Finite State Machines in Bluespec
for making composable, parallel, nested, suspendable/abortable FSMs
Features:
• FSMs automatically synthesized
• Complex FSMs expressed succinctly
• FSM actions have same atomic semantics as BSV rule bodies
• Well-behaved on shared resources: no surprises
• With standard BSV interfaces and BSV’s higher-order functions you can write your own FSM generators
[Figure: ways of composing FSMs: sequencing, sequential loops, if-then-else, parallel FSMs (fork-join), and hierarchy (with suspend and abort)]
This powerful capability is enabled by higher-order functions, polymorphic types, advanced parameterization and atomic transactions
Enables exponentially smaller descriptions compared to flat FSMs
FSM example (from testbench stimulus section)
Stmt s = seq
    action
        rand_packets0.init;
        rand_packets1.init;
    endaction
    par
        for (j0 <= 0; j0 < n; j0 <= j0 + 1)
            action
                let pkt0 <- rand_packets0.next;
                switch.ports[0].put (pkt0);
            endaction
        for (j1 <= 0; j1 < n; j1 <= j1 + 1)
            action
                let pkt1 <- rand_packets1.next;
                switch.ports[1].put (pkt1);
            endaction
    endpar
    drain_switch;
endseq;
FSM fsm <- mkFSM (s);
// start is a method of the FSM interface, not of the Stmt value
rule go;
    fsm.start;
endrule
Basic FSM statements are “Actions”, just like rule bodies, and have exactly the same atomic semantics. Thus, BSV FSMs are well-behaved with respect to concurrent resource contention and flow control.
Strong support for multiple clock and reset domains
• Rich and mature support for MCD (multiple clock domains and resets)
• Clock is a first-class data type
– Cannot accidentally mix clocks and ordinary signals
– Strong static checking ensures that it is impossible to accidentally cross clock domain boundaries (i.e., without a synchronizer)
– No need for linting tools to check domain discipline
• Clock manipulation
– Clocks can be passed in and out of module interfaces
– Library of clock dividers and other transformations
– Module instantiation can specify an alternative clock (instead of inheriting parent’s default clock)
• (Similarly: Reset and reset domains)
Synthesis of Atomic Actions
[Figure: synthesis of atomic actions. The state is read; for each rule a predicate (p1, p2, p3) and a potential next-state update function (d1, d2, d3) are computed; the scheduler turns the predicates into enabled rules (f1, f2, f3); selector muxes and priority encoders then update the state.]
• Predicates are computed for each rule with a combinational circuit
• The scheduler selects a maximal subset of applicable rules
Key Issue: How to select the maximal subset of rules for firing?
• Two rules R1 and R2 can execute simultaneously if they are “conflict free”, i.e.
– R1 and R2 do not update the same state; and
– Neither R1 nor R2 reads the state that the other updates (“sequentially composable” rules)
Rules of Rules (The Details 1-5/10)
1. Rules are atomic: rules fire completely or not at all, and you can imagine that nothing else happens during their execution.
2. Explicit and implicit conditions may prevent rules from firing.
3. Every rule fires exactly 0 or 1 times every cycle (at this point in our product's history anyway ;)
4. Rules that conflict in some way may fire together in the same cycle, but only if the compiler can schedule them in a valid order to do so -- that is, where the overall effect is as if they had happened one at a time as in (1) above.
5. Rules determine if they are going to fire or not before they actually do so. They are considered in their order of "urgency" (by a "greedy algorithm"): they "will fire" if they "can fire" and are not prevented by a conflict with a rule which has been selected already. It's OK to think of this phase as being completed (except for wires) before any rules are actually executed. This is what "urgency" is about.
Rules of Rules (The Details 6-10/10)
6. After determining which rules are going to fire, the simulator can then schedule their execution. (In hardware it's all done by combinational logic which has the same effect.) Rules do not need to execute in the same order as they were considered for deciding whether they "will fire". For example rule1 can have a higher urgency than rule2, but it is possible that rule2 executes its logic before rule1. Urgency is used to determine which rules "will fire". Earliness defines the order they fire in.
7. All reads from a register must be scheduled before any writes to the same register: any rule which reads from a register must be scheduled "earlier" than any other rule which writes to it.
8. Constants may be "read" at any time; a register *might* have a write but no read.
9. The compiler creates a sequence of steps, where each step is essentially a rule firing. Its inputs are valid at the beginning of the cycle, its outputs are valid at the end of the cycle. Data is not allowed to be driven "backwards" in the schedule: that is, no action may influence any action that happened "earlier" in the cycle. This would go against causality, and constitutes a "feedback" path that the compiler will not allow.
10. If the compiler is not told otherwise, methods have higher urgency than rules, and will execute earlier than rules, unless there's some reason to the contrary. There is a compiler switch to flip this around and make rules have higher urgency.
The Swap Conundrum
(* synthesize *)
module rules9 (Empty) ;
    Reg#(int) x <- mkReg (12) ;
    Reg#(int) y <- mkReg (17) ;
    rule r1 ; x <= y ; endrule
    rule r2 ; y <= x ; endrule
    rule monitor ; $display ("x, y = %0d, %0d", x, y) ; endrule
endmodule
$ ./rules9 -m 5
x, y = 12, 17
x, y = 12, 12
x, y = 12, 12
x, y = 12, 12
The Swap Conundrum (why rules9 cannot swap)
rule r1 (tick 1): x._write (y._read ()), i.e. y read, x write
rule r2 (tick 2): y._write (x._read ()), i.e. x read, y write
PROBLEM: register x must be read before it is written, but here r2 reads x after r1 has written it
(* synthesize *)
module rules10 (Empty) ;
    Reg#(int) x <- mkReg (12) ;
    Reg#(int) y <- mkReg (17) ;
    rule r ; x <= y ; y <= x ; endrule
    rule monitor ; $display ("x, y = %0d, %0d", x, y) ; endrule
endmodule
$ ./rules10 -m 5
x, y = 12, 17
x, y = 17, 12
x, y = 12, 17
x, y = 17, 12
Schedule-wise, step 1 reads x and y at the beginning and writes x and y at the end.
Wires
• In Bluespec from a scheduling perspective registers and wires are dual concepts.
• In one cycle all register reads must execute before register writes.
• In one cycle a wire must be written to (at most once) before it is read (any number of times).
Rules of Wires
• Wires truly become wires in hardware: they do not save “state” between cycles (compare to signal in VHDL).
• A wire’s schedule requires that it be written before it is read (as opposed to a register that is read before it is written).
• A wire cannot be written more than once in a cycle.
(* synthesize *)
module rules11 (Empty) ;
    Reg#(int) x <- mkReg (12) ;
    Reg#(int) y <- mkReg (17) ;
    Wire#(int) xwire <- mkWire;
    rule r1 ; x <= y ; endrule
    rule r2 ; y <= xwire ; endrule
    rule driveX ; xwire <= x ; endrule
    rule monitor ; $display ("x, y = %0d, %0d", x, y) ; endrule
endmodule
$ ./rules11 -m 5
x, y = 12, 17
x, y = 17, 12
x, y = 12, 17
x, y = 17, 12
$ cat rules11.sched
=== Generated schedule for rules11 ===
Rule schedule
-------------
Rule: monitor
Predicate: True
Blocking rules: (none)
Rule: driveX
Predicate: True
Blocking rules: (none)
Rule: r2
Predicate: xwire.whas
Blocking rules: (none)
Rule: r1
Predicate: True
Blocking rules: (none)
Logical execution order: monitor, driveX, r1, r2
=======================================
Question: is monitor, driveX, r2, r1 a valid schedule?
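One way to approach the question is to write down what each rule of rules11 reads and writes and test a candidate order against the two ordering constraints from the Wires slide: a register must be read before it is written, and a wire must be written before it is read. The Python sketch below does this; the table of reads and writes is derived by hand from the module, and valid_order is a hypothetical helper, not a Bluespec tool.

```python
# Reads and writes of each rule in rules11, derived by hand from the module.
RULES = {
    "monitor": {"reads": {"x", "y"}, "writes": set()},
    "driveX": {"reads": {"x"}, "writes": {"xwire"}},
    "r1": {"reads": {"y"}, "writes": {"x"}},
    "r2": {"reads": {"xwire"}, "writes": {"y"}},
}
WIRES = {"xwire"}   # everything else is a register


def valid_order(order):
    """True if every register is read before it is written by another rule
    and every wire is written before it is read."""
    pos = {name: i for i, name in enumerate(order)}
    for reader in order:
        for writer in order:
            if reader == writer:
                continue
            for item in RULES[reader]["reads"] & RULES[writer]["writes"]:
                if item in WIRES and pos[writer] > pos[reader]:
                    return False    # wire read before its write
                if item not in WIRES and pos[writer] < pos[reader]:
                    return False    # register written before its read
    return True


print(valid_order(["monitor", "driveX", "r1", "r2"]))
print(valid_order(["monitor", "driveX", "r2", "r1"]))
```

Under these two constraints the generated order is accepted, while the alternative is rejected because r2 writes y before r1 has read it.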
Wire
• Implements Reg interface (_read and _write methods).
• Implicit condition:
– it is not ready if it has not been written
• In any cycle if there is no write to a wire then any rule that reads that wire is blocked (it cannot fire).
(* synthesize *)
module rules12 (Empty) ;
    Reg#(int) y <- mkReg (17) ;
    Reg#(int) count <- mkReg (0) ;
    Wire#(int) x <- mkWire;
    rule producer ; if (count % 3 == 0) x <= count ; endrule
    rule consumer ; y <= x ; $display ("cycle %0d: y set to %0d", count, x) ; endrule
    rule counter ; count <= count + 1 ; endrule
endmodule
$ ./rules12 -m 9
cycle 0: y set to 0
cycle 3: y set to 3
cycle 6: y set to 6
DWire
• A Wire with a default value.
• A DWire is always ready.
• If there is a write to a DWire in a cycle then just like a Wire it assumes that value.
• If there is no write to a DWire in a cycle it assumes a default value (given at instantiation time).
(* synthesize *)
module rules13 (Empty) ;
    Reg#(int) y <- mkReg (17) ;
    Reg#(int) count <- mkReg (0) ;
    Wire#(int) x <- mkDWire (42);
    rule producer ; if (count % 3 == 0) x <= count ; endrule
    rule consumer ; y <= x ; $display ("cycle %0d: y set to %0d", count, x) ; endrule
    rule counter ; count <= count + 1 ; endrule
endmodule
$
cycle 1: y set to 42
cycle 2: y set to 42
cycle 3: y set to 3
cycle 4: y set to 42
cycle 5: y set to 42
cycle 6: y set to 6
cycle 7: y set to 42
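The difference between mkWire and mkDWire in rules12 and rules13 comes down to what the consumer sees in a cycle with no write. A minimal Python model of the two modules is sketched below; it follows the bullet points above rather than the library implementation, and simulate and its default parameter are names invented for the example.

```python
def simulate(cycles, default=None):
    """Model of rules12 (default=None, i.e. mkWire) and of rules13
    (default=42, i.e. mkDWire (42)). Returns (cycle, value) pairs for
    every cycle in which rule consumer fires."""
    y, log = 17, []
    for count in range(cycles):
        # rule producer: writes the wire only every third cycle
        written = count if count % 3 == 0 else None
        if written is not None:
            x = written
        elif default is not None:
            x = default           # DWire: unwritten, so it reads as the default
        else:
            continue              # Wire: unwritten, so consumer cannot fire
        y = x                     # rule consumer
        log.append((count, x))
    return log


print(simulate(9))                # plain wire: consumer fires in cycles 0, 3, 6
print(simulate(8, default=42))    # DWire: consumer fires in every cycle
```

With a plain wire the consumer is blocked in two cycles out of three; with the DWire it fires every cycle and sees 42 whenever the producer stays silent.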
BypassWire
• Closest thing to a wire in Verilog.
• A BypassWire is always ready.
• Rather than having a default value, the compiler must be able to statically determine that this wire is driven on every cycle.
FIFOs
• Lots and lots of FIFOs provided in FIFO, FIFOF, SpecialFIFOs libraries
• Examples (2 and 4 element FIFOs):
FIFO#(UInt#(8)) myfifo <- mkFIFO;
FIFO#(UInt#(8)) biggerfifo <- mkSizedFIFO(4);
• Example BypassFIFO (1 storage element, data passes straight through if enq and deq on same cycle when empty)
FIFO#(UInt#(8)) bypassfifo <- mkBypassFIFO;
• Basic interfaces:
– enq(value) // enqueue “value”
– first // returns first element of fifo
– deq // dequeue
import FIFO::*;
(* synthesize *)
module rules14 (Empty) ;
    Reg#(int) count <- mkReg (0) ;
    FIFO#(int) fifo <- mkSizedFIFO (30);
    rule producer (count < 5) ;
        fifo.enq (count*3) ;
        $display ("cycle %0d: enqueuing value %0d", count, count*3) ;
    endrule
    rule consumer (count > 5) ;
        int x = fifo.first ;
        fifo.deq ;
        $display ("cycle %0d: dequeued value %0d", count, x) ;
    endrule
    rule counter ; count <= count + 1 ; endrule
endmodule
$ ./rules14 -m 20
cycle 0: enqueuing value 0
cycle 1: enqueuing value 3
cycle 2: enqueuing value 6
cycle 3: enqueuing value 9
cycle 4: enqueuing value 12
cycle 6: dequeued value 0
cycle 7: dequeued value 3
cycle 8: dequeued value 6
cycle 9: dequeued value 9
cycle 10: dequeued value 12
import FIFO::*;
(* synthesize *)
module rules15 (Empty) ;
    Reg#(int) count <- mkReg (0) ;
    FIFO#(int) fifo <- mkSizedFIFO (30);
    rule producer (count < 5) ;
        fifo.enq (count*3) ;
        $display ("cycle %0d: enqueuing value %0d", count, count*3) ;
    endrule
    rule consumer (count < 5) ;
        int x = fifo.first ;
        fifo.deq ;
        $display ("cycle %0d: dequeued value %0d", count, x) ;
    endrule
    rule counter ; count <= count + 1 ; endrule
endmodule
Question: what does rules15 print?
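A rough way to reason about rules14 and rules15 is to model the FIFO's implicit conditions directly. The Python sketch below is a model, not the library: it assumes that enq needs a non-full FIFO, that first and deq need a non-empty one, that both rules may fire in the same cycle, and that the consumer only sees elements that were present at the start of the cycle. The order of the two log entries within one cycle is arbitrary here and says nothing about the order of $display output.

```python
from collections import deque


def simulate(producer_guard, consumer_guard, cycles, depth=30):
    """Returns a list of (cycle, "enq" or "deq", value) events."""
    fifo, log = deque(), []
    for count in range(cycles):
        can_deq = len(fifo) > 0        # implicit condition of first and deq
        can_enq = len(fifo) < depth    # implicit condition of enq
        do_deq = consumer_guard(count) and can_deq
        do_enq = producer_guard(count) and can_enq
        if do_enq:
            log.append((count, "enq", count * 3))
        if do_deq:
            # dequeue first, so the consumer sees only start-of-cycle contents
            log.append((count, "deq", fifo.popleft()))
        if do_enq:
            fifo.append(count * 3)
    return log


rules14 = simulate(lambda c: c < 5, lambda c: c > 5, 20)
rules15 = simulate(lambda c: c < 5, lambda c: c < 5, 20)
print(rules14)
print(rules15)
```

For rules14 the model reproduces the run shown above. For rules15 it suggests that the consumer is blocked in cycle 0, then dequeues one value per cycle while the producer keeps enqueuing, leaving the last value in the FIFO once count reaches 5.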
import GetPut::* ;
import Connectable::* ;
module mkProducer (Get#(int)) ;
    Reg#(int) i <- mkReg (0) ;
    rule incrementI ; i <= i + 1 ; endrule
    method ActionValue#(int) get () ; return i ; endmethod
endmodule: mkProducer
module mkConsumer (Put#(int)) ;
    Wire#(int) i <- mkWire ;
    rule report ; $display ("mkConsumer %d", i) ; endrule
    method Action put (int x) ; i <= x ; endmethod
endmodule: mkConsumer
(* synthesize *)
module mkConnectableExample (Empty) ;
    Get#(int) p <- mkProducer ;
    Put#(int) c <- mkConsumer ;
    mkConnection (p, c) ;
endmodule: mkConnectableExample
Higher Order Types
p and c are interfaces (bundles of methods) which are passed as arguments
ServerFarm
ServerFarm Information Flow
[Figure: ServerFarm information flow; a farm with its own request and response ports wrapping two DividerServer instances, each with request and response ports]
Conclusions
• Bluespec:
– provides cleaner interfaces
  • quicker to create large systems from libraries of components
  • easier to refine design
– creates most of the control for you (unless you don’t want it to)
  • less likely to get it wrong!
– has strong typing
  • helps remove bugs
– provides powerful static elaboration