Hardware Modeling
VHDL Synthesis
Vienna University of Technology Department of Computer Engineering
ECS Group
Contents
- Synchronous design style
- Reset and external inputs
- Two process method
- State machines
- Platform specific components
- Memory instances
- Common synthesis pitfalls
Synchronous Design Style
- Single central clock signal
- All events triggered by this clock
- Powerful toolchains available
- Nearly all commercial circuits implemented in synchronous style
The Problem
[Figure: a source (SRC), a function f(x), and a sink (SNK)]
When is the data valid and consistent?
When has the sink consumed the data?
When may the sink use the data?
When to issue the next data item?
Setup- and Hold-Time
- How long must the data be stable before the active clock edge? Setup time (tsu)
- How long must the data be valid after the active clock edge? Hold time (th)
- How long does the data need to reach the sink? Settling time
Synchronous Timing
[Figure: timing diagram showing the clock period, the active clock edge, the settling time, and the setup/hold window]
Settling time consists of:
- clock to output delay
- combinational delay
- routing delay, …
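The relation between the quantities above can be written as a pair of inequalities. This is a sketch under the assumption of an ideal clock net (no skew); the symbol names T_clk, t_co, t_comb and t_route are chosen here for illustration, while t_su and t_h are the setup and hold times from the previous slide.

```latex
% Setup condition: the data must settle one setup time before the next active edge
T_{clk} \;\ge\; \underbrace{t_{co} + t_{comb} + t_{route}}_{\text{settling time}} + t_{su}

% Hold condition: the fastest path must not change the data inside the hold window
t_{co,\min} + t_{comb,\min} + t_{route,\min} \;\ge\; t_{h}

% Illustrative (made-up) numbers:
% t_{co} = 1\,ns,\; t_{comb} = 5\,ns,\; t_{route} = 2\,ns,\; t_{su} = 1\,ns
% \Rightarrow T_{clk} \ge 9\,ns \;\Rightarrow\; f_{\max} = 1 / 9\,ns \approx 111\,MHz
```

The setup condition has to hold for the slowest path in the design; the hold condition has to hold for the fastest one.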
Timing Analysis
- Possible only at the end of the design flow (large iteration loop!)
- Enormous complexity
- Only feasible with an ideal clock net
[Figure: design flow. Blocks: Specification, Design-Entry, Compilation, Technol.-Mapping, Partition/Place & Route, Manufact. / Download, Test. Validation steps: Behavioral Simulation, Functional Simulation, Prelayout-GL-Simulation, Postlayout-GL-Simulation, Timing Analysis]
[Figure: flip-flops FF1, FF2, …, FFk drive flip-flop FFm through combinational logic. The clock path has the delay tPD,CLK; the data paths have the delays tdly,DATA,1m, tdly,DATA,2m, …, tdly,DATA,km]
Synchronous Design - Problems
- Slowest path determines clock speed
- All activity within a short period of time (EMI, power consumption, …)
- Complex timing analysis needed
- Single violation may lead to a system failure (no graceful degradation)
Example: Coding a D-FlipFlop
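library ieee;
use ieee.std_logic_1164.all;

entity d_ff is
  port (
    sys_clk : in  std_logic;
    d       : in  std_logic;
    q       : out std_logic
  );
end entity d_ff;

architecture beh of d_ff is
begin
  process(sys_clk)
  begin
    -- q takes the value of d at every rising clock edge
    if rising_edge(sys_clk) then
      q <= d;
    end if;
  end process;
end architecture beh;

[Figure: D flip-flop with input D and output Q]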
Clock Enable
- How to prevent an output change at an active clock edge?
- Gated clock
  - Bad design style (at least for FPGAs)
- Clock enable signal
  - Not part of the clock net
  - Direct implementation vs. multiplexer
Example: Adding clock enable
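library ieee;
use ieee.std_logic_1164.all;

entity d_ff is
  port (
    sys_clk    : in  std_logic;
    sys_clk_en : in  std_logic;
    d          : in  std_logic;
    q          : out std_logic
  );
end entity d_ff;

architecture beh of d_ff is
begin
  process(sys_clk)
  begin
    if rising_edge(sys_clk) then
      -- q is updated only while the clock enable is active
      if sys_clk_en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture beh;

[Figure: D flip-flop with input D, enable input EN and output Q]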
External Inputs
- May change their value at arbitrary times (asynchronous)
- Synchronizer necessary
- The number of stages depends on the required dependability
- For non-critical systems, a two-flop synchronizer is state of the art
Example: Coding a Synchronizer
library ieee;
use ieee.std_logic_1164.all;

entity sync is
  generic (
    SYNC_STAGES : integer := 2
  );
  port (
    sys_clk   : in  std_logic;
    sys_res_n : in  std_logic;
    input     : in  std_logic;
    output    : out std_logic
  );
end entity sync;

architecture beh of sync is
  signal sync : std_logic_vector(1 to SYNC_STAGES);
begin
  process(sys_clk, sys_res_n)
  begin
    if sys_res_n = '0' then
      sync <= (others => '0');
    elsif rising_edge(sys_clk) then
      -- Shift the input through the chain of flip-flops
      sync(1) <= input;
      for i in 2 to SYNC_STAGES loop
        sync(i) <= sync(i - 1);
      end loop;
    end if;
  end process;

  -- The last stage drives the synchronized output
  output <= sync(SYNC_STAGES);
end architecture beh;
Switches and Buttons
Consider the following circuit:
[Figure: a button connected to Vcc drives the synchronized input of a counter]
How many events does the counter see if the button is pressed once?
One or more! Cause: bouncing
Bouncing
Imperfect switching!
- Up to 10 cycles
- T = 1 μs .. 100 μs
[Figure: voltage V over time t, bouncing between 0 and Vcc before it settles]
Debouncing
Buttons/Switches must be debounced
[Figure: the input "in" passes through a two-flop synchronizer clocked by clk and then through a debouncing FSM that outputs "val"]
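The debouncing stage can be sketched as follows. This is a minimal illustration, not the implementation from the lecture: it uses a counter instead of an explicit FSM, the entity name debounce and the generic STABLE_CYCLES are assumptions, and the input is expected to come from a synchronizer. In practice STABLE_CYCLES has to cover the bounce time at the chosen clock frequency.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical counter-based debouncer (names are assumptions).
entity debounce is
  generic (
    STABLE_CYCLES : positive := 4  -- assumed value, far too small for real buttons
  );
  port (
    sys_clk   : in  std_logic;
    sys_res_n : in  std_logic;
    input     : in  std_logic;  -- already synchronized button signal
    output    : out std_logic   -- debounced value
  );
end entity debounce;

architecture beh of debounce is
  signal val, val_next : std_logic;
  signal cnt, cnt_next : natural range 0 to STABLE_CYCLES - 1;
begin
  -- Synchronous process: registers only
  process(sys_clk, sys_res_n)
  begin
    if sys_res_n = '0' then
      val <= '0';
      cnt <= 0;
    elsif rising_edge(sys_clk) then
      val <= val_next;
      cnt <= cnt_next;
    end if;
  end process;

  -- Asynchronous process: next-value logic with default values
  process(input, val, cnt)
  begin
    val_next <= val;
    cnt_next <= 0;  -- default: restart counting when input equals val
    if input /= val then
      if cnt = STABLE_CYCLES - 1 then
        val_next <= input;  -- input differed for STABLE_CYCLES edges: accept it
      else
        cnt_next <= cnt + 1;
      end if;
    end if;
  end process;

  output <= val;
end architecture beh;
```

The output changes only after the input has differed from it for STABLE_CYCLES consecutive clock edges, so shorter pulses caused by bouncing are ignored.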
Reset
- Two different kinds of reset
  - Synchronous reset
  - Asynchronous reset
- Reset polarity
  - Active high
  - Active low
Reset
- Reset = external input
  - Synchronization necessary
- If a button is used
  - Debouncing necessary
- Synchronization & debouncing normally implemented in the top level module
- Use two-flop synchronizer without reset
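A two-flop synchronizer without reset for the external reset signal might look like the sketch below. The entity name reset_sync and the port name res_n_ext are assumptions made for this illustration; an active-low reset is assumed, as in the other examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top-level helper (names are assumptions):
-- synchronizes an external active-low reset to sys_clk.
entity reset_sync is
  port (
    sys_clk   : in  std_logic;
    res_n_ext : in  std_logic;  -- external, asynchronous reset input
    sys_res_n : out std_logic   -- synchronized reset for the rest of the design
  );
end entity reset_sync;

architecture beh of reset_sync is
  signal stage : std_logic_vector(1 to 2);
begin
  -- No reset branch: the synchronizer cannot be reset by the
  -- signal it is synchronizing.
  process(sys_clk)
  begin
    if rising_edge(sys_clk) then
      stage(1) <= res_n_ext;
      stage(2) <= stage(1);
    end if;
  end process;

  sys_res_n <= stage(2);
end architecture beh;
```

In simulation the output is undefined until two clock edges have passed, because the flip-flops have no reset value of their own.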
D-FlipFlop with Async. Reset
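library ieee;
use ieee.std_logic_1164.all;

entity d_ff is
  port (
    sys_clk    : in  std_logic;
    sys_res_n  : in  std_logic;
    sys_clk_en : in  std_logic;
    d          : in  std_logic;
    q          : out std_logic
  );
end entity d_ff;

architecture beh of d_ff is
begin
  -- The reset is in the sensitivity list and is checked before the clock edge
  process(sys_clk, sys_res_n)
  begin
    if sys_res_n = '0' then
      q <= '0';
    elsif rising_edge(sys_clk) then
      if sys_clk_en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture beh;

[Figure: D flip-flop with input D, enable input EN, reset input RES and output Q]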
D-FlipFlop with Sync. Reset
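library ieee;
use ieee.std_logic_1164.all;

entity d_ff is
  port (
    sys_clk    : in  std_logic;
    sys_res_n  : in  std_logic;
    sys_clk_en : in  std_logic;
    d          : in  std_logic;
    q          : out std_logic
  );
end entity d_ff;

architecture beh of d_ff is
begin
  -- Only the clock is in the sensitivity list
  process(sys_clk)
  begin
    if rising_edge(sys_clk) then
      -- Note: as written, the reset takes effect only while sys_clk_en = '1'
      if sys_clk_en = '1' then
        if sys_res_n = '0' then
          q <= '0';
        else
          q <= d;
        end if;
      end if;
    end if;
  end process;
end architecture beh;

[Figure: a multiplexer controlled by RES selects between D and 0 and feeds a D flip-flop with enable input EN and output Q]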
Two Process Method
- Separate combinational and sequential logic
  - Asynchronous process: combinational logic
  - Synchronous process: registers
- Combinational logic calculates the next value
- Synchronous process stores the results
Synchronous Process
process(sys_clk, sys_res_n)
begin
  if sys_res_n = '0' then
    -- Set reset values
  elsif rising_edge(sys_clk) then
    -- Store next values
  end if;
end process;

- Stores next values into the registers
- Reset handling (sync. or async.)
- Sensitivity list: only clock and reset signal
Asynchronous Process
- NO edge triggered elements!
- Sensitivity list: contains all read signals
- Use default values to prevent latches
- Use only "stable" signals (register outputs)
- Use short signal paths only
Example: Counter
Synchronous process:

process(sys_clk, sys_res_n)
begin
  if sys_res_n = '0' then
    count <= (others => '0');
  elsif rising_edge(sys_clk) then
    count <= count_next;
  end if;
end process;

Asynchronous process:

process(up, count)
begin
  -- Default value: keep the current count (prevents a latch)
  count_next <= count;
  if up = '1' then
    count_next <= std_logic_vector(unsigned(count) + 1);
  end if;
end process;

[Figure: a flip-flop clocked by sys_clk and reset by sys_res_n stores count; a multiplexer controlled by up selects either count (up = 0) or count + 1 (up = 1) as count_next]
Finite State Machines (FSM)
- Theory
  - Mealy
  - Moore
- Design (state chart)
- VHDL coding
  - Three process method
FSM Principles
- Sequence of states
- Only synchronous state changes allowed
- State changes based on current state and input signals
- Output signals depend only on the current state (Moore state machine)
- Asynchronous path from input to output logic possible (Mealy state machine)
Moore-State Machine
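[Figure: block diagram. INPUT and the fed-back current state drive the next-state logic; a register clocked by CLK stores the next state as the current state; the output logic derives OUTPUT from the current state]
- Next-state logic: calculates the next state based on the current one and the input signals
- Register: synchronous state switch
- Output logic: calculates the outputs based on the current state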
Mealy-State Machine
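[Figure: block diagram. INPUT and the fed-back current state drive the next-state logic; a register clocked by CLK stores the next state as the current state; the output logic derives OUTPUT from the current state and INPUT]
- Output logic: calculates the outputs based on the current state and the input signals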
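The difference to a Moore machine shows up in the output process, which also reads the inputs. The following sketch is a hypothetical rising-edge detector written as a Mealy machine; the entity name edge_detect, the state names and the port names are assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical Mealy example (names are assumptions): rising-edge detector.
entity edge_detect is
  port (
    sys_clk   : in  std_logic;
    sys_res_n : in  std_logic;
    input     : in  std_logic;  -- must be synchronous to sys_clk
    pulse     : out std_logic
  );
end entity edge_detect;

architecture beh of edge_detect is
  type STATE_TYPE is (WAIT_HIGH, WAIT_LOW);
  signal state, state_next : STATE_TYPE;
begin
  -- Synchronous process: state register
  process(sys_clk, sys_res_n)
  begin
    if sys_res_n = '0' then
      state <= WAIT_HIGH;
    elsif rising_edge(sys_clk) then
      state <= state_next;
    end if;
  end process;

  -- Asynchronous next-state process
  process(state, input)
  begin
    state_next <= state;
    case state is
      when WAIT_HIGH =>
        if input = '1' then
          state_next <= WAIT_LOW;
        end if;
      when WAIT_LOW =>
        if input = '0' then
          state_next <= WAIT_HIGH;
        end if;
    end case;
  end process;

  -- Asynchronous output process (Mealy): the input is in the sensitivity
  -- list, so the output can change without a clock edge.
  process(state, input)
  begin
    pulse <= '0';
    case state is
      when WAIT_HIGH =>
        if input = '1' then
          pulse <= '1';
        end if;
      when WAIT_LOW =>
        null;
    end case;
  end process;
end architecture beh;
```

Because pulse depends directly on input, there is a combinational path from the input to the output; in a Moore machine the same pulse would appear one clock cycle later.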
State Charts for Moore FSMs
[Figure: state chart with the states "idle" and "active" and two transitions labeled 1X and X1]
- State ("Name"): "idle", "active"
- Condition for state change: 1X, X1
- Notation for the state change conditions (list of input signals): (trigger, sleep)
- State outputs: idle 0, active 1
- State to output mapping (list of output signals)
Example: Burglar Alarm
Inputs: activate button, door contact, code panel
Outputs: activation led, alarm siren

State      Output  Condition  Next state
off        00      1X0        activated
activated  10      XX1        off
activated  10      X10        alert
alert      11      XX1        off

[Figure: state chart with the states off (00), activated (10) and alert (11) and the transitions listed in the table]
FSM VHDL Coding
- Three process method
  - Synchronous process
  - Asynchronous next state process
  - Asynchronous output process
- Use case statement for state differentiation
- Use enumeration for state names
FSM VHDL Template (1)
type STATE_TYPE is (AAA, BBB, ....);
signal state, state_next : STATE_TYPE;

process(sys_clk, sys_res_n)
begin
  if sys_res_n = '0' then
    -- Set reset state
    state <= AAA;
  elsif rising_edge(sys_clk) then
    -- Store next state
    state <= state_next;
  end if;
end process;
FSM VHDL Template (2)
Asynchronous next state process:

process(state, input1, …)
begin
  -- Set a default value for next state
  state_next <= state;
  -- Calculate the next state
  case state is
    when AAA =>
      state_next <= XXX;
    when BBB =>
      if input1 = '1' then
        -- Assign state_next here; state is written only by the synchronous process
        state_next <= CCC;
      end if;
    ....
  end case;
end process;

Asynchronous output process:

process(state)
begin
  -- Set default values for the outputs
  output1 <= '0';
  ....
  -- Calculate the outputs based on the current state
  case state is
    when AAA =>
      output1 <= '1';
    when BBB =>
      ....
  end case;
end process;
FSM VHDL Example (1)
type STATE_TYPE is (OFF, ACTIVATED, ALERT);
signal state, state_next : STATE_TYPE;

process(sys_clk, sys_res_n)
begin
  if sys_res_n = '0' then
    -- Set reset state
    state <= OFF;
  elsif rising_edge(sys_clk) then
    -- Store next state
    state <= state_next;
  end if;
end process;
FSM VHDL Example (2)
process(state, active_btn, door_contact, code_panel)
begin
  -- Set a default value for next state
  state_next <= state;
  -- Calculate the next state
  case state is
    when OFF =>
      if active_btn = '1' and code_panel = '0' then
        state_next <= ACTIVATED;
      end if;
    when ACTIVATED =>
      if code_panel = '1' then
        state_next <= OFF;
      elsif door_contact = '1' then
        state_next <= ALERT;
      end if;
    when ALERT =>
      if code_panel = '1' then
        state_next <= OFF;
      end if;
  end case;
end process;
FSM VHDL Example (3)
process(state)
begin
  -- Set default values for the outputs
  activation_led <= '0';
  alarm_siren <= '0';
  -- Calculate the outputs based on the current state
  case state is
    when OFF =>
      null;
    when ACTIVATED =>
      activation_led <= '1';
    when ALERT =>
      activation_led <= '1';
      alarm_siren <= '1';
  end case;
end process;
Platform Specific Components
- PLL, PCI-Express endpoints, ...
- Encapsulate behind a common entity (wrapper)
- Use a configuration to select the right implementation
- Use documentation examples/wizards to instantiate the component in your wrapper
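The wrapper idea can be sketched with a common entity, two architectures and a configuration. Everything named here (pll_wrapper, its ports, the architectures sim and vendor, top, top_sim) is hypothetical; a real vendor architecture would instantiate the component generated by the vendor's wizard instead of the pass-through used as a stand-in.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Common wrapper entity (hypothetical name and ports)
entity pll_wrapper is
  port (
    clk_in  : in  std_logic;
    clk_out : out std_logic
  );
end entity pll_wrapper;

-- Simulation stand-in: passes the clock through unchanged
architecture sim of pll_wrapper is
begin
  clk_out <= clk_in;
end architecture sim;

-- Platform implementation: the vendor-generated PLL would be instantiated here
architecture vendor of pll_wrapper is
begin
  clk_out <= clk_in;  -- stand-in so that the sketch compiles
end architecture vendor;

library ieee;
use ieee.std_logic_1164.all;

entity top is
  port (
    clk   : in  std_logic;
    clk_o : out std_logic
  );
end entity top;

architecture beh of top is
  -- The component declaration keeps top independent of the implementation
  component pll_wrapper is
    port (
      clk_in  : in  std_logic;
      clk_out : out std_logic
    );
  end component pll_wrapper;
begin
  pll_inst : pll_wrapper
    port map (clk_in => clk, clk_out => clk_o);
end architecture beh;

-- The configuration selects which architecture is bound to the instance
configuration top_sim of top is
  for beh
    for pll_inst : pll_wrapper
      use entity work.pll_wrapper(sim);
    end for;
  end for;
end configuration top_sim;
```

A second configuration that binds pll_inst to work.pll_wrapper(vendor) would select the platform implementation without touching top.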
Memory Instances
- ROM
  - Asynchronous
  - Synchronous
- RAM
  - Single Port (rw)
  - Dual Port (r/rw)
  - Triple Port (r/r/w)
Asynchronous ROM
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity async_rom is
  port (
    address : in  std_logic_vector(7 downto 0);
    data    : out std_logic_vector(7 downto 0)
  );
end entity async_rom;

architecture beh of async_rom is
  subtype ROM_ENTRY_TYPE is std_logic_vector(7 downto 0);
  type ROM_TYPE is array (0 to (2 ** 8) - 1) of ROM_ENTRY_TYPE;
  constant rom : ROM_TYPE := (
    0 => x"FF",
    1 => x"AA",
    30 => x"55",
    ...,
    others => x"00"
  );
begin
  -- Combinational read: data follows address without a clock
  process(address)
  begin
    data <= rom(to_integer(unsigned(address)));
  end process;
end architecture beh;
Synchronous ROM
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sync_rom is
  port (
    clk     : in  std_logic;
    address : in  std_logic_vector(7 downto 0);
    data    : out std_logic_vector(7 downto 0)
  );
end entity sync_rom;

architecture beh of sync_rom is
  subtype ROM_ENTRY_TYPE is std_logic_vector(7 downto 0);
  type ROM_TYPE is array (0 to (2 ** 8) - 1) of ROM_ENTRY_TYPE;
  constant rom : ROM_TYPE := (
    0 => x"FF",
    1 => x"AA",
    30 => x"55",
    ...,
    others => x"00"
  );
begin
  -- Registered read: data is updated at the rising clock edge
  process(clk)
  begin
    if rising_edge(clk) then
      data <= rom(to_integer(unsigned(address)));
    end if;
  end process;
end architecture beh;
Synchronous Single Port RAM
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram is
  generic (
    ADDR_WIDTH : integer range 1 to integer'high;
    DATA_WIDTH : integer range 1 to integer'high
  );
  port (
    clk      : in  std_logic;
    address  : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_out : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    wr       : in  std_logic;
    data_in  : in  std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity sp_ram;

architecture beh of sp_ram is
  subtype RAM_ENTRY_TYPE is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type RAM_TYPE is array (0 to (2 ** ADDR_WIDTH) - 1) of RAM_ENTRY_TYPE;
  -- x"00" would fit only DATA_WIDTH = 8; the nested aggregate fits any width
  signal ram : RAM_TYPE := (others => (others => '0'));
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- The read returns the old content if the same address is written
      data_out <= ram(to_integer(unsigned(address)));
      if wr = '1' then
        ram(to_integer(unsigned(address))) <= data_in;
      end if;
    end if;
  end process;
end architecture beh;
Synchronous Dual Port RAM
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dp_ram is
  generic (
    ADDR_WIDTH : integer range 1 to integer'high;
    DATA_WIDTH : integer range 1 to integer'high
  );
  port (
    clk       : in  std_logic;
    address1  : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_out1 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    wr1       : in  std_logic;
    data_in1  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    address2  : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_out2 : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity dp_ram;

architecture beh of dp_ram is
  subtype RAM_ENTRY_TYPE is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type RAM_TYPE is array (0 to (2 ** ADDR_WIDTH) - 1) of RAM_ENTRY_TYPE;
  -- x"00" would fit only DATA_WIDTH = 8; the nested aggregate fits any width
  signal ram : RAM_TYPE := (others => (others => '0'));
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- Port 1: read/write, port 2: read only
      data_out1 <= ram(to_integer(unsigned(address1)));
      data_out2 <= ram(to_integer(unsigned(address2)));
      if wr1 = '1' then
        ram(to_integer(unsigned(address1))) <= data_in1;
      end if;
    end if;
  end process;
end architecture beh;
Synchronous Triple Port RAM
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tp_ram is
  generic (
    ADDR_WIDTH : integer range 1 to integer'high;
    DATA_WIDTH : integer range 1 to integer'high
  );
  port (
    clk                          : in  std_logic;
    address1, address2, address3 : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_in1                     : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    wr1                          : in  std_logic;
    data_out2, data_out3         : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity tp_ram;

architecture beh of tp_ram is
  subtype RAM_ENTRY_TYPE is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type RAM_TYPE is array (0 to (2 ** ADDR_WIDTH) - 1) of RAM_ENTRY_TYPE;
  -- x"00" would fit only DATA_WIDTH = 8; the nested aggregate fits any width
  signal ram : RAM_TYPE := (others => (others => '0'));
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- Port 1: write only, ports 2 and 3: read only
      data_out2 <= ram(to_integer(unsigned(address2)));
      data_out3 <= ram(to_integer(unsigned(address3)));
      if wr1 = '1' then
        ram(to_integer(unsigned(address1))) <= data_in1;
      end if;
    end if;
  end process;
end architecture beh;
Common Synthesis Pitfalls
- Complexity
- Latches
Complexity
Operations compared:
- and
- add
- mul
- div
- mod
Complexity
library ieee;
use ieee.std_logic_1164.all;

entity operation is
  port (
    a, b : in  std_logic_vector(31 downto 0);
    o    : out std_logic_vector(31 downto 0)
  );
end entity operation;

architecture beh of operation is
begin
  -- OPERATION is a placeholder for the operator under test
  o <= a OPERATION b;
end architecture beh;
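The arithmetic operators are not defined for std_logic_vector in ieee.numeric_std. The lines below therefore assume "use ieee.numeric_std.all" and an unsigned interpretation of a and b; the product is truncated to the width of o.

Complexity - AND
o <= a and b;

Complexity - ADD
o <= std_logic_vector(unsigned(a) + unsigned(b));

Complexity - MUL
o <= std_logic_vector(resize(unsigned(a) * unsigned(b), o'length));

Complexity - DIV
o <= std_logic_vector(unsigned(a) / unsigned(b));

Complexity - MOD
o <= std_logic_vector(unsigned(a) mod unsigned(b));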
Complexity - Results
Operation  Estim. Frequency  # LUTs  # DSP Blocks (à 8 9-bit multipliers)
and        845 MHz           32      0
add        320 MHz           32      0
mul        250 MHz           0       1
div        14 MHz            2152    0
mod        7 MHz             2220    0
Latches
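process(i1, i2, i3)
begin
  -- o is not assigned when the condition is false, so it has to
  -- keep its old value: a latch is inferred
  if i1 = '1' and i2 = '1' then
    o <= i3;
  end if;
end process;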
Latches
process(i1, i2, i3)
begin
  -- Default value: o is assigned on every path, so no latch is inferred
  o <= '0';
  if i1 = '1' and i2 = '1' then
    o <= i3;
  end if;
end process;
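An explicit else branch has the same effect as the default value. The sketch below wraps it in a complete design unit so that it can be compiled on its own; the entity name no_latch is an assumption made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity name; ports follow the process above.
entity no_latch is
  port (
    i1, i2, i3 : in  std_logic;
    o          : out std_logic
  );
end entity no_latch;

architecture beh of no_latch is
begin
  process(i1, i2, i3)
  begin
    if i1 = '1' and i2 = '1' then
      o <= i3;
    else
      -- Every path through the process assigns o: purely combinational
      o <= '0';
    end if;
  end process;
end architecture beh;
```

With many outputs and branches, default values at the top of the process are usually easier to keep complete than an else branch in every if statement.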
Summary
- Synchronous design style
  - Single clock
  - Timing analysis
- Reset and external inputs
  - Synchronizer
  - Debouncing
- Two process method
  - Separate sequential and combinational logic
- State machines
  - Sequence of states
  - Synchronous state change
  - Mealy vs. Moore
  - Three process method
- Platform specific components
  - Wrapper
  - Use wizards
- Memory instances
- Common synthesis pitfalls
  - Complexity
  - Latches