Course web page:
ECE 545
Digital System Design with VHDL
ECE web page → Courses → Course web pages → ECE 545
http://ece.gmu.edu/coursewebpages/ECE/ECE545/F11/
Kris Gaj
Office hours: Thursday, 7:30-8:30 PM, Tuesday, 7:30-8:30 PM, and by appointment
Research and teaching interests:
• reconfigurable computing
• computer arithmetic
• cryptography
• network security
Contact: The Engineering Building, room 3225
ECE 545
Part of:
MS in Electrical Engineering
MS in Computer Engineering
Fundamental course for the specialization area: Digital Systems Design
Elective
Elective course in the remaining specialization areas
One of five core courses (must be passed with B or better)
ECE 545
Part of:
PhD in Electrical and Computer Engineering
Knowledge tested at the Technical Qualifying Exam (TQE)
Topic 2: Digital Design and Computer Organization
I am interested in…
I want to specialize primarily in…
VLSI
Digital Systems Design
ASICs & FPGAs
VHDL/Verilog
CAD Tools
Reconfigurable Computing
Microelectronics
VLSI Fabrication
Nanoelectronics
CAD tools & Design Automation
Hardware Description Languages
FPGAs & Reconfigurable computing
Computer Arithmetic
Front-end ASIC Design (algorithmic downto gate level)
Back-end ASIC Design (circuit and mask layout levels)
Analog & Digital Circuit Design
VLSI Fabrication
Microelectronics
Nanoelectronics
Semiconductor Devices
Recommended program & specialization:
MS CpE: Digital Systems Design
MS EE: Microelectronics/Nanoelectronics
Design level: algorithmic, register-transfer, gate, transistor, layout, devices
Courses:
ECE 545 Digital System Design with VHDL
ECE 645 Computer Arithmetic
ECE 586 Digital Integrated Circuits
ECE 680 Physical VLSI Design
ECE 681 VLSI Design for ASICs
ECE 682 VLSI Test Concepts
ECE 584 Semiconductor Device Fundamentals
ECE 684 MOS Device Electronics
CpE: Digital Systems Design
Pre-Approved Electives:
ECE 545 Digital System Design with VHDL
ECE 586 Digital Integrated Circuits
ECE 645 Computer Arithmetic
ECE 681 VLSI Design for ASICs
ECE 682 VLSI Test Concepts
Suggested Electives:
ECE 584, 684, … (technology)
ECE 511, 611, … (microprocessors)
ECE 646, 746, … (applications)
Professors: K. Gaj, K. Hintz, H. Homayoun, J. Kaps, T. Storey

CpE: Microprocessors and Embedded Systems
Pre-Approved Electives:
ECE 510 Real-Time Concepts
ECE 511 Microprocessors
ECE 611 Advanced Microprocessors
ECE 612 Real-Time Embedded Systems
ECE 641 Computer System Architecture
Suggested Electives:
CS 540, 583 (languages, algorithms)
CS 635 (parallel machines)
ECE 542, 642, 742 (networks)
ECE 645, 681 (digital design)
ECE 548 (sequential mach. theory)
Professors: H. Homayoun, J. Kaps, P. Pachowicz, C. Sabzevari
DIGITAL SYSTEMS DESIGN
Concentration advisors: Kris Gaj, Jens-Peter Kaps, Ken Hintz
1. ECE 545 Digital System Design with VHDL – K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
2. ECE 645 Computer Arithmetic – K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
3. ECE 681 VLSI Design for ASICs – H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
4. ECE 586 Digital Integrated Circuits – D. Ioannou, R. Mulpuri
5. ECE 682 VLSI Test Concepts – T. Storey
Grading Scheme
• Homework - 10%
• Project - 40%
• Midterm Exam - 20%
• Final Exam - 30%
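As a quick illustration of how these weights combine, the sketch below computes a weighted course score. The function name, the 0-100 score scale, and the sample scores are assumptions made for this example; only the weights come from the grading scheme above.

```python
# Weights from the grading scheme, in percent.
WEIGHTS = {"homework": 10, "project": 40, "midterm": 20, "final": 30}

def course_score(scores):
    """Return the weighted score; `scores` maps component -> 0..100 (assumed scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

# Hypothetical component scores, for illustration only.
example = {"homework": 90, "project": 85, "midterm": 70, "final": 80}
print(course_score(example))
```

Because the project carries 40% of the total, a 10-point change in the project score moves the result by 4 points, twice the effect of the same change on the midterm.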
Midterm exam 1
2 hours 30 minutes
in class
design-oriented
open-books, open-notes
practice exams available on the web
Tentative date: last week of October
Final exam
2 hours 45 minutes
in class
design-oriented
open-books, open-notes
practice exams available on the web
Date: Thursday, December 13, 4:30-7:15pm
Textbooks
Required Textbook
Pong P. Chu, RTL Hardware Design Using VHDL, Wiley-Interscience, 2006.
Supplementary Textbook – Basics Refresher
Stephen Brown and Zvonko Vranesic, Fundamentals of Digital Logic with VHDL Design, McGraw-Hill, 3rd or 2nd Edition
Supplementary Textbook – Advanced
Hubert Kaeslin, Digital Integrated Circuit Design: From VLSI Architectures to CMOS Fabrication, Cambridge University Press, 1st Edition, 2008.
Used in ECE 681 “VLSI Design for ASICs”
Technology & Tools
What is an FPGA?
[Figure: FPGA structure with Configurable Logic Blocks, I/O Blocks, and Block RAMs]
FPGA Design process (1)
Specification / Pseudocode
Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in Experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..
On-paper hardware design (Block diagram & ASM chart)
VHDL description (Your Source Files)
    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
        port(
            clock, reset, encr_decr: in std_logic;
            data_input:  in  std_logic_vector(31 downto 0);
            data_output: out std_logic_vector(31 downto 0);
            out_full:    in  std_logic;
            key_input:   in  std_logic_vector(31 downto 0);
            key_read:    out std_logic  -- no semicolon after the last port
        );
    end RC5_core;
Functional simulation
Synthesis
Post-synthesis simulation
FPGA Design process (2)
Implementation
Configuration
Timing simulation
On chip testing
Simulation Tools
FPGA Synthesis Tools
architecture MLU_DATAFLOW of MLU is
    signal A1: STD_LOGIC;
    signal B1: STD_LOGIC;
    signal Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
    A1 <= A when (NEG_A='0') else
          not A;
    B1 <= B when (NEG_B='0') else
          not B;
    Y  <= Y1 when (NEG_Y='0') else
          not Y1;

    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;

    with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
end MLU_DATAFLOW;
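When writing a testbench for this architecture, a small software reference model can help generate expected outputs. The Python sketch below mirrors the dataflow description signal by signal. The function name `mlu` and the use of the integers 0 and 1 for STD_LOGIC values are assumptions made for this example; it models only '0' and '1', not the other STD_LOGIC values.

```python
def mlu(a, b, neg_a, neg_b, neg_y, l1, l0):
    """Bit-level model of MLU_DATAFLOW; every argument is 0 or 1."""
    a1 = a ^ neg_a                  # A1 <= A when NEG_A='0' else not A
    b1 = b ^ neg_b                  # B1 <= B when NEG_B='0' else not B
    mux = {
        (0, 0): a1 & b1,            # MUX_0: and
        (0, 1): a1 | b1,            # MUX_1: or
        (1, 0): a1 ^ b1,            # MUX_2: xor
        (1, 1): 1 - (a1 ^ b1),      # MUX_3: xnor
    }
    y1 = mux[(l1, l0)]              # with (L1 & L0) select
    return y1 ^ neg_y               # Y <= Y1 when NEG_Y='0' else not Y1

# Exhaustive list of (inputs, expected output) pairs: 2**7 = 128 vectors.
vectors = [
    (bits, mlu(*bits))
    for bits in (tuple((n >> i) & 1 for i in range(7)) for n in range(128))
]
```

With only seven one-bit inputs, exhaustive testing is cheap, so the vector list can be written to a file and read by a VHDL testbench.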
Logic Synthesis
VHDL description → Circuit netlist
FPGA Implementation
• After synthesis the entire implementation process is performed by FPGA vendor tools
Design Process control from Active-HDL
Xilinx FPGA Tools: ECE Labs
Aldec Active-HDL Design Flow
  simulation: Aldec Active-HDL (IDE)
  synthesis: Xilinx XST & Synopsys Synplify Premier
  implementation: Xilinx ISE Design Suite
Xilinx ISE Design Flow
  simulation: Mentor Graphics ModelSim SE
  synthesis: Xilinx XST & Synopsys Synplify Premier
  implementation: Xilinx ISE Design Suite (IDE)

Xilinx FPGA Tools: Home
Aldec Active-HDL Design Flow
  simulation: Aldec Active-HDL Student Edition (IDE)
  synthesis: Xilinx XST (restricted)
  implementation: Xilinx ISE WebPACK (restricted)
Xilinx ISE Design Flow
  simulation: Mentor Graphics ModelSim PE Student Edition
  synthesis: Xilinx XST (restricted)
  implementation: Xilinx ISE WebPACK (IDE) (restricted)

Altera FPGA Tools: ECE Labs
Altera Design Flow
  simulation: Mentor Graphics ModelSim-Altera
  synthesis & implementation: Altera Quartus II Subscription Edition

Altera FPGA Tools: Home
Altera Design Flow
  simulation: Mentor Graphics ModelSim-Altera Starter (restricted)
  synthesis & implementation: Altera Quartus II Web Edition (restricted)
Project
semester-long
related to the research project conducted by Cryptographic Engineering Research Group (CERG) at GMU
supporting NIST (National Institute of Standards and Technology) in the evaluation of candidates for a new cryptographic standard
Background
Crypto 101
Cryptography is Everywhere
Buying a book on-line
Withdrawing cash from an ATM
Teleconferencing over Intranets
Backing up files on remote server
Cryptographic Standards Before 1997
[Figure: timeline, 1970 to 2010. Secret-key block ciphers: DES – Data Encryption Standard (IBM & NSA) and Triple DES, with the years 1977 and 1999 marked. Hash functions (NSA): SHA, SHA-1 – Secure Hash Algorithm, and SHA-2, with the years 1993, 1995, 2003, and 2005 marked.]
Why a Contest for a Cryptographic Standard?
• Avoid back-door theories
• Speed up the acceptance of the standard
• Stimulate non-classified research on methods of designing a specific cryptographic transformation
• Focus the effort of a relatively small cryptographic community
Cryptographic Standard Contests
[Timeline, 1996 to 2013]
AES (IX.1997 – X.2000): 15 block ciphers, 1 winner
NESSIE (I.2000 – XII.2002)
CRYPTREC
eSTREAM (XI.2004 – V.2008): 34 stream ciphers, 4 HW winners + 4 SW winners
SHA-3 (X.2007 – XII.2012): 51 hash functions, 1 winner
Cryptographic Contests - Evaluation Criteria
• Security
• Software Efficiency: μProcessors, μControllers
• Hardware Efficiency: FPGAs, ASICs
• Simplicity
• Flexibility
• Licensing
Specific Challenges of Evaluations in Cryptographic Contests
• Very wide range of possible applications, and as a result performance and cost targets
  throughput: single Mbits/s to hundreds of Gbits/s
  cost: single cents to thousands of dollars
• Winner in use for the next 20-30 years, implemented using technologies not in existence today
• Large number of candidates
• Limited time for evaluation
• Only one winner and the results are final
Mitigating Circumstances
• Security is a primary criterion
• Performance of competing algorithms tends to vary significantly (sometimes by as much as 500 times)
• Only relatively large differences in performance matter (typically at least 20%)
• Multiple groups independently implement the same algorithms (catching mistakes, comparing best results, etc.)
• Second best may be good enough
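The 20% rule of thumb can be written as a simple comparison. In this sketch the function name, the choice of the slower result as the reference point, and the sample numbers are assumptions; the slide only states that differences below roughly 20% are usually not decisive.

```python
def meaningful_speedup(throughput_a, throughput_b, threshold=0.20):
    """Return True if the faster design beats the slower one by at least `threshold`."""
    slow, fast = sorted((throughput_a, throughput_b))
    return (fast - slow) / slow >= threshold

# Hypothetical throughputs in Mbit/s.
print(meaningful_speedup(100.0, 125.0))  # 25% faster
print(meaningful_speedup(100.0, 110.0))  # 10% faster
```

Under this convention a 10% gap between two candidates would be treated as a tie, and other criteria such as security or simplicity would decide.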
AES Contest
1997-2000
Rules of the Contest
Each team submits:
• Detailed cipher specification
• Justification of design decisions
• Tentative results of cryptanalysis
• Source code in C
• Source code in Java
• Test vectors
AES: Candidate Algorithms
USA: Mars, RC6, Twofish, Safer+, HPC
Canada: CAST-256, Deal
Costa Rica: Frog
Australia: LOKI97
Japan: E2
Korea: Crypton
Belgium: Rijndael
France: DFC
Germany: Magenta
Israel, UK, Norway: Serpent
AES Contest Timeline
June 1998: 15 candidates: CAST-256, Crypton, Deal, DFC, E2, Frog, HPC, LOKI97, Magenta, Mars, RC6, Rijndael, Safer+, Serpent, Twofish
Round 1 (criteria: security, software efficiency)
August 1999: 5 final candidates: Mars, RC6, Twofish (USA); Rijndael, Serpent (Europe)
Round 2 (criteria: security, software efficiency, hardware efficiency)
October 2000: 1 winner: Rijndael (Belgium)
NIST Report: Security & Simplicity
[Figure: MARS, Rijndael, Serpent, Twofish, and RC6 placed on two axes: Security (Adequate to High) and Simplicity (Complex to Simple)]
Efficiency in software: NIST-specified platform
[Figure: bar chart of throughput [Mbits/s], axis from 0 to 30, for Serpent, Rijndael, Twofish, RC6, and Mars with 128-bit, 192-bit, and 256-bit keys; 200 MHz Pentium Pro, Borland C++]
NIST Report: Software Efficiency
Encryption and Decryption Speed
          32-bit processors         64-bit processors     DSPs
high      RC6                       Rijndael, Twofish     Rijndael, Twofish
medium    Rijndael, Mars, Twofish   Mars, RC6             Mars, RC6
low       Serpent                   Serpent               Serpent
Efficiency in FPGAs: Speed
[Figure: bar chart of throughput [Mbit/s], axis from 0 to 500, on Xilinx Virtex XCV-1000 for Serpent x8, Rijndael, Twofish, RC6, Mars, and Serpent x1, as reported by Worcester Polytechnic Institute, University of Southern California, and George Mason University. Bar labels in extraction order: 431, 444, 414, 353, 294, 177, 173, 104, 149, 62, 143, 112, 88, 102, 61.]
Efficiency in ASICs: Speed
[Figure: bar chart of throughput [Mbit/s], axis from 0 to 700, MOSIS 0.5 μm, NSA Group, for Rijndael, Twofish, RC6, Mars, and Serpent x1. Two series: 3-in-1 (128, 192, 256 bit) key scheduling and 128-bit key scheduling. Bar labels in extraction order: 606, 202, 105, 103, 57 and 443, 202, 105, 104, 57.]
Lessons Learned
Results for ASICs matched the results for FPGAs very well, and both were very different from the software results.
Serpent: fastest in hardware, slowest in software.
[Figure: FPGA results (GMU+USC, Xilinx Virtex XCV-1000) next to ASIC results (NSA Team, ASIC, 0.5 μm MOSIS), with bars labeled x8 and x1]

Lessons Learned
Hardware results matter!
Final round of the AES Contest, 2000
[Figure: speed in FPGAs compared with votes at the AES 3 conference]
Limitations of the AES Evaluation
GMU results:
• Optimization for maximum throughput
• Single high-speed architecture per candidate
• No use of embedded resources of FPGAs (Block RAMs, dedicated multipliers)
• Single FPGA family from a single vendor: Xilinx Virtex
FPGA Evaluations
                               AES          eSTREAM                  SHA-3
Multiple FPGA families         No           No                       Yes
Multiple architectures         No           Yes                      Yes
Use of embedded resources      No           No                       Yes
Primary optimization target    Throughput   Area, Throughput/Area    Throughput/Area
Experimental results           No           No                       Yes
Availability of source codes   No           No                       Yes
Specialized tools              No           No                       Yes
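The Throughput/Area target used in the SHA-3 evaluations can be sketched as a ranking. The design names and numbers below are hypothetical, and expressing area in a single unit (called slices here) is an assumption; real FPGA results also involve Block RAMs and DSP units, which this sketch ignores.

```python
def rank_by_throughput_per_area(designs):
    """Sort (name, throughput_mbps, area_slices) tuples by throughput/area, best first."""
    return sorted(designs, key=lambda d: d[1] / d[2], reverse=True)

# Hypothetical designs: (name, throughput in Mbit/s, area in slices).
designs = [
    ("design_A", 2000.0, 1000),   # 2.0 Mbit/s per slice
    ("design_B", 3000.0, 2000),   # 1.5 Mbit/s per slice
    ("design_C", 500.0, 200),     # 2.5 Mbit/s per slice
]
ranking = [name for name, _, _ in rank_by_throughput_per_area(designs)]
print(ranking)
```

Note that design_B has the highest raw throughput but ranks last on the ratio, which is why the choice of the primary optimization target changes the outcome of an evaluation.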
ASIC Evaluations
                               AES          eSTREAM                SHA-3
Multiple processes/libraries   No           No                     Yes
Multiple architectures         No           Yes                    Yes
Primary optimization target    Throughput   Power x Area x Time    Throughput/Area
Post-layout results            No           Yes                    Yes
Experimental results           No           Yes                    Yes
Availability of source codes   No           No                     Yes
Specialized tools              No           No                     No
Benchmarking Tools
Tools for Benchmarking Implementations of Cryptography
Software: eBACS, D. Bernstein (UIC), T. Lange (TUE), 2006-present
FPGAs: ATHENa, K. Gaj, J. Kaps, et al. (GMU), 2009-present
ASICs: ?
Benchmarking in Software: eBACS
eBACS: ECRYPT Benchmarking of Cryptographic Systems:
• measurements on multiple machines (currently over 90)
• each implementation is recompiled multiple times (currently over 1600 times) with various compiler options
• time measured in clock cycles/byte for multiple input/output sizes
• median, lower quartile (25th percentile), and upper quartile (75th percentile) reported
• standardized function arguments (common API)
SUPERCOP - toolkit developed by D. Bernstein and T. Lange for measuring performance of cryptographic software
http://bench.cr.yp.to/
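The reported statistics can be illustrated with a short sketch. This is not SUPERCOP code: the function name, the sample cycle counts, and the quartile interpolation method (Python's "inclusive" method) are assumptions made for this example.

```python
import statistics

def cycles_per_byte_summary(cycle_counts, message_bytes):
    """Return (lower quartile, median, upper quartile) in cycles/byte."""
    cpb = [cycles / message_bytes for cycles in cycle_counts]
    q1, median, q3 = statistics.quantiles(cpb, n=4, method="inclusive")
    return q1, median, q3

# Hypothetical cycle counts from five runs on a 1000-byte message.
runs = [10000, 12000, 14000, 16000, 18000]
print(cycles_per_byte_summary(runs, 1000))
```

Reporting the median and quartiles rather than the mean keeps a few runs disturbed by interrupts or cache effects from dominating the result.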
SUPERCOP Extension for Microcontrollers – XBX: 2009-present
Developers:
Christian Wenzel-Benner, ITK Engineering AG, Germany
Jens Gräf, LiNetCo GmbH, Heiger, Germany
Allows on-board timing measurements
Supports at least the following microcontrollers:
8-bit: Atmel ATmega1284P (AVR)
32-bit: TI AR7 (MIPS), Atmel AT91RM9200 (ARM 920T), Intel XScale IXP420 (ARM v5TE), Cortex-M3 (ARM)
Benchmarking in FPGAs: ATHENa
ATHENa – Automated Tool for Hardware EvaluatioN
Open-source benchmarking environment, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms.
The most recent version, 0.6.2, was released in June 2011. Full features in ATHENa 1.0, to be released in 2012.
http://cryptography.gmu.edu/athena
Why Athena?
"The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest."
from "Athena, Greek Goddess of Wisdom and Craftsmanship"
Basic Dataflow of ATHENa
[Figure: dataflow between the Designer, the ATHENa Server, and the User, with steps numbered 0 to 8. Labels: Interfaces + Testbenches; HDL + scripts + configuration files; FPGA Synthesis and Implementation; Result Summary + Database Entries; Database Entries; HDL + FPGA Tools; Database query; Ranking of designs; Download scripts and configuration files.]
Three Components of the ATHENa Environment
• ATHENa Tool
• ATHENa Database of Results
• ATHENa Website
ATHENa – Database of Results
ATHENa Database
http://cryptography.gmu.edu/athenadb
ATHENa Database – Result View
• Algorithm parameters
• Design parameters: optimization target, architecture type, datapath width, I/O bus widths, availability of source code
• Platform: vendor, family, device
• Timing: maximum clock frequency, maximum throughput
• Resource utilization: logic blocks (Slices/LEs/ALUTs), multipliers/DSP units
• Tools: names & versions, detailed options
• Credits: designers & contact information
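One way to picture a single database entry is as a record with one field per item in the result view. The class below is a hypothetical sketch, not the actual ATHENa database schema; all field names, units, and sample values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ResultRecord:
    """Hypothetical record mirroring the fields listed in the result view."""
    algorithm: str
    optimization_target: str
    vendor: str
    family: str
    device: str
    max_clock_mhz: float
    max_throughput_mbps: float
    logic_blocks: int              # Slices, LEs, or ALUTs, depending on vendor
    multipliers_dsp: int = 0
    tools: str = ""
    designers: str = ""
    source_available: bool = False

# Made-up entry for illustration only.
entry = ResultRecord(
    algorithm="example_hash",
    optimization_target="Throughput/Area",
    vendor="Xilinx",
    family="Virtex",
    device="example_device",
    max_clock_mhz=200.0,
    max_throughput_mbps=1600.0,
    logic_blocks=800,
)
print(entry.max_throughput_mbps / entry.logic_blocks)
```

Keeping platform, timing, and resource fields in one record makes derived metrics such as throughput per logic block a one-line computation.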
ATHENa Database – Compare Feature
Matching fields in grey
Non-matching fields in red and blue
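The idea behind the compare view can be sketched as a field-by-field comparison of two results. This is an illustration only, not the database's implementation; the function name, the dictionary representation, and the sample entries are assumptions.

```python
def compare_results(a, b):
    """Split the fields shared by two result dicts into matching and non-matching."""
    shared = sorted(a.keys() & b.keys())
    matching = [key for key in shared if a[key] == b[key]]
    differing = [key for key in shared if a[key] != b[key]]
    return matching, differing

# Hypothetical results for the same algorithm on two devices.
result_1 = {"algorithm": "example_hash", "vendor": "Xilinx", "max_clock_mhz": 200.0}
result_2 = {"algorithm": "example_hash", "vendor": "Altera", "max_clock_mhz": 180.0}
matching, differing = compare_results(result_1, result_2)
print(matching, differing)
```

In the web view, the first list would correspond to the fields shown in grey and the second to the fields highlighted in red and blue.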
ATHENa – Website
ATHENa Website
http://cryptography.gmu.edu/athena/
• Download of ATHENa Tool
• Links to related tools
SHA-3 Competition in FPGAs & ASICs
• Specifications of candidates
• Interface proposals
• RTL source codes
• Testbenches
• ATHENa database of results
• Related papers & presentations
ATHENa Result Replication Files
• Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations)
• Automatically created by ATHENa for all results generated using ATHENa
• Stored in the ATHENa Database
In the same spirit of Reproducible Research as:
• Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May 2009. http://rr.epfl.ch/17/
• J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992. http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92
. . . . .
Benchmarking Goals Facilitated by ATHENa
Comparing multiple:
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of the quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 12.3)
Your Project: Implementation and Benchmarking of Authenticated Ciphers
Features of Authenticated Ciphers
1. Confidentiality
2. Message integrity
3. Message authentication
[Figures: each feature illustrated with three parties, Bob, Alice, and Charlie]
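The three features can be seen together in a toy software sketch. This is not one of the authenticated ciphers studied in the project and must not be used to protect real data: it is an assumed encrypt-then-MAC construction that uses a SHA-256-based keystream for confidentiality and HMAC-SHA-256 for integrity and authentication. All function names, keys, and messages are made up for illustration.

```python
import hashlib
import hmac

def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def seal(enc_key, mac_key, nonce, plaintext):
    """Encrypt (confidentiality), then tag the nonce and ciphertext."""
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def unseal(enc_key, mac_key, nonce, ciphertext, tag):
    """Return the plaintext, or None if the tag does not verify."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None                      # modified or forged message
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

ciphertext, tag = seal(b"enc-key", b"mac-key", b"nonce-1", b"hello Bob")
tampered = bytes([ciphertext[0] ^ 1]) + ciphertext[1:]
print(unseal(b"enc-key", b"mac-key", b"nonce-1", ciphertext, tag))
print(unseal(b"enc-key", b"mac-key", b"nonce-1", tampered, tag))
```

A third party without the encryption key cannot read the message, and any change to the ciphertext, or a tag computed under a different key, makes verification fail.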
All Projects - Organization
• Projects divided into phases
• Deliverables for each phase submitted through Blackboard at selected checkpoints and evaluated by the instructor and/or TA
• Feedback provided to students on a best effort basis
• Final report and codes submitted using Blackboard at the end of the semester
Honor Code Rules
• All students are expected to write and debug their codes individually
• Students are encouraged to help and support each other in all problems related to the
- operation of the CAD tools
- understanding of an investigated algorithm and existing implementations
- understanding of the project tasks
Additional Skills Learned in the Project
• Reading & understanding specification of a complex algorithm
• Design of new hardware architectures based on existing architectures (datapath & controller)
• Reading, understanding, and modifying existing VHDL code
• Using embedded resources of modern FPGAs
• Characterizing performance of your codes for multiple FPGA families