
ECE 545 – Introduction to VHDL

ECE 645 – Project 2: Project Options


Project 2 Overview

• Project 2 will involve the FPGA implementation of a complex digital arithmetic function

• The project will have an application in either cryptography or signal processing

• Due to the scope of the project, students should be in groups of 3

• The specification and scope of each project will be settled through an interactive process between the group and the instructor


Project Options

Each group will implement one of the following projects on an FPGA:

Cryptography related
1. Trial division sieve
2. Elliptic curve method of factoring
3. RSA encryption & decryption with Montgomery multipliers based on carry save adders

Signal processing related
4. Iterative and pipeline CORDIC (coordinate rotation digital computer) processors
5. Finite impulse response filter architectures for FPGA implementations
6. Direct digital frequency synthesis

Cryptography Projects Background

ECE 645 – Computer Arithmetic


RSA Public Key Cryptosystem

Encryption with the PUBLIC KEY: $C = f(M) = M^e \bmod N$

Decryption with the PRIVATE KEY: $M = f^{-1}(C) = C^d \bmod N$

$N = P \cdot Q$, where $P$, $Q$ are large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$


RSA Keys

PUBLIC KEY: $\{e, N\}$

PRIVATE KEY: $\{d, P, Q\}$

$N = P \cdot Q$, where $P$, $Q$ are large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

Factoring 1024-bit RSA keys using Number Field Sieve (NFS)

[Figure: steps of the NFS: Polynomial Selection; Relation Collection, consisting of Sieving and Cofactoring; Linear Algebra; Square Root. Labels attached to the figure: 200 bit & 350 bit numbers; Trial division; ECM method.]

Topic 1: Trial Division Sieve



Topic 1: Trial Division Sieve (1)

Given:

Inputs:

Variables:
1. Integers $N_1, N_2, N_3, \ldots$, each of size $k$ bits

Constants:
2. Factor base = set of all primes not exceeding a certain bound $B$ = $\{p_1 = 2, p_2 = 3, p_3 = 5, \ldots, p_t \le B\}$

Parameters of interest: $4 \le k \le 512$, $3 \le B \le 10^5$

Topic 1: Trial Division Sieve (2)

Required:

Outputs:
For each integer $N_i$: a list of the primes from the factor base that divide $N_i$, and the number of times each prime divides $N_i$.

For example, if $N_i = p_1^{e_1} \cdot p_2^{e_2} \cdot p_3^{e_3} \cdot M_i$, where $M_i$ is not divisible by any prime belonging to the factor base, then the output is $\{p_1, e_1\}, \{p_2, e_2\}, \{p_3, e_3\}$.

Topic 1: Trial Division Sieve (3)

Example:

Constants: $k = 10$, $B = 5$, factor base = $\{2, 3, 5\}$

Variables:
$N_1 = 408 = 2^3 \cdot 3 \cdot 17$
$N_2 = 630 = 2 \cdot 3^2 \cdot 5 \cdot 7$

Outputs:
For $N_1$: {2, 3}, {3, 1}
For $N_2$: {2, 1}, {3, 2}, {5, 1}
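The input/output behavior specified for Topic 1 can be modeled in software before any hardware is designed. The following is a minimal Python sketch, not a reference design: the function names `primes_up_to` and `trial_division` are made up for illustration, and returning the cofactor $M_i$ alongside the factor list is an added convenience that the specification does not require.

```python
def primes_up_to(bound):
    """Return all primes <= bound (the factor base) via the sieve of Eratosthenes."""
    is_prime = [True] * (bound + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(bound ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, bound + 1) if is_prime[p]]


def trial_division(n, factor_base):
    """Divide n (assumed >= 1) by every prime of the factor base.

    Returns ([(p, e), ...], cofactor), where e is the number of times p
    divides n and the cofactor is what remains (M_i in the slides).
    """
    factors = []
    for p in factor_base:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e > 0:
            factors.append((p, e))
    return factors, n


if __name__ == "__main__":
    factor_base = primes_up_to(5)          # B = 5 gives {2, 3, 5}
    for n in (408, 630):
        print(n, trial_division(n, factor_base))
```

For the slide's example, 408 yields the pairs (2, 3), (3, 1) with cofactor 17, and 630 yields (2, 1), (3, 2), (5, 1) with cofactor 7.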

Topic 2: Elliptic Curve Method of Factoring


Elliptic Curves

$Y^2 = X^3 + X + 1 \bmod p$ $(p = 23)$

[Figure: points fulfilling the equation of the curve, plotted with X from 0 to 20 and Y from 0 to 25]

Special point $\mathcal{O}$ (point at infinity) such that: $P + \mathcal{O} = \mathcal{O} + P = P$

All points of the curve: $P, 2P, 3P, \ldots, nP = \mathcal{O}, (n+1)P = P, \ldots$

Addition: $P = (6, 19)$, $Q = (7, 12)$, $R = P + Q = (13, 7)$

Doubling: $P = (3, 13)$, $2P = P + P = (7, 11)$
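The addition and doubling examples on this slide can be checked with a few lines of software. The sketch below assumes the curve $Y^2 = X^3 + X + 1$ over the integers mod 23 and uses the textbook affine chord-and-tangent formulas; the names `add`, `on_curve`, and `INFINITY` are illustrative only, and an ECM hardware design would normally use projective coordinates rather than the modular inversions shown here.

```python
P_MOD = 23       # field prime p
A, B = 1, 1      # curve coefficients in Y^2 = X^3 + A*X + B
INFINITY = None  # the point at infinity


def on_curve(pt):
    """Check that a point satisfies the curve equation mod P_MOD."""
    if pt is INFINITY:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0


def add(p, q):
    """Add two curve points (also handles doubling when p == q)."""
    if p is INFINITY:
        return q
    if q is INFINITY:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INFINITY                      # P + (-P) = point at infinity
    if p == q:                               # tangent slope for doubling
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:                                    # chord slope for addition
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


if __name__ == "__main__":
    print(add((6, 19), (7, 12)))   # addition example from the slide
    print(add((3, 13), (3, 13)))   # doubling example from the slide
```

Both results agree with the slide: (6, 19) + (7, 12) = (13, 7) and 2 · (3, 13) = (7, 11). The three-argument `pow` with exponent -1 requires Python 3.8 or later.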

ECM Algorithm

Inputs:
N – number to be factored
E – elliptic curve
P0 – point of the curve E: initial point
B1 – smoothness bound for Phase 1
B2 – smoothness bound for Phase 2

Outputs:
q – factor of N, 1 < q ≤ N, or FAIL

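ECM Algorithm Phase 1

1: $k \leftarrow \prod_{i} p_i^{e_i}$ such that $p_i$ are consecutive primes $\le B_1$ and $e_i$ is the largest exponent such that $p_i^{e_i} \le B_1$
2: $Q_0 \leftarrow k P_0 = (x_{Q_0} : : z_{Q_0})$
3: $q \leftarrow \gcd(z_{Q_0}, N)$
4: if $q > 1$
5: return $q$ (factor of $N$)
6: else
7: go to Phase 2
8: end if

The slide groups the steps into precomputations (step 1), main computations (step 2), and postcomputations (steps 3 to 8).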
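The precomputation in step 1 of Phase 1 is simple to model in software. The sketch below computes the scalar $k$ as the product of the largest prime powers not exceeding $B_1$; the function names `primes_up_to` and `phase1_scalar` are hypothetical, and the sketch covers only step 1, not the scalar multiplication or the gcd.

```python
def primes_up_to(bound):
    """Return all primes <= bound via the sieve of Eratosthenes."""
    is_prime = [True] * (bound + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(bound ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, bound + 1) if is_prime[p]]


def phase1_scalar(b1):
    """Compute k = product of p_i^e_i over primes p_i <= b1,
    where e_i is the largest exponent with p_i^e_i <= b1."""
    k = 1
    for p in primes_up_to(b1):
        prime_power = p
        while prime_power * p <= b1:   # raise the exponent while it still fits
            prime_power *= p
        k *= prime_power
    return k


if __name__ == "__main__":
    # For B1 = 10 the prime powers are 8, 9, 5, 7.
    print(phase1_scalar(10))
```

For $B_1 = 10$ this gives $k = 2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$. Because $k$ depends only on $B_1$, it can be computed once and reused for every curve and every $N$.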

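ECM Algorithm Phase 2

9: $d \leftarrow 1$
10: for each prime $p = B_1$ to $B_2$ do
11: $(x_{pQ_0}, y_{pQ_0}, z_{pQ_0}) \leftarrow p Q_0$
12: $d \leftarrow d \cdot z_{pQ_0} \pmod N$
13: end for
14: $q \leftarrow \gcd(d, N)$
15: if $q > 1$ then
16: return $q$
17: else
18: return FAIL
19: end if

The slide groups the steps into main computations (steps 9 to 13) and postcomputations (steps 14 to 19).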

Hierarchy of Elliptic Curve Operations

[Figure: hierarchy of operations, from top to bottom]

ECM

Top level: scalar multiplication, $k \cdot P$

Medium level (elliptic curve point operations): point addition $P + Q$, point doubling $2P$

Low level (modular arithmetic, ring operations): modular multiplication $x \cdot y \bmod N$, modular addition $x + y \bmod N$, modular subtraction $x - y \bmod N$

Hardware labels in the figure: functional units, control unit, host computer

Topic 3: RSA Encryption & Decryption with Montgomery Multipliers based on Carry Save Adders


RSA as a Trap-Door One-Way Function

Encryption with the PUBLIC KEY: $C = f(M) = M^e \bmod N$

Decryption with the PRIVATE KEY: $M = f^{-1}(C) = C^d \bmod N$

$N = P \cdot Q$, where $P$, $Q$ are large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

Exponentiation: $Y = X^E \bmod N$

$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$

Right-to-left binary exponentiation

    Y = 1;
    S = X;
    for i = 0 to L-1 {
        if (e_i == 1) Y = Y · S mod N;
        S = S^2 mod N;
    }

Left-to-right binary exponentiation

    Y = 1;
    for i = L-1 downto 0 {
        Y = Y^2 mod N;
        if (e_i == 1) Y = Y · X mod N;
    }
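Both exponentiation loops translate directly into a software model that can serve as a reference for a hardware testbench. The Python sketch below follows the two pseudocode loops step by step; the function names are chosen for illustration, and taking $L$ as the bit length of the exponent is an assumption (a hardware design would usually fix $L$ at the key size).

```python
def exp_right_to_left(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_0 up to e_{L-1}."""
    y = 1 % n
    s = x % n
    for i in range(e.bit_length()):
        if (e >> i) & 1:
            y = (y * s) % n
        s = (s * s) % n        # S always holds X^(2^i) mod N
    return y


def exp_left_to_right(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_{L-1} down to e_0."""
    y = 1 % n
    for i in range(e.bit_length() - 1, -1, -1):
        y = (y * y) % n
        if (e >> i) & 1:
            y = (y * x) % n
    return y


if __name__ == "__main__":
    x, e, n = 65, 17, 3233
    print(exp_right_to_left(x, e, n), exp_left_to_right(x, e, n), pow(x, e, n))
```

In the right-to-left version the squaring and the conditional multiplication within one iteration are independent of each other, so two multipliers can run in parallel; the left-to-right version needs only one running value but performs its two operations sequentially.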

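Montgomery Modular Multiplication

$C = A \cdot B \bmod M$, where $A$, $B$, $M$ are $k$-bit numbers

Integer domain → Montgomery domain

$A \rightarrow A' = A \cdot 2^k \bmod M$

$B \rightarrow B' = B \cdot 2^k \bmod M$

$C' = MP(A', B', M) = A' \cdot B' \cdot 2^{-k} \bmod M = (A \cdot 2^k) \cdot (B \cdot 2^k) \cdot 2^{-k} \bmod M = A \cdot B \cdot 2^k \bmod M$

$C = A \cdot B \bmod M \rightarrow C' = C \cdot 2^k \bmod M$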

Montgomery Modular Multiplication

Conversion $A \rightarrow A'$: $A' = MP(A, 2^{2k} \bmod M, M)$

Conversion $C' \rightarrow C$: $C = MP(C', 1, M)$
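The Montgomery product and the two domain conversions can be modeled with a bit-serial (radix-2) loop. The sketch below is one common formulation, assuming an odd modulus $M < 2^k$ and operands smaller than $M$; it uses ordinary Python integers, so it does not model the carry-save adders the project asks for, and the function names are illustrative.

```python
def mp(a, b, m, k):
    """Montgomery product MP(a, b, m) = a * b * 2^(-k) mod m.

    Assumes m is odd, m < 2^k, and 0 <= a, b < m.
    """
    s = 0
    for i in range(k):
        s += ((a >> i) & 1) * b   # add b when bit i of a is set
        if s & 1:                 # adding the odd modulus makes s even
            s += m
        s >>= 1                   # exact division by 2
    return s - m if s >= m else s  # s < 2m here, so one subtraction suffices


def to_montgomery(a, m, k):
    """A' = MP(A, 2^(2k) mod M, M) = A * 2^k mod M."""
    return mp(a, (1 << (2 * k)) % m, m, k)


def from_montgomery(c_prime, m, k):
    """C = MP(C', 1, M) = C' * 2^(-k) mod M."""
    return mp(c_prime, 1, m, k)


def modmul(a, b, m, k):
    """C = A * B mod M computed through the Montgomery domain."""
    a_prime = to_montgomery(a, m, k)
    b_prime = to_montgomery(b, m, k)
    return from_montgomery(mp(a_prime, b_prime, m, k), m, k)


if __name__ == "__main__":
    m, k = 239, 8
    print(modmul(100, 200, m, k), (100 * 200) % m)
```

A single multiplication does not repay the conversion cost. The benefit appears in exponentiation, where the operands are converted once, all squarings and multiplications stay in the Montgomery domain, and only the final result is converted back.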

Fast Modular Exponentiation using Chinese Remainder Theorem

$M = C^d \bmod N$

$M_P = C_P^{d_P} \bmod P$, where $C_P = C \bmod P$ and $d_P = d \bmod (P-1)$

$M_Q = C_Q^{d_Q} \bmod Q$, where $C_Q = C \bmod Q$ and $d_Q = d \bmod (Q-1)$

$M = M_P \cdot R_Q + M_Q \cdot R_P \bmod N$

where

$R_P = (P^{-1} \bmod Q) \cdot P = P^{Q-1} \bmod N$

$R_Q = (Q^{-1} \bmod P) \cdot Q = Q^{P-1} \bmod N$
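The recombination formula on this slide can be checked numerically with small primes. The following Python sketch applies the slide's equations as written; the function name `rsa_decrypt_crt` and the toy key (P = 61, Q = 53, e = 17) are illustrative assumptions, and such small primes offer no security.

```python
def rsa_decrypt_crt(c, d, p, q):
    """Compute M = C^d mod N (N = P*Q) using two half-size exponentiations."""
    n = p * q
    d_p, d_q = d % (p - 1), d % (q - 1)
    m_p = pow(c % p, d_p, p)        # M_P = C_P^dP mod P
    m_q = pow(c % q, d_q, q)        # M_Q = C_Q^dQ mod Q
    r_p = pow(p, -1, q) * p         # R_P: 1 mod Q and 0 mod P
    r_q = pow(q, -1, p) * q         # R_Q: 1 mod P and 0 mod Q
    return (m_p * r_q + m_q * r_p) % n


if __name__ == "__main__":
    p, q, e = 61, 53, 17
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # private exponent
    message = 65
    c = pow(message, e, n)
    print(rsa_decrypt_crt(c, d, p, q), pow(c, d, n))
```

Each of the two exponentiations works with operands and exponents of about half the bit length of $N$, which is where the speedup comes from. The constants $R_P$ and $R_Q$ depend only on the key and can be precomputed.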

Topic 4: Iterative and Pipeline CORDIC (Coordinate Rotation Digital Computer) Processors


Rotations and Pseudo-Rotations in CORDIC

Key ideas in CORDIC

COordinate Rotation DIgital Computer used this method in 1950s; modern electronic calculators also use it

If we have a computationally efficient way of rotating a vector, we can evaluate cos, sin, and $\tan^{-1}$ functions

Rotation by an arbitrary angle is difficult, so we:

Perform pseudorotations that require simpler operations

Use special angles to synthesize the desired angle $z$: $z = \alpha^{(1)} + \alpha^{(2)} + \ldots + \alpha^{(m)}$

[Figure, left: start at (1, 0), rotate by $z$, get $(\cos z, \sin z)$]

[Figure, right: start at (1, y), rotate until $y = 0$; the rotation amount is $\tan^{-1} y$]

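Rotating a Vector by an Angle

[Fig. 22.1 A pseudorotation step in CORDIC: the vector $OE^{(i)}$ of length $R^{(i)}$, with coordinates $(x^{(i)}, y^{(i)})$, is turned by the angle $\alpha^{(i)}$; a rotation and a pseudorotation give different end points $E^{(i+1)}$ with coordinates $(x^{(i+1)}, y^{(i+1)})$]

$x^{(i+1)} = x^{(i)} \cos \alpha^{(i)} - y^{(i)} \sin \alpha^{(i)} = (x^{(i)} - y^{(i)} \tan \alpha^{(i)}) / (1 + \tan^2 \alpha^{(i)})^{1/2}$

$y^{(i+1)} = y^{(i)} \cos \alpha^{(i)} + x^{(i)} \sin \alpha^{(i)} = (y^{(i)} + x^{(i)} \tan \alpha^{(i)}) / (1 + \tan^2 \alpha^{(i)})^{1/2}$

$z^{(i+1)} = z^{(i)} - \alpha^{(i)}$

Recall that $\cos \theta = 1 / (1 + \tan^2 \theta)^{1/2}$

Our strategy: Eliminate the terms $(1 + \tan^2 \alpha^{(i)})^{1/2}$ and choose the angles $\alpha^{(i)}$ so that $\tan \alpha^{(i)}$ is a power of 2; need two shift-adds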

Pseudorotating a Vector by an Angle

[Fig. 22.1 A pseudorotation step in CORDIC]

Pseudorotation: Whereas a real rotation does not change the length $R^{(i)}$ of the vector, a pseudorotation step increases its length to:

$R^{(i+1)} = R^{(i)} / \cos \alpha^{(i)} = R^{(i)} (1 + \tan^2 \alpha^{(i)})^{1/2}$

$x^{(i+1)} = x^{(i)} - y^{(i)} \tan \alpha^{(i)}$

$y^{(i+1)} = y^{(i)} + x^{(i)} \tan \alpha^{(i)}$

$z^{(i+1)} = z^{(i)} - \alpha^{(i)}$

Basic CORDIC Iterations

CORDIC iteration: In step $i$, we pseudorotate by an angle whose tangent is $d_i 2^{-i}$ (the angle $e^{(i)}$ is fixed, only direction $d_i$ is to be picked)

$x^{(i+1)} = x^{(i)} - d_i y^{(i)} 2^{-i}$

$y^{(i+1)} = y^{(i)} + d_i x^{(i)} 2^{-i}$

$z^{(i+1)} = z^{(i)} - d_i \tan^{-1} 2^{-i} = z^{(i)} - d_i e^{(i)}$

Table 22.1 Value of the function $e^{(i)} = \tan^{-1} 2^{-i}$, in degrees and radians, for $0 \le i \le 9$

    ––––––––––––––––––––––––––––––––––––––––––––
    i    e(i) in degrees    e(i) in radians
         (approximate)      (precise)
    ––––––––––––––––––––––––––––––––––––––––––––
    0    45.0               0.785 398 163
    1    26.6               0.463 647 609
    2    14.0               0.244 978 663
    3     7.1               0.124 354 994
    4     3.6               0.062 418 810
    5     1.8               0.031 239 833
    6     0.9               0.015 623 728
    7     0.4               0.007 812 341
    8     0.2               0.003 906 230
    9     0.1               0.001 953 123
    ––––––––––––––––––––––––––––––––––––––––––––

Example: 30° angle

$30.0 \approx 45.0 - 26.6 + 14.0 - 7.1 + 3.6 + 1.8 - 0.9 + 0.4 - 0.2 + 0.1 = 30.1$
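The iteration above can be exercised in software to see how the direction choices drive $z$ toward zero. The Python sketch below implements the rotation mode with floating-point arithmetic; it is an illustration under stated assumptions, not a hardware model: a real design would use fixed-point shifts and a small table of $e^{(i)}$ values, the function name `cordic_rotate` is made up, and starting from $x = 1/K$ to cancel the accumulated length gain $K$ is a standard compensation that these slides do not spell out.

```python
import math


def cordic_rotate(z, iterations=24):
    """Approximate (cos z, sin z) for z in radians, roughly |z| <= 1.74."""
    # Fixed angles e(i) = atan(2^-i), as in Table 22.1.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each pseudorotation stretches the vector by (1 + 2^(-2i))^(1/2).
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y = 1.0 / gain, 0.0               # pre-scaled start vector
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # pick the direction that reduces |z|
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y


if __name__ == "__main__":
    c, s = cordic_rotate(math.radians(30.0))
    print(c, s)
    print(math.cos(math.radians(30.0)), math.sin(math.radians(30.0)))
```

An iterative hardware version reuses one add/shift stage for all values of $i$, while a pipelined version unrolls the loop into one stage per iteration, so each stage has a fixed shift amount and a fixed angle constant.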


Project Task

• Implement iterative and pipeline solutions to CORDIC in various modes

Topic 5: Finite Impulse Response Filter Architectures for FPGA Implementations


FIR Filters

• Digital filters are widely used in digital communications and audio/video processing.

• In particular, finite impulse response (FIR) filters are used for their ease of implementation and stability.

Example: Gigabit Ethernet

• Digital filters (boxed in blue in the slide's figure) play a crucial role in digital communication chips such as Ethernet transceivers, cable modems, DSL modems, satellite receivers, and mobile phones.

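Direct Form Filter

[Figure: direct form FIR filter. The input x(n) passes through a chain of $z^{-1}$ delay elements; the taps are multiplied by the coefficients $h_0, h_1, h_2, \ldots, h_{N-1}$ and summed to produce y(n)]

• An FIR filter implements a convolution in the time-domain

• Critical path of N-tap filter: N-1 adds + 1 multiply

• Arithmetic complexity of N-tap filter modeled as: N multiplications/sample + N-1 adds/sample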
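The direct form structure corresponds to the convolution sum $y(n) = \sum_{k=0}^{N-1} h_k \, x(n-k)$. The Python sketch below models it with an explicit delay line so that the mapping to the $z^{-1}$ registers is visible; the function name is illustrative, and samples before the start of the input are assumed to be zero.

```python
def fir_direct_form(x, h):
    """Direct form FIR filter: y(n) = sum over k of h[k] * x(n - k).

    x is the list of input samples and h the list of N tap coefficients.
    Samples before n = 0 are assumed to be zero.
    """
    delay_line = [0] * len(h)          # models x(n), x(n-1), ..., x(n-N+1)
    y = []
    for sample in x:
        delay_line = [sample] + delay_line[:-1]   # shift by one z^-1 stage
        acc = 0
        for coeff, tap in zip(h, delay_line):     # N multiplies, N-1 adds
            acc += coeff * tap
        y.append(acc)
    return y


if __name__ == "__main__":
    # A unit impulse at the input reproduces the coefficients at the output.
    print(fir_direct_form([1, 0, 0, 0], [1, 2, 3]))
```

Feeding a unit impulse returns the coefficient sequence followed by zeros, which is a convenient first check for any of the hardware architectures against this model.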

Project Task: FIR Architecture Explorations and Optimizations

• Transpose form
• Parallel subexpression sharing
• Canonic signed digit representations using carry-save addition
• Parallel, word-serial, bit-serial implementation
• Xilinx DSP multipliers and multiply-accumulate structures

Topic 6: Direct Digital Frequency Synthesis



Direct Digital Frequency Synthesis

• Direct digital frequency synthesis (DDFS) is used to generate sine and cosine functions for digital communication applications

• Used in many applications: cell phones, cable modems, satellite receivers, etc.

DDFS: Basic Understanding and Architecture

• Output of DDFS is a sine and cosine waveform
• k = frequency control word
• L = accumulator bit width
• $N = 2^L$ = number of slots in ROM
• D = number of output bits
• phi(n) = (nk) mod N
• 1/T = clock frequency
• $f_0 = 1/(NT)$ = lowest frequency output (i.e. resolution)
• $f_c = k f_0 = k/(NT)$ = desired frequency; output will be $\cos(2\pi f_c nT)$ and $\sin(2\pi f_c nT)$
• $f_{max}$ = greatest frequency achievable = $1/(2T)$ = ½ $f_{clk}$

[Figure: the frequency control word k feeds an accumulator (+) whose L-bit output phi(n) addresses N slots of ROM; the ROM produces the D-bit outputs cos(2π/N * phi(n)) and sin(2π/N * phi(n))]
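The ROM-based architecture can be modeled in a few lines: a phase accumulator that advances by k each clock and a lookup table indexed by phi(n). The Python sketch below follows the parameter names of this slide; the function name `ddfs` and the choice to store samples as signed integers scaled to $2^{D-1} - 1$ are assumptions made for illustration, since the slide does not specify the output number format.

```python
import math


def ddfs(k, L, D, num_samples):
    """ROM-based DDFS model returning a list of (cos, sin) integer samples.

    k: frequency control word, L: accumulator width in bits,
    D: output width in bits (assumed signed, full scale 2^(D-1) - 1).
    """
    N = 1 << L                               # number of ROM slots
    full_scale = (1 << (D - 1)) - 1
    cos_rom = [round(full_scale * math.cos(2 * math.pi * a / N)) for a in range(N)]
    sin_rom = [round(full_scale * math.sin(2 * math.pi * a / N)) for a in range(N)]

    phase = 0                                # phi(0) = 0
    samples = []
    for _ in range(num_samples):
        samples.append((cos_rom[phase], sin_rom[phase]))
        phase = (phase + k) % N              # phi(n+1) = (phi(n) + k) mod N
    return samples


if __name__ == "__main__":
    # L = 4 gives N = 16 slots; k = 4 steps a quarter period per clock.
    for cos_value, sin_value in ddfs(k=4, L=4, D=8, num_samples=5):
        print(cos_value, sin_value)
```

With k = 4 and N = 16 the output frequency is $f_c = k/(NT) = 1/(4T)$, so the samples repeat every four clocks. The ROM size grows as $2^L$, which is the main reason alternative architectures are of interest.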


DDFS: Example Output


Project task

• The ROM-based architecture is simplistic; superior architectures exist

• Investigate various DDFS architectures and implement them on an FPGA
