
LAB SESSION 1

Design of Elliptic Curve Cryptosystem

Debdeep Mukhopadhyay Chester Rebeiro

Dept. of Computer Science and Engineering

Indian Institute of Technology Kharagpur

INDIA

Parameters of the Design

Characteristic 2 field: GF(2^233)

Random Curve: y^2 + xy = x^3 + a.x^2 + b, where a = 1

/* Basepoint for the curve, taken from FIPS 186-2 */
Base-Point (X,Y):

◦ 233'h0fac9dfcbac8313bb2139f1bb755fef65bc391f8b36f8f8eb7371fd558b
◦ 233'h1006a08a41903350678e58528bebf8a0beff867a7ca36716f7e01f81052

/* The constant b for the curve, from FIPS 186-2 again */

◦ 233'h066647ede6c332c7f8c0923bb58213b333b20e9ce4281fe115f7d8f90ad

csrc.nist.gov/publications/fips/archive/fips186-2/fips186-2.pdf
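As a sanity check on these parameters, the following Python sketch (a software model, not part of the lab's Verilog) tests whether the base point satisfies the curve equation. The helper names gf_mul and on_curve are illustrative, and the reduction polynomial x^233 + x^74 + 1 is the one this design uses in its modulo block.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1

A = 1
B = 0x066647ede6c332c7f8c0923bb58213b333b20e9ce4281fe115f7d8f90ad
GX = 0x0fac9dfcbac8313bb2139f1bb755fef65bc391f8b36f8f8eb7371fd558b
GY = 0x1006a08a41903350678e58528bebf8a0beff867a7ca36716f7e01f81052


def gf_mul(a, b):
    """Multiply in GF(2^233): carry-less product, then reduce modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r.bit_length() > M:
        r ^= POLY << (r.bit_length() - 1 - M)
    return r


def on_curve(x, y):
    """Check y^2 + xy == x^3 + a.x^2 + b in GF(2^233)."""
    x2 = gf_mul(x, x)
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(x2, x) ^ gf_mul(A, x2) ^ B
    return lhs == rhs


print(on_curve(GX, GY))
```

The same gf_mul model can serve as a golden reference when simulating the hardware multiplier.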

Design Hierarchy

Elliptic Curve Hierarchy

Code Hierarchy

module ecsmul(clk, nrst, key, sx, sy, done);

regbank regs(clk, cwh, c0r, c1r, a0, a1, a2, a3);
ec_alu alu(cwl, a0, a1, a2, a3, c0a, c1a);
multiplier mul(minA, minB, mout);
module squarer(a, d);
module bquadblk(en, in, sel, out);

Module Multiplier

module multiplier(a, b, d);
input wire [232:0] a;
input wire [232:0] b;
output wire [232:0] d;
wire [464:0] mout;

ks233 ks(a, b, mout); /* Karatsuba Multiplier */
mod mod1(mout, d); /* Modulo Operation */

endmodule
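The two-stage structure of this module can be mirrored in a few lines of Python. This is only a behavioural sketch: clmul stands in for ks233 (it computes the same carry-less product, but by shift-and-XOR rather than Karatsuba), and mod_reduce stands in for mod, assuming the field polynomial x^233 + x^74 + 1 used by this design.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1


def clmul(a, b):
    """Carry-less (GF(2)[x]) product; plays the role of ks233."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def mod_reduce(c):
    """Reduce modulo POLY; plays the role of the mod block."""
    while c.bit_length() > M:
        c ^= POLY << (c.bit_length() - 1 - M)
    return c


def multiplier(a, b):
    mout = clmul(a, b)        # up to 465 bits wide, like wire [464:0] mout
    return mod_reduce(mout)   # back to 233 bits, like output d
```

For two full-width operands the intermediate product has 465 bits, which is why mout is declared as [464:0].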

Karatsuba Multiplier

The multiplier operates on 233-bit inputs and produces a 465-bit output.

The multiplier uses sub-multipliers, with operands as described in the figure.

The top-level multipliers use the simple (two-way) Karatsuba split. Once the operand size falls below a threshold of 16 bits, the sub-multipliers are realized as Generalized Karatsuba blocks.
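The recursion can be sketched in Python as follows. Two things here are assumptions rather than facts from the slides: operands are split so that the low half gets the extra bit (as ks233 does with its 117/116 split), and a plain shift-and-XOR product stands in for the Generalized Karatsuba blocks below the threshold. Under those assumptions the leaf sizes come out as 14 and 15 bits, which agrees with the ks14 and ks15 modules named in the lab code.

```python
THRESHOLD = 16
sizes = set()   # operand widths of every sub-multiplier visited


def schoolbook(a, b):
    """Shift-and-XOR carry-less product."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def karatsuba(a, b, n):
    """Carry-less product of two n-bit operands by simple Karatsuba."""
    sizes.add(n)
    if n < THRESHOLD:
        return schoolbook(a, b)   # stand-in for a Generalized Karatsuba block
    lo = (n + 1) // 2             # low half gets the extra bit (117 for n = 233)
    hi = n - lo
    mask = (1 << lo) - 1
    al, ah = a & mask, a >> lo
    bl, bh = b & mask, b >> lo
    m2 = karatsuba(al, bl, lo)             # low * low
    m1 = karatsuba(ah, bh, hi)             # high * high
    m3 = karatsuba(al ^ ah, bl ^ bh, lo)   # (low + high) * (low + high)
    return m2 ^ ((m1 ^ m2 ^ m3) << lo) ^ (m1 << (2 * lo))


# Two arbitrary operands (the base-point coordinates are reused here).
x = 0x0fac9dfcbac8313bb2139f1bb755fef65bc391f8b36f8f8eb7371fd558b
y = 0x1006a08a41903350678e58528bebf8a0beff867a7ca36716f7e01f81052
product = karatsuba(x, y, 233)
print(sorted(sizes))   # [14, 15, 29, 30, 58, 59, 116, 117, 233]
```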

Module ks233

module ks233(a, b, d);

input wire [232:0] a;

input wire [232:0] b;

output wire [464:0] d;

wire [230:0] m1;

wire [232:0] m2;

wire [232:0] m3;

wire [116:0] ahl;

wire [116:0] bhl;

ks117 ksm1(a[116:0], b[116:0], m2);

ks116 ksm2(a[232:117], b[232:117], m1);

assign ahl[115:0] = a[232:117] ^ a[115:0];

assign ahl[116] = a[116];

assign bhl[115:0] = b[232:117] ^ b[115:0];

assign bhl[116] = b[116];

ks117 ksm3(ahl, bhl, m3);

Combining the Partial Results

Since n = 233:

◦ d[0…116] = m2[0…116]
◦ d[117…232] = m2[117…232] ^ m2[0…115] ^ m1[0…115] ^ m3[0…115]
◦ d[233] = m2[116] ^ m1[116] ^ m3[116]
◦ d[234…347] = m2[117…230] ^ m1[117…230] ^ m3[117…230] ^ m1[0…113]
◦ d[348] = m2[231] ^ m3[231] ^ m1[114]
◦ d[349] = m2[232] ^ m3[232] ^ m1[115]
◦ d[350…464] = m1[116…230]

(m1 is the 116-bit by 116-bit product, so it is 231 bits wide and its top bit is m1[230].)
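The bit ranges above are the expansion of d = m2 + (m1 + m2 + m3).x^117 + m1.x^234 over GF(2). A small Python model of the top-level split, written as a sketch with a shift-and-XOR product standing in for the ks117 and ks116 sub-multipliers, makes it easy to check that recombination:

```python
def clmul(a, b):
    """Carry-less product; stand-in for the ks117 / ks116 sub-multipliers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def ks233(a, b):
    """Model of the ks233 split; returns (d, m1, m2, m3)."""
    mask117 = (1 << 117) - 1
    al, ah = a & mask117, a >> 117    # a[116:0], a[232:117]
    bl, bh = b & mask117, b >> 117    # b[116:0], b[232:117]
    m2 = clmul(al, bl)                # ksm1: low halves, 233 bits
    m1 = clmul(ah, bh)                # ksm2: high halves, 231 bits
    # ahl = al ^ ah: ah has only 116 bits, so bit 116 is a[116] unchanged
    m3 = clmul(al ^ ah, bl ^ bh)      # ksm3
    d = m2 ^ ((m1 ^ m2 ^ m3) << 117) ^ (m1 << 234)
    return d, m1, m2, m3
```

The top slice of the result, d[350…464], receives bits from m1 only, which is what the last bullet expresses.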

Generalized Karatsuba

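A(x) = a_2 x^2 + a_1 x + a_0

B(x) = b_2 x^2 + b_1 x + b_0

D_0 = a_0 b_0, D_1 = a_1 b_1, D_2 = a_2 b_2

D_{0,1} = (a_0 + a_1)(b_0 + b_1), D_{0,2} = (a_0 + a_2)(b_0 + b_2), D_{1,2} = (a_1 + a_2)(b_1 + b_2)

A(x)*B(x) = D_2 x^4 + (D_{1,2} - D_1 - D_2) x^3 + (D_{0,2} - D_0 - D_2 + D_1) x^2 + (D_{0,1} - D_0 - D_1) x + D_0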
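The three-term formula can be checked numerically with a short Python sketch. Here each coefficient a_i, b_i is itself a w-bit polynomial over GF(2) (so x stands for a shift by w bits), addition and subtraction are both XOR, and the function name gk3 is illustrative. The x^2 coefficient is a_0 b_2 + a_1 b_1 + a_2 b_0, which is why D_1 appears in it.

```python
def clmul(a, b):
    """Carry-less product of two GF(2) polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def gk3(a, b, w):
    """Three-limb Generalized Karatsuba product; each limb is w bits wide."""
    mask = (1 << w) - 1
    a0, a1, a2 = a & mask, (a >> w) & mask, a >> (2 * w)
    b0, b1, b2 = b & mask, (b >> w) & mask, b >> (2 * w)
    d0, d1, d2 = clmul(a0, b0), clmul(a1, b1), clmul(a2, b2)
    d01 = clmul(a0 ^ a1, b0 ^ b1)
    d02 = clmul(a0 ^ a2, b0 ^ b2)
    d12 = clmul(a1 ^ a2, b1 ^ b2)
    c1 = d01 ^ d0 ^ d1          # coefficient of x
    c2 = d02 ^ d0 ^ d2 ^ d1     # coefficient of x^2
    c3 = d12 ^ d1 ^ d2          # coefficient of x^3
    return d0 ^ (c1 << w) ^ (c2 << (2 * w)) ^ (c3 << (3 * w)) ^ (d2 << (4 * w))
```

This uses six limb multiplications where the schoolbook method needs nine.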

The Generalized Karatsuba Codes

module ks14(a, b, d) and module ks15(a, b, d) use this idea for operands with 14 and 15 coefficients. Details can be found in the Verilog code.

Squarer

Squaring, module squarer(a, d), is easy in hardware for GF(2^m) fields: it is a linear operation, so it needs only XOR gates.

Modulo Operation

Multiplication and squaring produce results wider than 233 bits.

◦ Hence we need to perform a modulo operation to bring the result back into the field.

Modulo Polynomial: x^233 + x^74 + 1

Here, m = 233 and n = 74 (Note: n < m/2)
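One way to see why n < m/2 matters: since x^233 = x^74 + 1 in the field, the bits above position 232 can be folded back by XORing them in at offsets 0 and 74. The Python sketch below (function names are illustrative, and this is a behavioural model rather than the lab's mod block) shows that two folds are enough for a 465-bit product. After the first fold the degree is at most 231 + 74 = 305, and after the second it is at most 72 + 74 = 146, which is below 233.

```python
M, N = 233, 74
MASK = (1 << M) - 1


def fold(c):
    """Replace the part of c above bit 232 using x^233 = x^74 + 1."""
    hi = c >> M
    return (c & MASK) ^ hi ^ (hi << N)


def mod_reduce(c):
    """Reduce a polynomial of up to 465 bits modulo x^233 + x^74 + 1."""
    return fold(fold(c))
```

In hardware both folds are just fixed XOR networks, so the reduction needs no iteration.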

Squarer Code

module squarer(a, d);

input wire [232:0] a;

output wire [232:0] d;

assign d[0] = a[0] ^ a[196];

assign d[1] = a[117];

assign d[2] = a[1] ^ a[197];

assign d[3] = a[118];

assign d[4] = a[2] ^ a[198];

assign d[5] = a[119];

assign d[6] = a[3] ^ a[199];

assign d[7] = a[120];

assign d[8] = a[4] ^ a[200];

assign d[9] = a[121];

assign d[10] = a[5] ^ a[201];

assign d[11] = a[122];

assign d[12] = a[6] ^ a[202];

assign d[13] = a[123];

assign d[14] = a[7] ^ a[203];

assign d[15] = a[124];

assign d[16] = a[8] ^ a[204];

assign d[17] = a[125];

assign d[18] = a[9] ^ a[205];

assign d[19] = a[126];

…

…

This code performs the squaring as well as the modulo reduction.

Each output bit depends on only a few input bits, so a squarer on its own leads to under-utilized FPGA circuits.
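The assign lines of the squarer follow mechanically from the field polynomial: input bit a[i] contributes x^(2i), reduced modulo x^233 + x^74 + 1. The following Python sketch (a generator written for illustration, not the tool that produced the lab code) recomputes the XOR taps so they can be compared with the listing.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1


def mod_reduce(c):
    """Reduce a GF(2) polynomial modulo POLY."""
    while c.bit_length() > M:
        c ^= POLY << (c.bit_length() - 1 - M)
    return c


def squarer_taps():
    """taps[k] lists the input bit indices XORed into output bit d[k]."""
    taps = [[] for _ in range(M)]
    for i in range(M):
        image = mod_reduce(1 << (2 * i))   # where a[i] lands after squaring
        for k in range(M):
            if (image >> k) & 1:
                taps[k].append(i)
    return taps


taps = squarer_taps()
for k in range(4):
    print(f"assign d[{k}] = " + " ^ ".join(f"a[{i}]" for i in taps[k]) + ";")
```

The first four printed lines agree with the first four assign statements of the squarer listing.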

Quad Itoh Tsujii Inversion

Quad Block

module bquadblk(en, in, sel, out);
input wire en; /* If 1 enable data into the quad block */
input wire [232:0] in; /* Input to quadblk */
input wire [3:0] sel; /* What power is needed */
output wire [232:0] out; /* Output from quadblk */

wire [232:0] lin;

quadblk bp4blk(lin, sel, out);

assign lin = (en == 1'b1) ? in : 233'b0;

endmodule

Quad block

module quadblk(a, sel, d);
input wire [232:0] a;
input wire [3:0] sel;
output reg [232:0] d;

/* Outputs of the chained pow4 stages; declared explicitly because
   implicit nets would be only 1 bit wide */
wire [232:0] d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12, d13, d14;

pow4 p4_1(a, d1);
pow4 p4_2(d1, d2);
pow4 p4_3(d2, d3);
pow4 p4_4(d3, d4);
pow4 p4_5(d4, d5);
pow4 p4_6(d5, d6);
pow4 p4_7(d6, d7);
pow4 p4_8(d7, d8);
pow4 p4_9(d8, d9);
pow4 p4_10(d9, d10);
pow4 p4_11(d10, d11);
pow4 p4_12(d11, d12);
pow4 p4_13(d12, d13);
pow4 p4_14(d13, d14);

/* Combinational output mux: blocking assignments */
always @(sel or d1 or d2 or d3 or d4 or d5 or d6 or d7 or d8 or d9
or d10 or d11 or d12 or d13 or d14)
case (sel)
4'd1: d = d1;
4'd2: d = d2;
4'd3: d = d3;
4'd4: d = d4;
4'd5: d = d5;
4'd6: d = d6;
4'd7: d = d7;
4'd8: d = d8;
4'd9: d = d9;
4'd10: d = d10;
4'd11: d = d11;
4'd12: d = d12;
4'd13: d = d13;
4'd14: d = d14;
default: d = 233'hx;
endcase
endmodule
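Behaviourally, the chain of fourteen pow4 stages followed by the selector computes a^(4^sel) for sel between 1 and 14. A minimal Python model of that behaviour is sketched below. It assumes the field polynomial x^233 + x^74 + 1 used by this design, models pow4 as two squarings, and raises an error where the hardware would output x.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1


def gf_mul(a, b):
    """Multiply in GF(2^233)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r.bit_length() > M:
        r ^= POLY << (r.bit_length() - 1 - M)
    return r


def pow4(a):
    """a^4 in the field, as two squarings."""
    s = gf_mul(a, a)
    return gf_mul(s, s)


def quadblk(a, sel):
    """Output of the sel-th pow4 stage, i.e. a^(4^sel)."""
    if not 1 <= sel <= 14:
        raise ValueError("sel must be 1..14 (hardware drives x otherwise)")
    d = a
    for _ in range(sel):
        d = pow4(d)
    return d


def bquadblk(en, a, sel):
    """quadblk with an enable that gates the input to zero."""
    return quadblk(a if en else 0, sel)
```

Gating the input to zero when en is 0 forces the output to zero as well, since zero raised to any power stays zero.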

Quad circuit

module pow4(a, d);

input wire [232:0] a;
output wire [232:0] d;

assign d[0] = a[0] ^ a[196] ^ a[98];
assign d[1] = a[138] ^ a[175];
assign d[2] = a[117] ^ a[178] ^ a[215];
assign d[3] = a[59] ^ a[218];
assign d[4] = a[1] ^ a[197] ^ a[99];
assign d[5] = a[139] ^ a[176];
assign d[6] = a[118] ^ a[179] ^ a[216];
assign d[7] = a[60] ^ a[219];
assign d[8] = a[2] ^ a[198] ^ a[100];
assign d[9] = a[140] ^ a[177];
assign d[10] = a[119] ^ a[180] ^ a[217];
assign d[11] = a[61] ^ a[220];
assign d[12] = a[3] ^ a[199] ^ a[101];
assign d[13] = a[141] ^ a[178];
assign d[14] = a[120] ^ a[181] ^ a[218];
assign d[15] = a[62] ^ a[221];
…

This code performs the quad operation (raising to the fourth power) as well as the modulo reduction.

The quad circuit leads to better-utilized FPGA circuits than the squarer.
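To compare the two circuits, the XOR fan-in of each output bit can be computed in software. The sketch below is an illustration under the assumption that the field polynomial is x^233 + x^74 + 1. It derives the taps of a^(2^e) for any e, so e = 1 gives the squarer and e = 2 gives the quad circuit, and it reports the average number of inputs per output bit.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1


def mod_reduce(c):
    """Reduce a GF(2) polynomial modulo POLY."""
    while c.bit_length() > M:
        c ^= POLY << (c.bit_length() - 1 - M)
    return c


def power_taps(e):
    """taps[k] lists the inputs XORed into bit k of a^(2^e)."""
    taps = [[] for _ in range(M)]
    for i in range(M):
        image = mod_reduce(1 << ((1 << e) * i))   # a[i] * x^(2^e * i)
        for k in range(M):
            if (image >> k) & 1:
                taps[k].append(i)
    return taps


def average_fan_in(taps):
    return sum(len(t) for t in taps) / len(taps)


sq_taps = power_taps(1)     # squarer
quad_taps = power_taps(2)   # pow4
print("squarer average fan-in:", average_fan_in(sq_taps))
print("pow4 average fan-in:   ", average_fan_in(quad_taps))
```

The taps computed for pow4 agree with the listed assign statements; for example, bit 0 depends on a[0], a[98] and a[196].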

The ALU for the ECC Processor

The verilog code for ALU

module ec_alu(cw, a0, a1, a2, a3, c0, c1);
input wire [232:0] a0, a1, a2, a3; /* the inputs to the alu */
input wire [9:0] cw; /* the control word */
output wire [232:0] c0, c1; /* the alu outputs */

/* Temporary results */
wire [232:0] a0sq, a0qu;
wire [232:0] a1sq, a1qu;
wire [232:0] a2sq, a2qu;
wire [232:0] sa2, sa4, sa5, sa7, sa8, sa8_1;
wire [232:0] sc1;
wire [232:0] sd2, sd2_1;

/* Multiplier inputs and output */
wire [232:0] minA, minB, mout;

multiplier mul(minA, minB, mout);
squarer sq1_p0(a0, a0sq);
squarer sq_p1(a1, a1sq);
squarer sq_p2(a2, a2sq);

squarer sq2_p2(a2sq, a2qu);
squarer sq2_p1(a1sq, a1qu);
squarer sq2_p3(a0sq, a0qu);

/* Choose the inputs to the Multiplier */
mux8 muxA(a0, a0sq, a2, sa7, sd2, a1, a1qu, 233'd0, cw[2:0], minA);
mux8 muxB(a1, a1sq, sa4, sa8, sd2_1, a3, a2qu, a1qu, cw[5:3], minB);

/* Choose the outputs of the ALU */
mux4 muxC(mout, sa2, a1sq, sc1, cw[7:6], c0);
mux4 muxD(sa8_1, sa5, a1qu, sd2, cw[9:8], c1);

assign sa2 = mout ^ a2;
assign sa4 = a1sq ^ a2;
assign sa5 = mout ^ a2sq ^ a0;
assign sa7 = a0 ^ a2;
assign sa8 = a1 ^ a3;
assign sa8_1 = mout ^ a0;

assign sc1 = mout ^ a3;

assign sd2 = a0qu ^ a1;
assign sd2_1 = a2sq ^ a3 ^ a1;

endmodule
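The ALU datapath can be modelled in Python to predict c0 and c1 for a given control word. This is a sketch under two assumptions that the slides do not state: mux8 and mux4 select their data inputs in port order (select value 0 picks the first input), and the field polynomial is x^233 + x^74 + 1. The function name ec_alu mirrors the module; gf_mul is an illustrative helper.

```python
M = 233
POLY = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1


def gf_mul(a, b):
    """Multiply in GF(2^233); models the multiplier module."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r.bit_length() > M:
        r ^= POLY << (r.bit_length() - 1 - M)
    return r


def ec_alu(cw, a0, a1, a2, a3):
    """Return (c0, c1) for the 10-bit control word cw."""
    a0sq, a1sq, a2sq = gf_mul(a0, a0), gf_mul(a1, a1), gf_mul(a2, a2)
    a0qu, a1qu, a2qu = gf_mul(a0sq, a0sq), gf_mul(a1sq, a1sq), gf_mul(a2sq, a2sq)
    sa4 = a1sq ^ a2
    sa7 = a0 ^ a2
    sa8 = a1 ^ a3
    sd2 = a0qu ^ a1
    sd2_1 = a2sq ^ a3 ^ a1
    # Assumed mux ordering: select value n picks the n-th listed input.
    min_a = (a0, a0sq, a2, sa7, sd2, a1, a1qu, 0)[cw & 7]                # cw[2:0]
    min_b = (a1, a1sq, sa4, sa8, sd2_1, a3, a2qu, a1qu)[(cw >> 3) & 7]   # cw[5:3]
    mout = gf_mul(min_a, min_b)
    sa2 = mout ^ a2
    sa5 = mout ^ a2sq ^ a0
    sa8_1 = mout ^ a0
    sc1 = mout ^ a3
    c0 = (mout, sa2, a1sq, sc1)[(cw >> 6) & 3]      # cw[7:6]
    c1 = (sa8_1, sa5, a1qu, sd2)[(cw >> 8) & 3]     # cw[9:8]
    return c0, c1
```

With cw = 0, for example, the model returns c0 = a0.a1 and c1 = a0.a1 + a0.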

Next Lab Session on ECC Processor
