LAB SESSION 1
Design of Elliptic Curve Cryptosystem
Debdeep Mukhopadhyay Chester Rebeiro
Dept. of Computer Science and Engineering
Indian Institute of Technology Kharagpur
INDIA
Parameters of the Design
Characteristic 2 field: GF(2^233)
Random Curve: y^2 + xy = x^3 + a.x^2 + b, where a = 1
/* Basepoint for the curve, taken from FIPS 186-2 */
Base-Point (X,Y):
◦ 233'h0fac9dfcbac8313bb2139f1bb755fef65bc391f8b36f8f8eb7371fd558b
◦ 233'h1006a08a41903350678e58528bebf8a0beff867a7ca36716f7e01f81052
/* The constant b for the curve, from FIPS 186-2 again */
◦ 233'h066647ede6c332c7f8c0923bb58213b333b20e9ce4281fe115f7d8f90ad
Source: csrc.nist.gov/publications/fips/archive/fips186-2/fips186-2.pdf
Design Hierarchy
Elliptic Curve Hierarchy
Code Hierarchy
module ecsmul(clk, nrst, key, sx, sy, done);
regbank regs(clk, cwh, c0r, c1r, a0, a1, a2, a3);
ec_alu alu(cwl, a0, a1, a2, a3, c0a, c1a);
multiplier mul(minA, minB, mout);
module squarer(a, d);
module bquadblk(en, in, sel, out);
Module Multiplier
module multiplier(a, b, d);
input wire [232:0] a;
input wire [232:0] b;
output wire [232:0] d;
wire [464:0] mout;
ks233 ks(a, b, mout); /* Karatsuba Multiplier */
mod mod1(mout, d); /* Modulo Operation */
endmodule
Karatsuba Multiplier
The multiplier operates on 233-bit inputs and gives a 465-bit output.
The multiplier is built from sub-multipliers, with operands as described in the figure.
The upper levels use Simple (two-way) Karatsuba; once the operand size falls below a threshold of 16 bits, the sub-multipliers are realized by Generalized Karatsuba blocks.
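The Simple Karatsuba step can be illustrated on a much smaller operand size. The sketch below is not taken from the lab code: the module names clmul2 and ks4_sketch and the 4-bit width are assumptions chosen only to keep the example short. It splits each operand into two halves and uses three half-size products instead of four; additions in GF(2)[x] are XORs.

```verilog
// Hypothetical helper: schoolbook 2-bit x 2-bit product in GF(2)[x].
module clmul2(a, b, d);
  input  wire [1:0] a, b;
  output wire [2:0] d;
  assign d[0] = a[0] & b[0];
  assign d[1] = (a[0] & b[1]) ^ (a[1] & b[0]);
  assign d[2] = a[1] & b[1];
endmodule

// Hypothetical 4-bit Simple Karatsuba multiplier (illustration only).
module ks4_sketch(a, b, d);
  input  wire [3:0] a, b;
  output wire [6:0] d;

  wire [2:0] lo;   // al * bl
  wire [2:0] hi;   // ah * bh
  wire [2:0] mid;  // (al ^ ah) * (bl ^ bh)

  clmul2 m_lo (a[1:0], b[1:0], lo);
  clmul2 m_hi (a[3:2], b[3:2], hi);
  clmul2 m_mid(a[1:0] ^ a[3:2], b[1:0] ^ b[3:2], mid);

  // Cross term al*bh ^ ah*bl, recovered without a fourth multiplier
  wire [2:0] cross = mid ^ lo ^ hi;

  // d = hi * x^4  ^  cross * x^2  ^  lo
  assign d = {hi, 4'b0} ^ {2'b0, cross, 2'b0} ^ {4'b0, lo};
endmodule
```

The same split is applied recursively in the larger design; only the operand widths and the alignment of the three partial products change.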
Module ks233
module ks233(a, b, d);
input wire [232:0] a;
input wire [232:0] b;
output wire [464:0] d;
wire [230:0] m1;
wire [232:0] m2;
wire [232:0] m3;
wire [116:0] ahl;
wire [116:0] bhl;
ks117 ksm1(a[116:0], b[116:0], m2);
ks116 ksm2(a[232:117], b[232:117], m1);
assign ahl[115:0] = a[232:117] ^ a[115:0];
assign ahl[116] = a[116];
assign bhl[115:0] = b[232:117] ^ b[115:0];
assign bhl[116] = b[116];
ks117 ksm3(ahl, bhl, m3);
Combining the Partial Results
Since n = 233 (m2 is the product of the low halves, m1 the product of the high halves, m3 the product of the XORed halves):
◦ d[0…116] = m2[0…116]
◦ d[117…232] = m2[117…232] ^ m2[0…115] ^ m1[0…115] ^ m3[0…115]
◦ d[233] = m2[116] ^ m1[116] ^ m3[116]
◦ d[234…347] = m2[117…230] ^ m1[117…230] ^ m3[117…230] ^ m1[0…113]
◦ d[348] = m2[231] ^ m3[231] ^ m1[114]
◦ d[349] = m2[232] ^ m3[232] ^ m1[115]
◦ d[350…464] = m1[116…230]
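The bit ranges above are the expansion of d = m1.x^234 ^ (m3 ^ m2 ^ m1).x^117 ^ m2. A minimal sketch of that combination as a stand-alone module follows; the module name ks233_combine and the internal wire mid are assumptions, since the original ks233 may well write the ranges out one by one. m1 is 231 bits wide, as declared in ks233, so it is zero-extended before the XOR.

```verilog
// Hypothetical combiner for the three partial products of ks233.
module ks233_combine(m1, m2, m3, d);
  input  wire [230:0] m1;  // a[232:117] * b[232:117]
  input  wire [232:0] m2;  // a[116:0]   * b[116:0]
  input  wire [232:0] m3;  // ahl * bhl
  output wire [464:0] d;

  // Middle term; m1 is zero-extended from 231 to 233 bits
  wire [232:0] mid = m3 ^ m2 ^ {2'b00, m1};

  // Align the three terms at x^0, x^117 and x^234 and add (XOR) them
  assign d = {232'b0, m2}
           ^ ({232'b0, mid} << 117)
           ^ ({234'b0, m1}  << 234);
endmodule
```

Writing the combination with shifts keeps the alignment visible; a synthesis tool reduces the shifts to plain wiring.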
Generalized Karatsuba
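A(x) = a_2 x^2 + a_1 x + a_0
B(x) = b_2 x^2 + b_1 x + b_0
D_0 = a_0 b_0, D_1 = a_1 b_1, D_2 = a_2 b_2
D_{0,1} = (a_0 + a_1)(b_0 + b_1), D_{0,2} = (a_0 + a_2)(b_0 + b_2), D_{1,2} = (a_1 + a_2)(b_1 + b_2)
A(x) B(x) = D_2 x^4 + (D_{1,2} - D_1 - D_2) x^3 + (D_{0,2} - D_0 - D_2 + D_1) x^2 + (D_{0,1} - D_0 - D_1) x + D_0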
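As a small check of the three-term formula, the sketch below implements it for 1-bit coefficients, where addition and subtraction are both XOR and multiplication is AND. The module names gk3 and gk3_tb are assumptions made for this illustration and do not appear in the lab code. The testbench compares every input pair against a shift-and-XOR reference.

```verilog
// Hypothetical 3-term Generalized Karatsuba over GF(2), 1-bit coefficients.
module gk3(a, b, d);
  input  wire [2:0] a, b;
  output wire [4:0] d;

  wire d0  = a[0] & b[0];
  wire d1  = a[1] & b[1];
  wire d2  = a[2] & b[2];
  wire d01 = (a[0] ^ a[1]) & (b[0] ^ b[1]);
  wire d02 = (a[0] ^ a[2]) & (b[0] ^ b[2]);
  wire d12 = (a[1] ^ a[2]) & (b[1] ^ b[2]);

  assign d[0] = d0;
  assign d[1] = d01 ^ d0 ^ d1;       // a0b1 + a1b0
  assign d[2] = d02 ^ d0 ^ d2 ^ d1;  // a0b2 + a1b1 + a2b0
  assign d[3] = d12 ^ d1 ^ d2;       // a1b2 + a2b1
  assign d[4] = d2;
endmodule

// Exhaustive self-check against a shift-and-XOR reference product.
module gk3_tb;
  reg  [2:0] a, b;
  reg  [4:0] expected;
  wire [4:0] d;
  integer i, j, k;

  gk3 dut(a, b, d);

  initial begin
    for (i = 0; i < 8; i = i + 1)
      for (j = 0; j < 8; j = j + 1) begin
        a = i;
        b = j;
        expected = 5'b0;
        for (k = 0; k < 3; k = k + 1)
          if (b[k]) expected = expected ^ ({2'b00, a} << k);
        #1;
        if (d !== expected)
          $display("MISMATCH a=%b b=%b d=%b expected=%b", a, b, d, expected);
      end
    $display("gk3 check finished");
    $finish;
  end
endmodule
```

Six AND gates replace the nine of the schoolbook method. In a wider block each coefficient would itself be a multi-bit polynomial and each AND a smaller multiplier.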
The Generalized Karatsuba Codes
module ks14(a, b, d) and module ks15(a, b, d) use this idea for 14-bit and 15-bit operands.
Details can be found in the Verilog code.
Squarer
Squaring, implemented by module squarer(a, d), is cheap in hardware for GF(2^m) fields: it needs only XOR gates and wiring.
Modulo Operation
Multiplication and squaring produce results wider than the 233-bit field elements.
◦ Hence we need to perform a modulo operation to bring the result back into the field.
Modulo Polynomial: x^233 + x^74 + 1
Here, m = 233 and n = 74 (Note: n < m/2)
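Because x^233 = x^74 + 1 in the field, every bit above position 232 can be folded back onto two lower positions, and n < m/2 means two folding passes are enough. The following is a minimal sketch of such a reduction, not the lab's actual mod module; the name mod233_sketch and the intermediate wires h, f1 and h2 are assumptions.

```verilog
// Hypothetical reduction of a 465-bit product modulo x^233 + x^74 + 1.
module mod233_sketch(t, d);
  input  wire [464:0] t;
  output wire [232:0] d;

  // Pass 1: fold the upper 232 bits using x^233 = x^74 + 1
  wire [231:0] h  = t[464:233];
  wire [305:0] f1 = {73'b0, t[232:0]} ^ {74'b0, h} ^ {h, 74'b0};

  // Pass 2: pass 1 leaves at most 73 bits above position 232;
  // folding them again cannot overflow, since 72 + 74 < 233
  wire [72:0] h2 = f1[305:233];
  assign d = f1[232:0] ^ {160'b0, h2} ^ {86'b0, h2, 74'b0};
endmodule
```

As a spot check, an input with only bit 392 set reduces to bits 159, 74 and 0, because x^392 = x^159 + x^74 + 1 in the field. Bit 392 is where a[196] lands after squaring, which agrees with the term a[196] in d[0] of the Squarer Code.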
Squarer Code
module squarer(a, d);
input wire [232:0] a;
output wire [232:0] d;
assign d[0] = a[0] ^ a[196];
assign d[1] = a[117];
assign d[2] = a[1] ^ a[197];
assign d[3] = a[118];
assign d[4] = a[2] ^ a[198];
assign d[5] = a[119];
assign d[6] = a[3] ^ a[199];
assign d[7] = a[120];
assign d[8] = a[4] ^ a[200];
assign d[9] = a[121];
assign d[10] = a[5] ^ a[201];
assign d[11] = a[122];
assign d[12] = a[6] ^ a[202];
assign d[13] = a[123];
assign d[14] = a[7] ^ a[203];
assign d[15] = a[124];
assign d[16] = a[8] ^ a[204];
assign d[17] = a[125];
assign d[18] = a[9] ^ a[205];
assign d[19] = a[126];
…
…
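This code performs the squaring as well as the modulo reduction.
In the lines shown, each output bit depends on only one or two input bits, so a squarer leaves most inputs of the FPGA look-up tables unused: squaring leads to under-utilized FPGA circuits.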
Quad Itoh Tsujii Inversion
Quad Block
module bquadblk(en, in, sel, out);
input wire en; /* If 1 enable data into the quad block */
input wire [232:0] in; /* Input to quadblk */
input wire [3:0] sel; /* What power is needed */
output wire [232:0] out; /* Output from quadblk */
wire [232:0] lin;
quadblk bp4blk(lin, sel, out);
assign lin = (en == 1'b1) ? in : 233'b0;
endmodule
Quad block
module quadblk(a, sel, d);
input wire [232:0] a;
input wire [3:0] sel;
output reg [232:0] d;
/* Outputs of the cascaded pow4 stages: dk = a^(4^k) */
wire [232:0] d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12, d13, d14;
pow4 p4_1(a, d1);
pow4 p4_2(d1, d2);
pow4 p4_3(d2, d3);
pow4 p4_4(d3, d4);
pow4 p4_5(d4, d5);
pow4 p4_6(d5, d6);
pow4 p4_7(d6, d7);
pow4 p4_8(d7, d8);
pow4 p4_9(d8, d9);
pow4 p4_10(d9, d10);
pow4 p4_11(d10, d11);
pow4 p4_12(d11, d12);
pow4 p4_13(d12, d13);
pow4 p4_14(d13, d14);
/* Combinational select, so blocking assignments are used */
always @(sel or d1 or d2 or d3 or d4 or d5 or d6 or d7 or d8 or d9
or d10 or d11 or d12 or d13 or d14)
case (sel)
4'd1: d = d1;
4'd2: d = d2;
4'd3: d = d3;
4'd4: d = d4;
4'd5: d = d5;
4'd6: d = d6;
4'd7: d = d7;
4'd8: d = d8;
4'd9: d = d9;
4'd10: d = d10;
4'd11: d = d11;
4'd12: d = d12;
4'd13: d = d13;
4'd14: d = d14;
default: d = 233'hx;
endcase
endmodule
Quad circuit
module pow4(a, d);
input wire [232:0] a;
output wire [232:0] d;
assign d[0] = a[0] ^ a[196] ^ a[98];
assign d[1] = a[138] ^ a[175];
assign d[2] = a[117] ^ a[178] ^ a[215];
assign d[3] = a[59] ^ a[218];
assign d[4] = a[1] ^ a[197] ^ a[99];
assign d[5] = a[139] ^ a[176];
assign d[6] = a[118] ^ a[179] ^ a[216];
assign d[7] = a[60] ^ a[219];
assign d[8] = a[2] ^ a[198] ^ a[100];
assign d[9] = a[140] ^ a[177];
assign d[10] = a[119] ^ a[180] ^ a[217];
assign d[11] = a[61] ^ a[220];
assign d[12] = a[3] ^ a[199] ^ a[101];
assign d[13] = a[141] ^ a[178];
assign d[14] = a[120] ^ a[181] ^ a[218];
assign d[15] = a[62] ^ a[221];
…
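This code performs the quad (fourth-power) operation as well as the modulo reduction.
In the lines shown, each output bit depends on two or three input bits, so the quad operation leads to better-utilized FPGA circuits than squaring.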
The ALU for the ECC Processor
The Verilog code for the ALU
module ec_alu(cw, a0, a1, a2, a3, c0, c1);
input wire [232:0] a0, a1, a2, a3; /* the inputs to the alu */
input wire [9:0] cw; /* the control word */
output wire [232:0] c0, c1; /* the alu outputs */
/* Temporary results */
wire [232:0] a0sq, a0qu;
wire [232:0] a1sq, a1qu;
wire [232:0] a2sq, a2qu;
wire [232:0] sa2, sa4, sa5, sa7, sa8, sa8_1;
wire [232:0] sc1;
wire [232:0] sd2, sd2_1;
/* Multiplier inputs and output */
wire [232:0] minA, minB, mout;
multiplier mul(minA, minB, mout);
squarer sq1_p0(a0, a0sq);
squarer sq_p1(a1, a1sq);
squarer sq_p2(a2, a2sq);
squarer sq2_p2(a2sq, a2qu);
squarer sq2_p1(a1sq, a1qu);
squarer sq2_p3(a0sq, a0qu);
/* Choose the inputs to the Multiplier */
mux8 muxA(a0, a0sq, a2, sa7, sd2, a1, a1qu, 233'd0, cw[2:0], minA);
mux8 muxB(a1, a1sq, sa4, sa8, sd2_1, a3, a2qu, a1qu, cw[5:3], minB);
/* Choose the outputs of the ALU */
mux4 muxC(mout, sa2, a1sq, sc1, cw[7:6], c0);
mux4 muxD(sa8_1, sa5, a1qu, sd2, cw[9:8], c1);
assign sa2 = mout ^ a2;
assign sa4 = a1sq ^ a2;
assign sa5 = mout ^ a2sq ^ a0;
assign sa7 = a0 ^ a2;
assign sa8 = a1 ^ a3;
assign sa8_1 = mout ^ a0;
assign sc1 = mout ^ a3;
assign sd2 = a0qu ^ a1;
assign sd2_1 = a2sq ^ a3 ^ a1;
endmodule
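The ALU instantiates mux4 and mux8, whose definitions are not shown in these slides. The sketch below is one plausible shape for them, inferred only from the port order at the instantiation sites (data inputs, then select, then output). The port names in0…in7 and the select encoding, where a select value of k picks the k-th data input, are assumptions that should be checked against the actual lab code.

```verilog
// Hypothetical 4-to-1 multiplexer for 233-bit field elements.
// Assumption: sel = k routes the k-th data input to out.
module mux4(in0, in1, in2, in3, sel, out);
  input  wire [232:0] in0, in1, in2, in3;
  input  wire [1:0]   sel;
  output reg  [232:0] out;

  always @(*)
    case (sel)
      2'd0:    out = in0;
      2'd1:    out = in1;
      2'd2:    out = in2;
      default: out = in3;
    endcase
endmodule

// Hypothetical 8-to-1 multiplexer with the same conventions.
module mux8(in0, in1, in2, in3, in4, in5, in6, in7, sel, out);
  input  wire [232:0] in0, in1, in2, in3, in4, in5, in6, in7;
  input  wire [2:0]   sel;
  output reg  [232:0] out;

  always @(*)
    case (sel)
      3'd0:    out = in0;
      3'd1:    out = in1;
      3'd2:    out = in2;
      3'd3:    out = in3;
      3'd4:    out = in4;
      3'd5:    out = in5;
      3'd6:    out = in6;
      default: out = in7;
    endcase
endmodule
```

Under this assumed encoding, cw[2:0] and cw[5:3] choose the two multiplier operands, cw[7:6] chooses c0 and cw[9:8] chooses c1. For example, an all-zero control word would multiply a0 by a1 and place mout on c0.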
Next Lab Session on ECC Processor