good practices for designing cryptographic primitives in hardware · 2016-01-28 · latency vs...

Post on 06-Jul-2020

1 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Good Practices for Designing

Cryptographic Primitives in Hardware

Miroslav Knežević

NXP Semiconductors

miroslav.knezevic@nxp.com

January 25, 2016

School on Design for a Secure IoT, Tenerife, 2016

THINGS

Smart Plugs

January 28, 20163.

Smart Door Locks

January 28, 20164.

Smart Thermostats

January 28, 20165.

Smart Smoke Detectors

January 28, 20166.

Smart Light Bulbs

January 28, 20167.

Connected Cars

January 28, 20168.

WHY HARDWARE?

…to run software, obviously!

January 28, 201610.

But software alone is…

January 28, 201611.

Slow Insecure Energy inefficient

Super Powers of Embedded HW

January 28, 201613.

Performance Power

(Energy)

AreaSecurity*

* In the original slide deck Superman was held responsible for Security. During the coffee break, students suggested that Batman would be a better representative and I agree he is. So sorry, Mr. Kent!

Area – Designing the Smallest Block Cipher

January 28, 201614.

A A

VDD

January 28, 201615.

MOSFET Channel Length

S

D

G

NAND gate

January 28, 201616.

• Smallest logical gate with two inputs.

• GE (gate equivalence) = physical area

of a single NAND gate.

• (Ab)used for comparing HW designs

across different CMOS technologies.

XOR gate

January 28, 201617.

2-3 GE

Modern Lightweight Ciphers

January 28, 201618.

< 1000 GE

AES (128-bit key, ENC only)

January 28, 201619.

2500 GE

January 28, 201620.

Advances in CMOS Technology

140 nm 40 nm 10 nm

January 28, 201621.

ARM Cortex M0 example (~20 kGE)

16.25 mm

CMOS

40 nm

CMOS

140 nm

Block Cipher – HW Perspective

January 28, 201622.

Memory

Key size ≥ 80 bits

Block size ≥ 32 bits

Round function

+

Key schedule

+

Control logic

Minimize!

KATAN – The Smallest Block Cipher

January 28, 201623.

January 28, 201624.

KATAN32 = 315 GE + 508 bits of ROM

Only 508 bits of expanded key!

KATAN – The Smallest Block Cipher

KATAN in numbers

January 28, 201625.

It’s (one of) the smallest known cipher(s): < 500 GE

But it’s not very fast: 254 clock cycles

Still scalable: 3 times faster for negligible area overhead

KATAN vs Competition

January 28, 201626.

SPECK

IBM130

Synopsys

≥ 580 GE

KLEIN

TSMC180

Synopsys

≥ 1.3 kGE

Piccolo

130nm

Synopsys

≥ 700 GE

LED

180nm

Synopsys

≥ 700 GE

KATAN

NXP140

Cadence

≥ 460 GEPRESENT

UMC180

IHP250

AMIS350

Synopsys

~1kGE

SIMON

IBM130

Synopsys

≥ 520 GE

<5

6.2

5

6.6

7 7.6

7

NXP 90NM UMC 130NM UMC 180NM NANGATE 45NM

AREA OF SCAN-FF [GE]

Memory Elements in different CMOS Technologies

January 28, 201627.

52

1 73

7 91

8

1192

13

40

73

8

10

60 1

32

9

17

28 19

50

75

9

1103 1

36

7

17

68 20

12

86

8

12

56

15

71

20

71 2

32

3

VERSION 1 VERSION 2 VERSION 3 VERSION 4 VERSION 5

AREA [GE]

SPONGENT in different CMOS Technologies

January 28, 201628.

up to 70% difference!

How can we do a fair comparison?

January 28, 201629.

Difficult in practice.

But why not using an open-core library at least?

http://www.nangate.com/

Performance – Designing the Fastest Block Cipher

January 28, 201630.

Latency vs Throughput

January 28, 201631.

Latency = 15 s

Throughput = 0.067 beer/s

12

3

6

9Serial

processing

Latency vs Throughput

January 28, 201632.

Latency = 15 s

Throughput = 0.2 beer/s

12

3

6

9Parallel

processing!

Latency vs Throughput

January 28, 201633.

Latency = 15 s

Throughput = 0.2 beer/s

12

3

6

9Pipelining!

Latency vs Throughput

January 28, 201634.

Latency = 5 s

Throughput = 0.2 beer/s

12

3

6

9

bottom-up!

Unrolling!

128 128 8 MDS LIGHT

128 128 4 BINARY NO

64 64 4 MDS LIGHT

64 64, 96, 128 4 BINARY LIGHT

64 80, 128 4BIT

PERMUTATIONLIGHT

64 64, 80, 96 4 MDS LIGHT

64 64, 128 4 MDS NO

Latency of Existing Ciphers – Is Lightweight = “Light + Wait”?

January 28, 201635.

BLOCK-SIZE KEY-SIZE S-BOX P-LAYER K-SCHEDULE

MCRYPTON

AES

NOEKEON

MINI-AES

PRESENT

KLEIN

LED

Unrolled HW Architectures

January 28, 201636.

17.8

15.3 2

0.3 2

5.3

31.2

46.6

9.8

9.8

9.8

9.9

14.8

15.5

14.8

14.7

20.2

16.4 2

1.4 2

6.4

32.8

48.2

10.8

10.8

11 12

17 17.4

16.4

16.6

LATENCY [NS]

1-cycle 2-cycle

Results – Latency (CMOS 90 nm)

January 28, 201637.

Number of Rounds vs Key Size

January 28, 201638.

366.6

48.2 63.7 79.9

128.7

193.1

41.3

40.4

41.4

40

102.5

49.5 72.3

73.8

191.8

24.9

32.6

41.3 63.5 9

6

20.9

21.1

21

22

49.6

27.1

37.6

37.1

AREA [KGE]

1-cycle 2-cycle

Results – Area (CMOS 90 nm)

January 28, 201639.

• Use small S-boxes (e.g. 5-bit, 4-bit, 3-bit)

• Almost everything follows the normal distribution. So does the S-box!

Low Latency Encryption

S-box

January 28, 201640.

slide credit:

Gregor Leander

choose me!

Low Latency Encryption

Number of Rounds vs Round Complexity

January 28, 201641.

• Not too low complexity.

• Reduce the number of rounds at the cost of (slightly) heavier round.

• Number of rounds should be independent of the key schedule.

• Use constant addition instead of a key schedule (if possible).

Low Latency Encryption

Key Schedule

January 28, 201642.

• Use involution where possible: 𝑓 𝑓 𝑥 = 𝑥.

• Make Encryption and Decryption procedures similar.

• BUT: think application oriented – sometimes it is beneficial to have

asymmetric constructions.

Low Latency Encryption

Encryption vs Decryption

January 28, 201643.

Low Latency Encryption

Meet PRINCE

January 28, 201644.

𝛼-reflection property:

Low Latency Encryption

Meet PRINCE

January 28, 201645.

17.8

15.3

20.3

25.3

31.2

46.6

9.8

9.8

9.8

9.9

14.8

15.5

14.8

14.7

8

LATENCY [NS]

Low Latency Encryption

Meet PRINCE

January 28, 201646.

366.6

48.2 63.7 79.9

128.7

193.1

41.3

40.4

41.4

40

102.5

49.5 7

2.3

73.8

17

AREA [KGE]

Power/Energy – Future of Lightweight Crypto

January 28, 201647.

History of Lightweight Crypto (Area)

January 28, 201648.

0

500

1000

1500

2000

2500

3000

3500

1970 1980 1990 2000 2010 2020

AR

EA

(G

E)

YEAR

Lightweight Block Ciphers

DESAES

PRESENT

KATAN

History of Lightweight Crypto (Latency)

January 28, 201649.

1

10

100

1000

10000

1970 1980 1990 2000 2010 2020

LA

TE

NC

Y (

# C

LO

CK

CY

CLE

S)

YEAR

Lightweight Block Ciphers

History of Lightweight Crypto (Energy*)

January 28, 201650.

1

10

100

1000

10000

100000

1000000

10000000

100000000

1970 1980 1990 2000 2010 2020

AR

EA

* L

AT

EN

CY

YEAR

Lightweight Block Ciphers

Future of Lightweight Crypto (Energy*)

January 28, 201651.

1

10

100

1000

10000

100000

1000000

10000000

100000000

1970 1980 1990 2000 2010 2020

AR

EA

* L

AT

EN

CY

YEAR

Lightweight Block Ciphers

PRINCE

Futu

re o

f LW

C

Energy

January 28, 201652.

=

Power

January 28, 201653.

=

Passive RFID: Low Power Applications

January 28, 201654.

CONTROL

MEMORY

CRYPTORF

NETWORK

INTERFACE RF

RFID tag

Reader

Anything Battery Powered: Low Energy Applications

January 28, 201655.

Every mW matters!

January 28, 201656.

Total number of mobile devices in 2015 = 9.5 billion*

Average (regular) power consumption of a smartphone = 160 mW**

Total energy spent = €2.8 billion*** a year!

*** average electricity price in 2014 in EU was €0.208 per kWh.

** An Analysis of Power Consumption in a Smartphone, A Carroll, G Heiser, USENIX 2010.

* Mobile Statistics Report 2015-2019, The Radicati Group Inc.

Cisco estimates…

January 28, 201657.

… 50 billion* connected devices by 2020

* http://www.cisco.com/web/solutions/trends/iot/portfolio.html

Intel says…

January 28, 201658.

… 200 billion* smart devices by 2020

* http://www.intel.com/content/www/us/en/internet-of-things/infographics/guide-to-iot.html

Moving Bits vs Moving People & Things

January 28, 201659.

* http://www.tech-pundit.com/wp-content/uploads/2013/07/Cloud_Begins_With_Coal.pdf

World’s ICT Energy Consumption

January 28, 201660.

* http://www.tech-pundit.com/wp-content/uploads/2013/07/Cloud_Begins_With_Coal.pdf

The ICT ecosystem uses about 1500 TWh of electricity annually and

approaches 10% of world electricity generation!

What can Crypto do about it?

January 28, 201661.

Become Lightweight Crypto!

Power Consumption in (Crypto) HW

January 28, 201662.

𝑃𝑡𝑜𝑡 = 𝑃𝑠𝑤𝑖𝑡𝑐ℎ𝑖𝑛𝑔 + 𝑃𝑙𝑒𝑎𝑘𝑎𝑔𝑒

𝑃𝑠𝑤𝑖𝑡𝑐ℎ𝑖𝑛𝑔 ≈ 𝐶𝑒𝑓𝑓 ∙ 𝑉𝐷𝐷2 ∙ 𝑓𝑐𝑙𝑘∙ 𝑠𝑤

𝑃𝑠𝑤𝑖𝑡𝑐ℎ𝑖𝑛𝑔 ≫ 𝑃𝑙𝑒𝑎𝑘𝑎𝑔𝑒

Energy Consumption in (Crypto) HW

January 28, 201663.

𝐸 = 𝑃 ∙ 𝑡 = 𝑃 ∙𝑁

𝑓𝑐𝑙𝑘

𝐸 ≈ 𝐶𝑒𝑓𝑓 ∙ 𝑉𝐷𝐷2 ∙ 𝑁 ∙ 𝑠𝑤

• Reduce circuit area (e.g. serializing): 𝐶𝑒𝑓𝑓 ↓, but 𝑁 ↑

• Reduce switching activity (e.g. clock gating): 𝑠𝑤 ↓

• Move to smaller CMOS technologies: 𝐶𝑒𝑓𝑓 ↓, 𝑉𝐷𝐷 ↓, but 𝑃𝑙𝑒𝑎𝑘𝑎𝑔𝑒 ↑

• Reduce the operating clock frequency: 𝑓𝑐𝑙𝑘 ↓

• Reduce the latency: 𝑁 ↓

Reducing Power and Energy Consumption

January 28, 201664.

𝑃 ≈ 𝐶𝑒𝑓𝑓 ∙ 𝑉𝐷𝐷2 ∙ 𝑠𝑤 ∙ 𝑓𝑐𝑙𝑘

𝐸 ≈ 𝐶𝑒𝑓𝑓 ∙ 𝑉𝐷𝐷2 ∙ 𝑠𝑤 ∙ 𝑁

Security – Know Your Enemy

January 28, 201666.

* In the original slide deck Superman was held responsible for Security. During the coffee break, students suggested that Batman would be a better representative and I agree he is. So sorry, Mr. Kent!

January 28, 201667.

Power (Current) Measurement Setup

January 28, 201668.

RAM CPU

COPROS

PE

RIP

HE

RA

LS

VDD

R

RAM CPU

COPROS

PE

RIP

HE

RA

LS

VDD

𝑖EM-probes

𝑖

SIMPLE POWER

ANALYSIS

SPA – Public Key Crypto

January 28, 201670.

• Modular exponentiation: modkx m N

for down to

if then

endfor

return

2 ni2xx

mxx

)1( ik

mx

0

x

Power consumption depends

on value of the secret bit ki!

SPA – Symmetric Key Crypto (1)

January 28, 201671.

RAM CPU

COPROS

PE

RIP

HE

RA

LS

DIFFERENTIAL

POWER ANALYSIS

I-V characteristic of CMOS inverter

January 28, 201676.

Vi

VDD

ID

CMOS timing delays (𝒕𝒓, 𝒕𝒇, 𝒕𝑷𝑯𝑳, 𝒕𝑷𝑳𝑯, 𝒕𝑻𝑯𝑳, 𝒕𝑻𝑳𝑯)

January 28, 201677.

Vi Vo

VDD

Masking – XOR gate example

January 28, 201678.

x1

y1

x2

y2

z1

z2

x

yz

𝑥⨁𝑦 = 𝑥1⨁𝑥2 ⨁ 𝑦1⨁𝑦2 = 𝑥1⨁𝑦1 ⨁ 𝑥2⨁𝑦2 = 𝑧1⨁𝑧2 = 𝑧

Masking – AND gate example

January 28, 201679.

x

yz

x1

y1

x1

y2

x2

y2

x2

y1

z1

z2

z1

𝑥 ∙ 𝑦 = 𝑥1⨁𝑥2 ∙ 𝑦1⨁𝑦2

= 𝑧1 ⊕ 𝑧1 ⊕ (𝑥1 ∙ 𝑦1) ⊕ (𝑥1 ∙ 𝑦2) ⊕ (𝑥2 ∙ 𝑦1) ⊕ 𝑥2 ∙ 𝑦2

= 𝑧1⨁𝑧2 = 𝑧

Masking – AND gate example (delays cause leakage)

January 28, 201680.

x1

y1

x1

y2

x2

y2

x2

y1

z1

z2

z1

y1 y2 y switching

0 0 0 -

0 1 1 1 AND, 1 XOR

1 0 1 1 AND, 2 XOR

1 1 0 2 AND, 2 XOR

y = 0 => 2 AND, 2 XOR gates switching on average

y = 1 => 2 AND, 3 XOR gates switching on average

Masking with Sufficient Noise

slide credit:

Marcel Medwed

CORRELATION

POWER ANALYSIS

January 28, 201683.

Correlation Power Analysis

S-BOX

(8-bit)

plaintext (𝑝)

ciphertext (𝑐)key (𝑘)

𝑥𝑗𝑖 = 𝐻𝑊 𝑘𝑖 ⊕ 𝑝𝑗 0 < 𝑖 < 255, 0 < 𝑗 < 𝑛

𝑦𝑗 = 𝑓 𝐻𝑊 𝐾 ⊕ 𝑝𝑗 Measured current!

𝜌𝑥𝑦𝑖 =

𝑗=0𝑛−1 𝑥𝑗

𝑖 − 𝑥 𝑦𝑗 − 𝑦

𝑗=0𝑛−1 𝑥𝑗

𝑖 − 𝑥2

𝑗=0𝑛−1 𝑦𝑗 − 𝑦

2

# of traces

Pearson correlation

January 28, 201684.

Correlation Power Analysis

S-BOX

(8-bit)

plaintext (𝑝)

ciphertext (𝑐)key (𝑘)

𝑥𝑗𝑖 = 𝐻𝑊 𝑆𝐵𝑂𝑋 𝑘𝑖 ⊕ 𝑝𝑗

0 < 𝑖 < 255, 0 < 𝑗 < 𝑛

𝑦𝑗 = 𝑓 𝐻𝑊 𝑆𝐵𝑂𝑋 𝐾 ⊕ 𝑝𝑗 Measured current!

𝜌𝑥𝑦𝑖 =

𝑗=0𝑛−1 𝑥𝑗

𝑖 − 𝑥 𝑦𝑗 − 𝑦

𝑗=0𝑛−1 𝑥𝑗

𝑖 − 𝑥2

𝑗=0𝑛−1 𝑦𝑗 − 𝑦

2

# of traces

Pearson correlation

COST OF

COUNTERMEASURES

Security comes at Price – Area Overhead

January 28, 201686.

Insecure

X GE

SCA-secure

5X GE

SCA&FA-secure

10X GE

Security comes at Price – Performance Penalty

January 28, 201687.

Insecure

N s

SCA-secure

~3-5N s

SCA&FA-secure

~8-10N s

Challenges – SCA Countermeasures

January 28, 201688.

INCOMPLETE MODELS

Circuit

models

Adversary

models1st vs higher

order DPA

Challenges – FA Countermeasures

January 28, 201689.

LACK OF CREATIVITY

Redundant

executions

Dummy

operationsLight

sensors

THANK YOU!

January 28, 201690.

Thanks to the teams of

KATAN, SPONGENT, PRINCE, FIDES

Workshop on Crypto Design for IoT

January 28, 201691.

https://www.cosic.esat.kuleuven.be/ecrypt_net_iot_workshop_2016/

top related