cryptanalysis through cache address leakage eran tromer adi shamir dag arne osvik
Post on 15-Jan-2016
215 views
TRANSCRIPT
Cryptanalysis through Cache Address Leakage
Cryptanalysis through Cache Address Leakage
Eran TromerAdi Shamir
Dag Arne Osvik
Eran TromerAdi Shamir
Dag Arne Osvik
2
Cache attacks
• Pure software
• No special privileges
• No interaction with the cryptographic code (some variants)
• Very efficient (e.g., full AES key extraction from Linux encrypted partition in 65 milliseconds)
• Compromise otherwise well-secured systems(e.g., see NIST AES process)
• “Commoditize” side-channel attacks: easily deployed software breaks many common systems
3
CPU core60% (until recently)
Main memory7-9%
Why cache analysis?
cacheAnnual speedincrease:
Typicallatency:
50-150ns0.3ns → timing gap
4
Address leakage
• The cache is a shared resource:cache state affects, and is affected by, all processes, leading to crosstalk between processes.
• The cached data is subject to memory protection…
• But the metadata leaks information about memory access patterns: which addresses are being accessed.
5
Related works on cache attacks
• [Hu ‘92] – Convert channels
• [Page ’02] – Theoretical attacks via power trace
• [Tsunoo Tsujihara Minematsu Miyuachi ’02],[Tsunoo Saito Suzaki Shigeri Miyauchi ’03] – Timing attacks via internal collisions
• [Bernstein ’04] – Model-less timing attacks
• [Percival ’05] – RSA
Different setting, further assumptions, less practical, and/or attack a different ciphers.
This the first work to fully demonstrate an efficient cache attack on a modern block cipher in real-life settings.
6
1.Eavesdropping
2.Cryptanalysis
3.Experimental results
4.Implications
7
Associative memory cacheD
RA
Mca
che
memory block(64 bytes)
cache line
(64 bytes)
cache set
(4 cache lines)
8
S-box tables in memoryD
RA
Mca
che
S-box
table
9
Detecting access to AES tables (basic idea)
DR
AM
cach
e
Attacker
memory
S-box
table
10
Measurement technique
Inter-process crosstalk can be exploited in two ways:
• Effect of the cache on the encryption (timing)
• Effect of the encryption on the cache
11
DR
AM
cach
e
T0Attacker
memory
1. Make sure the tables are cached
2. Evict one cache set
3. Time an encryption and see if it’s slow
Measuring effect of cache on encryption
12
Measurement technique
Inter-process crosstalk can be exploited in two ways:
• Effect of the cache on the encryption (timing)
• Effect of the encryption on the cache
13
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory 1. Completely
evict tables from cache
S-box
table
14
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory 1. Completely
evict tables from cache
2. Trigger a single encryption
S-box
table
15
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory 1. Completely
evict tables from cache
2. Trigger a single encryption
3. Access attacker memory again and see which cache sets are slow
S-box
table
16
Advantages of second method
• Yields more information (64) from a single encryption
• Not a timing attack!Insensitive to timing variance in enryption code path (crucial for effective attacks on complicated systems).
• No real need to trigger the encryption – can wait until it happens by itself…
17
1.Eavesdropping
2.Cryptanalysis
3.Experimental results
4.Implications
18
char p[16], k[16]; // plaintext and keyint32 T0[256],T1[256],T2[256],T3[256]; // lookup tablesint32 Col[4]; // intermediate state
...
/* Round 1 */
Col[0] T0[p[ 0]©k[ 0]] T1[p[ 5]©k[ 5]] T2[p[10]©k[10]] T3[p[15]©k[15]];
Col[1] T0[p[ 4]©k[ 4]] T1[p[ 9]©k[ 9]] T2[p[14]©k[14]] T3[p[ 3]©k[ 3]];
Col[2] T0[p[ 8]©k[ 8]] T1[p[13]©k[13]] T2[p[ 2]©k[ 2]] T3[p[ 7]©k[ 7]];
Col[3] T0[p[12]©k[12]] T1[p[ 1]©k[ 1]] T2[p[ 6]©k[ 6]] T3[p[11]©k[11]];
A typical software implementation of AES
lookup index = plaintext key
19
Variant 1: Synchronous attack
• A software service performs AES encryption using a secret key.
• An attacker process runs on the same CPU.
• The attacker process can somehow invoke the service on known plaintext.
• Examples:
— Encrypted disk partition + filesystem
— IP/Sec, VPN
→ the attacker can discover the secret key.
or decryption, or MAC
or ciphertexts
or trigger and
time memory
accesses
remotely
20
Synchronous attack on AES: Overview
• Measure (possibly noisy) cache usage of many encryptions of known plaintexts.
• Guess the first key byte. For each hypothesis:
— For each sampled plaintext, predict which cache line is accessed by “T0[p[ 0]©k[ 0]]”
• Identify the hypothesis which yields maximal correlation between predictions and measurements.
• Proceed for the rest of the key bytes.
• Practically, a few hundred samples suffice.
Got 64 bits of the key (high nibble of each byte)!
• Use these partial results to mount attack further AES rounds, exploiting S-box nonlinearity.
A few thousand samples for complete key recovery.
21
Synchronous 1R attack (idealized)
Suppose we could answer queries of the form:
“Does encryption of p under the unknown key k access the memory block of index y in table Tl?”
Denote this predicate Q(p,l,y){0,1}.
The approach:
Obtain many samples of the predicate
”Guess” part of the key
Deduce constraints on value of predicateCheck if fulfilled in all samples
22
Synchronous 1R attack (cont.)
• Consider one of the memory accesses in the first round: “T0[p[0]©k[0]]”
• Given a candidate value k’[0] and samples of Q(p,l,y):
—The useful samples are those that fulfill p[0]©k’[0]y
—If k’[0]k[0] then for all useful samples: p[0]©k[0] p[0]©k’[0] y so T0[p[0]©k[0]] accesses y hence Q(p,l,y)=1
—Otherwise: p[0]©k[0] p[0]©k’[0] y hence Q(p,l,y)=0
(but there are 35 more “random” accesses to T0…)with probability (1-1/16)350.104
• Hence, a few hundred random samples suffice to eliminate all bad candidates find high nibble of all key bytes
23
Full key extraction
• We got half the key (64 bits)
• Too much for brute force
• We extracted all possible information from 1st-round accesses
• Move to 2nd round, taking advantage of the non-linearity of the S-box
24
Synchronous 2R attack (idealized)
• After expansion of the AES key schedule and 1st round, we see that the 2nd round does the following lookup:
“T2[ S[p[0]©k[0]] 2S[p[5]©k[5]] 3S[p[10]©k[10]] S[p[15]©k[15]] S[k[15]] k[2] ]”
• The only relevant unknowns are the low nibbles of k[0], k[5], k[10], k[15] (216 candidates).
• Can test a candidate as before:—Predict this lookup according to guess
—Identify useful samples, i.e., those where y is in the same memory block as the prediction
—Check whether Q(p,l,y)=1 for all useful samples.
• There are 3 more accesses of this special form, with disjoint sets of relevant low nibbles.
• Hence: full key recovery using ~2000 random samples.
25
Scenario 2: asynchronous attack
• Someone runs encryptions computations using a secret key.
• An attacker process runs on the same CPU at (roughly) the same time.
• The plaintext/ciphertext has a non-uniform (conditional) distribution:
— English
— Formatted data
— Headers
— Ciphertext gleaned from wire
• Examples: just about any use of crypto on a multi-user system
→ attacker can (partially?) discover the secret key
26
Asynchronous attack (basic idea)
Compare two distributions:
• Measured memory accesses statistics.
• Predicted memory accesses statistics,under the given plaintext distribution and the key hypothesis.
Find key that yields best correlation.
27
“Hyper Attack”
• Obtaining parallelism:
— HyperThreading (simultaneous multithreading)
— Multi-core, shared caches, cache coherence
— Interrupts
• Attack model:
— Encryption process is not communicating with anyone (no I/O, no IPC).
— No special measurement equipment
— No knowledge of either plaintext of ciphertext
28
1.Eavesdropping
2.Cryptanalysis
3.Experimental results
4.Implications
29
Experimental results
• Synchronous attack on OpenSLL AES encryption library call:
Full key recovery by analyzing 300 encryptions (13ms)
• Synchronous attack on an AES encrypted filesystem (Linux dm-crypt):
Full key recovery by analyzing 800 write operations (65ms)
• Asynchronous attack on AES(independent process doing batch encryption of text):
Recovery of 45.7 key bits in one minute.
30
Experimental example: synchronous attack I
Measuring a “black box” OpenSSL encryption on Athlon 64, using 10,000 samples.Horizontal axis: evicted cache setVertical axis: p[0] (left), p[5] (right)Brightness: encryption time (normalized)
k[0]=0x00 k[5]=0x50
31
Experimental example: synchronous attack II
Measuring a Linux 2.6.11 dm-crypt encrypted filesystem with ECB AES on Athlon 64, using 30,000 samples.Horizontal axis: evicted cache setVertical axis: p[0]Brightness: cache probing time (normalized)
Left: raw. Right: after subtracting cache set average.
32
Finding the tables
Measuring a Linux 2.6.11 dm-crypt encrypted filesystem with ECB AES on Athlon 64, using 30,000 samples.Horizontal axis: candidate table offsetCandidate key byte value: k’[0] (left), k’[5] (right)Brightness: score
k[0]=0x00 k[5]=0x50
33
Experimental example: asynchronous attack
34
1.Eavesdropping
2.Cryptanalysis
3.Experimental results
4.Implications
35
Implications and Extensions
• Multiuser systems
• Virtual machines— Examples: jail(), Xen, UML, VMware, Virtual PC
— Breaks NSA US Patent 6,922,774
— No guarantees in AMD ”Secure Virtual Machine” orIntel “Virtualization Technology” (VMX) specs either
• DRMThe trusted path is leaky (even if verified by TPM attestation, NGSCB, etc.).
• Untrusted code (e.g., ActiveX, Java applets, managed .NET, JavaScript)
• Remote network attacks
36
Countermeasures
Note: With sufficient temporal resolution, the attacker gets a fair approximation of the full memory access transcript.
Secure
Efficient Generic?
[GO95]no cache
[ZZLP04]
[MR04]
bitslice
OS mode
ignore
[Intel05]
37
Conclusions
New side channel attacks that are
• cheap
• easy to exploit
• hard to detect
• expensive to avoid
• applicable to many deployed systems