Chapter 4 Hash Functions 1
Overview
Cryptographic hash functions are functions that:
o Map an arbitrary-length (but finite) input to a fixed-size output
o Are one-way (hard to invert)
o Are collision-resistant (difficult to find two values that produce the same output)
Examples:
o Message digest functions - protect the integrity of data by creating a fingerprint of a digital document
o Message Authentication Codes (MAC) - protect both the integrity and authenticity of data by creating a fingerprint based on both the digital document and a secret key
Checksums vs. Message Digests
Checksums:
o Used to produce a compact representation of a message
o If the message changes, the checksum will probably not match
o Good: accidental changes to a message can be detected
o Bad: easy to purposely alter a message without changing the checksum
Message digests:
o Used to produce a compact representation (called the fingerprint or digest) of a message
o If the message changes, the digest will probably not match
o Good: accidental changes to a message can be detected
o Good: difficult to alter a message without changing the digest
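The contrast can be sketched in a few lines of Python. The checksum here (sum of bytes modulo 256) and the use of SHA-256 as the digest function are assumptions chosen for illustration; the slides do not name either one.

```python
import hashlib

def checksum(data: bytes) -> int:
    """Toy checksum (assumed for illustration): sum of all bytes modulo 256."""
    return sum(data) % 256

original = b"PAY 100 TO BOB"
tampered = b"PAY 001 TO BOB"  # same bytes in a different order

# Reordering bytes leaves the sum unchanged, so the checksum still matches.
same_checksum = checksum(original) == checksum(tampered)

# A message digest of the two messages differs.
same_digest = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print(same_checksum, same_digest)
```

A deliberate change that fools the checksum is trivial to construct here, while the digest changes.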
Hash Functions
Message digest functions are hash functions
o A hash function, H(M)=h, takes an arbitrary-length input, M, and produces a fixed-length output, h
Example hash function:
o H = sum all the letters of an input word modulo 26 (counting A = 1, B = 2, . . ., Z = 26)
o Input = a word
o Output = a number between 0 and 25, inclusive
o Example:
    H(“Elvis”) = ((‘E’ + ‘L’ + ‘V’ + ‘I’ + ‘S’) mod 26)
    H(“Elvis”) = ((5+12+22+9+19) mod 26)
    H(“Elvis”) = (67 mod 26)
    H(“Elvis”) = 15
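A minimal Python sketch of this example hash function follows. The name toy_hash and the decision to ignore case and skip non-letters are assumptions made for illustration.

```python
def toy_hash(word: str) -> int:
    """Sum the letter positions (A=1 ... Z=26) of a word, modulo 26."""
    total = 0
    for ch in word.upper():
        if "A" <= ch <= "Z":          # assumption: non-letters are skipped
            total += ord(ch) - ord("A") + 1
    return total % 26

print(toy_hash("Elvis"))  # 5 + 12 + 22 + 9 + 19 = 67, and 67 mod 26 = 15
```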
Collisions
For the hash function:
o H = sum all the letters of an input word modulo 26
There are more inputs (words) than possible outputs (numbers 0-25)
Some different inputs must produce the same output
A collision occurs when two different inputs produce the same output:
o The values x and y are not the same, but H(x) and H(y) are the same
Collisions - Example
H(“Jumpsuit”) = 25
o (‘J’ + ‘U’ + ‘M’ + ‘P’ + ‘S’ + ‘U’ + ‘I’ + ‘T’) mod 26
o (10+21+13+16+19+21+9+20) mod 26
o 129 mod 26
o 25
H(“TCB”) = 25
o (‘T’ + ‘C’ + ‘B’) mod 26
o (20+3+2) mod 26
o 25 mod 26
o 25
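Collisions for the letter-sum hash are easy to find by grouping words by their hash value. The sketch below assumes the same letter numbering as the example (A = 1, . . ., Z = 26); the helper names toy_hash and find_collisions are hypothetical.

```python
from collections import defaultdict

def toy_hash(word: str) -> int:
    """Sum the letter positions (A=1 ... Z=26) of a word, modulo 26."""
    return sum(ord(c) - ord("A") + 1 for c in word.upper() if "A" <= c <= "Z") % 26

def find_collisions(words):
    """Group words by hash value and keep only the groups with more than one word."""
    buckets = defaultdict(list)
    for word in words:
        buckets[toy_hash(word)].append(word)
    return {h: group for h, group in buckets.items() if len(group) > 1}

collisions = find_collisions(["Jumpsuit", "TCB", "Elvis"])
print(collisions)  # "Jumpsuit" and "TCB" share the hash value 25
```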
Collision-Resistant Hash Functions
Hash functions for which it is difficult to find collisions are called collision-resistant
A collision-resistant hash function, H(M)=h:
o For any message, M1
o It is difficult to find another message, M2, such that:
    M1 and M2 are not the same
    H(M1) and H(M2) are the same
One-Way Hash Functions
A function, H(M)=h, is one-way if:
o Forward direction: given M it is easy to compute h
o Backward direction: given h it is difficult to compute M
A one-way hash function:
o Easy to compute the hash for a given message
o Hard to determine what message produced a given hash value
Message Digest Functions
Message digest functions are collision-resistant, one-way hash functions:
o Given a message it is easy to compute its digest
o Hard to find any message that produces a given digest (one-way)
o Hard to find any two messages that have the same digest (collision-resistant)
Using Message Digest Functions
Message digest functions can be used to protect data integrity:
o A company makes some software available for download over the World Wide Web
o Users want to be sure that they receive a copy that has not been tampered with
o Solution:
    The company creates a message digest for its software
    The digest is transmitted (securely) to users
    Users compute their own digest for the software they receive
    If the digests match, the software probably has not been altered
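The download check can be sketched with Python's standard hashlib module. The choice of SHA-256, the sample byte strings, and the helper names digest_of and is_unaltered are assumptions for illustration; the published digest is assumed to reach the user over a secure channel, as the slide requires.

```python
import hashlib
import hmac

def digest_of(data: bytes) -> str:
    """Message digest of the data as a hex string (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()

def is_unaltered(received: bytes, published_digest: str) -> bool:
    """Recompute the digest of what was received and compare with the published one."""
    return hmac.compare_digest(digest_of(received), published_digest)

software = b"example installer bytes"   # stands in for the real download
published = digest_of(software)         # computed by the company

print(is_unaltered(software, published))            # copy matches
print(is_unaltered(software + b"!", published))     # tampered copy does not
```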
Attacks on Message Digests
Brute-force search for a message with a given digest:
o Goal:
    Find a message that produces a given digest, d
o Assume:
    The message digest function is “strong”
    The message digest function creates n-bit digests
o Approach:
    Generate random messages and compute digests for them until one is found with digest d
    Approximately 2^n random messages must be tried to find one that hashes to d
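The 2^n cost can be observed on a deliberately weakened digest. The sketch below truncates SHA-256 to n = 16 bits (an assumption made so the search finishes quickly) and tries candidate messages from a counter rather than at random so that runs are reproducible; tiny_digest and find_preimage are hypothetical helper names.

```python
import hashlib
from itertools import count

N_BITS = 16  # assumed toy digest size; real digests are far longer

def tiny_digest(message: bytes, n_bits: int = N_BITS) -> int:
    """First n_bits of SHA-256, used as a toy n-bit message digest."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)

def find_preimage(target: int, n_bits: int = N_BITS):
    """Try messages "1", "2", "3", ... until one has the target digest."""
    for tries in count(1):
        candidate = str(tries).encode()
        if tiny_digest(candidate, n_bits) == target:
            return candidate, tries

target = tiny_digest(b"secret message")
message, tries = find_preimage(target)
print(message, tries)  # tries is typically on the order of 2**16
```

Each extra digest bit doubles the expected work, which is why the approach is hopeless against a 160-bit digest.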
Attacks on Message Digests (cont)
Birthday attack (based on the birthday paradox):
o Goal:
    Find any two messages that produce the same digest
o Assume:
    The message digest function is “strong”
    The message digest function creates n-bit digests
o Approach:
    Generate random messages and compute digests for them until two are found that produce the same digest
    Approximately 2^(n/2) random messages must be tried to find two that hash to the same digest
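A birthday search can be sketched the same way on a truncated digest. Here SHA-256 is cut to n = 32 bits (an assumption for speed), so a collision is expected after roughly 2^16 messages; candidate messages come from a counter instead of a random source so the run is reproducible. The helper names are hypothetical.

```python
import hashlib
from itertools import count

N_BITS = 32  # assumed toy digest size

def tiny_digest(message: bytes, n_bits: int = N_BITS) -> int:
    """First n_bits of SHA-256, used as a toy n-bit message digest."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)

def find_collision(n_bits: int = N_BITS):
    """Remember every digest seen so far; stop when a digest repeats."""
    seen = {}
    for tries in count(1):
        candidate = str(tries).encode()
        d = tiny_digest(candidate, n_bits)
        if d in seen:
            return seen[d], candidate, tries
        seen[d] = candidate

m1, m2, tries = find_collision()
print(m1, m2, tries)  # tries is typically near 2**16, far below 2**32
```

The price of the speed-up is memory: every digest computed so far has to be stored.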
The Secure Hash Algorithm
The Secure Hash Algorithm:
o A Federal Information Processing Standard (FIPS 180-1) adopted by the U.S. government in 1995
o Based on a message digest function called MD4 created by Ron Rivest
o Developed by NIST and the NSA
o Input: a message of b bits
o Output: a 160-bit message digest
SHA – Padding
Input: a message of b bits
o Padding makes the message length a multiple of 512 bits
o The input is always padded (even if its length is already a multiple of 512)
Padding is accomplished by appending to the input:
o A single bit, 1
o Enough additional bits, all 0, to make the final 512-bit block exactly 448 bits long
o A 64-bit integer representing the length of the original message in bits
SHA – Padding Example
Consider the following message:
o M = 01100010 11001010 1001 (20 bits)
To pad we append:
o 1 (1 bit)
o 427 0s (427 bits)
o 64-bit binary representation of the number 20 (64 bits)
Result:
o Pad(M) = 01100010 11001010 10011000 00000000 . . . 00000000 00010100 (512 bits)
o 464 0s have been omitted above (denoted by the ellipsis)
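The padding rule can be checked with a short Python sketch that works on strings of '0' and '1' characters so that it mirrors the 20-bit example directly. The function name sha_pad and the bit-string representation are assumptions for illustration; the message length is assumed to be below 2^64 bits.

```python
def sha_pad(bits: str) -> str:
    """Pad a bit string: append 1, then 0s up to 448 mod 512, then the 64-bit length."""
    length = len(bits)
    padded = bits + "1"
    padded += "0" * ((448 - len(padded)) % 512)   # fill the final block to 448 bits
    padded += format(length, "064b")              # original length in bits
    return padded

m = "01100010110010101001"      # the 20-bit message from the example
p = sha_pad(m)
print(len(p))                   # 512
print(p.count("0", 21, 448))    # 427 zero bits between the 1 bit and the length field
```

A message whose length is already a multiple of 512 bits still gains a full extra block, matching the rule that the input is always padded.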
SHA – Constant Initialization
After padding, constants are initialized to the following hexadecimal values:
o Five 32-bit words:
    H0 = 67452301
    H1 = EFCDAB89
    H2 = 98BADCFE
    H3 = 10325476
    H4 = C3D2E1F0
o Eighty 32-bit words:
    K0 – K19 = 5A827999
    K20 – K39 = 6ED9EBA1
    K40 – K59 = 8F1BBCDC
    K60 – K79 = CA62C1D6
SHA – Step 1
The padded message contains a whole number of 512-bit blocks, denoted B1, B2, B3, . . ., Bn
Each 512-bit block, Bi, of the padded message is processed in turn:
o Bi is divided into 16 32-bit words, W0, W1, . . ., W15
    W0 is composed of the leftmost 32 bits in Bi
    W1 is composed of the second 32 bits in Bi
    …
    W15 is composed of the rightmost 32 bits in Bi
SHA – Step 2
W0, W1, . . ., W15 are used to compute 64 new 32-bit words (W16, W17, . . ., W79)
Wj (16 ≤ j ≤ 79) is computed by:
o XORing words Wj-3, Wj-8, Wj-14, and Wj-16 together
o Circularly left shifting the result one bit
for j = 16 to 79
do
    Wj = Circular_Left_Shift_1(Wj-3 XOR Wj-8 XOR Wj-14 XOR Wj-16)
done
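Steps 1 and 2 together can be sketched in Python as a function that turns one 64-byte block into the 80 words W0 through W79. The names rotl32 and expand_block are hypothetical, and the words are assumed to be read in big-endian order (leftmost bits first), as Step 1 describes.

```python
def rotl32(x: int, n: int) -> int:
    """Circular left shift of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def expand_block(block: bytes) -> list:
    """Split a 512-bit block into W0..W15, then derive W16..W79."""
    assert len(block) == 64, "a block is 512 bits = 64 bytes"
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    for j in range(16, 80):
        w.append(rotl32(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
    return w

w = expand_block(b"\x80" + bytes(63))   # block whose only set bit is the leftmost one
print(len(w), hex(w[0]), hex(w[16]))
```

In this sample block only W0 is nonzero, so W16 is W0 rotated left by one bit, which wraps the top bit around to the bottom.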
SHA – Step 3
The values of H0, H1, H2, H3, and H4 are copied into five words called A, B, C, D, and E:
o A = H0
o B = H1
o C = H2
o D = H3
o E = H4
SHA – Step 4
Four functions are defined as follows:
o For (0 ≤ j ≤ 19): fj(B,C,D) = (B AND C) OR ((NOT B) AND D)
o For (20 ≤ j ≤ 39): fj(B,C,D) = (B XOR C XOR D)
o For (40 ≤ j ≤ 59): fj(B,C,D) = ((B AND C) OR (B AND D) OR (C AND D))
o For (60 ≤ j ≤ 79): fj(B,C,D) = (B XOR C XOR D)
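The four functions translate directly into bitwise operations on 32-bit words. In this Python sketch the name f and the explicit 32-bit mask are assumptions; the mask is needed because Python integers are unbounded, so NOT B would otherwise be negative.

```python
MASK32 = 0xFFFFFFFF

def f(j: int, b: int, c: int, d: int) -> int:
    """Round function fj(B, C, D) for 0 <= j <= 79 on 32-bit words."""
    if 0 <= j <= 19:
        return ((b & c) | (~b & d)) & MASK32         # B selects bits of C or D
    if 20 <= j <= 39 or 60 <= j <= 79:
        return (b ^ c ^ d) & MASK32                  # parity of the three words
    if 40 <= j <= 59:
        return ((b & c) | (b & d) | (c & d)) & MASK32  # bitwise majority
    raise ValueError("j must be between 0 and 79")

print(hex(f(0, 0xFFFFFFFF, 0x12345678, 0x9ABCDEF0)))  # all-ones B selects C
```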
SHA – Step 4 (cont)
For each of the 80 words, W0, W1, . . ., W79, a 32-bit word called TEMP is computed
The values of the words A, B, C, D, and E are updated as shown below (all additions are modulo 2^32):
for j = 0 to 79
do
    TEMP = Circular_Left_Shift_5(A) + fj(B,C,D) + E + Wj + Kj
    E = D; D = C; C = Circular_Left_Shift_30(B); B = A; A = TEMP
done
SHA – Step 5
The values of H0, H1, H2, H3, and H4 are updated (additions are modulo 2^32):
o H0 = H0 + A
o H1 = H1 + B
o H2 = H2 + C
o H3 = H3 + D
o H4 = H4 + E
SHA – Overview
Pad the message
Initialize constants
For each 512-bit block (B1, B2, B3, . . ., Bn):
o Divide Bi into 16 32-bit words (W0 – W15)
o Compute 64 new 32-bit words (W16, W17, . . ., W79)
o Copy H0 - H4 into A, B, C, D, and E
o For each Wj (W0 – W79) compute TEMP and update A-E
o Update H0 - H4
The 160-bit message digest is the concatenation: H0 H1 H2 H3 H4
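Putting the steps together gives a compact Python sketch of the whole algorithm (the version standardized in FIPS 180-1 is commonly called SHA-1). It is a teaching sketch under two assumptions: the message is a whole number of bytes, and the function name sha1 and its hex-string output are choices made here. It is useful mainly for checking the steps against a library implementation, not as a replacement for one.

```python
import hashlib
import struct

def rotl32(x: int, n: int) -> int:
    """Circular left shift of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1(message: bytes) -> str:
    # Initialize constants H0..H4.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Pad: a 1 bit (0x80 byte), 0s up to 448 mod 512 bits, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for start in range(0, len(padded), 64):
        # Steps 1 and 2: W0..W15 from the block, then W16..W79.
        w = list(struct.unpack(">16I", padded[start:start + 64]))
        for j in range(16, 80):
            w.append(rotl32(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
        # Step 3: copy H0..H4 into A..E.
        a, b, c, d, e = h
        # Step 4: 80 rounds.
        for j in range(80):
            if j <= 19:
                fj, k = (b & c) | (~b & d), 0x5A827999
            elif j <= 39:
                fj, k = b ^ c ^ d, 0x6ED9EBA1
            elif j <= 59:
                fj, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                fj, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl32(a, 5) + fj + e + w[j] + k) & 0xFFFFFFFF
            e, d, c, b, a = d, c, rotl32(b, 30), a, temp
        # Step 5: update H0..H4 modulo 2^32.
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]
    # The digest is the concatenation H0 H1 H2 H3 H4.
    return "".join(format(x, "08x") for x in h)

print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
```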
Motivation for Message Authentication Codes
Want to use a message digest function to protect files on our computer from viruses:
o Calculate digests for important files and store them in a table
o Recompute and check from time to time to verify that the files have not been modified
o Good: if a virus modifies a file the change will be detected since the digest of that file will be different
o Bad: the virus could just compute new digests for modified files and install them in the table
Message Authentication Codes
A message authentication code (MAC) is a key-dependent message digest function
o MAC_K(M) = h
The output, h, is a function of both the hash function and a key, K
The MAC can only be created or verified by someone who knows K
Can turn a one-way hash function into a MAC by encrypting the hash value with a symmetric-key cryptosystem
Using MAC
MAC can be used to protect data integrity and authenticity:
o Want to use a MAC to protect files on our computer from viruses:
    Calculate MAC values for important files and store them in a table
    Recompute and check from time to time to verify that the files haven’t been modified
o Good: if a virus modifies a file the hash of that file will be different
o Good: virus doesn’t know the proper key so it can’t install new MACs in the table to cover its tracks
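The file-table idea can be sketched with Python's standard hmac module. Note that HMAC is a different keyed construction from the encrypt-the-hash approach mentioned earlier; it is used here only because it is available in the standard library. The in-memory dictionary of files, the key values, and the helper names are all hypothetical.

```python
import hashlib
import hmac

def mac(key: bytes, data: bytes) -> bytes:
    """Keyed fingerprint of data (HMAC with SHA-256, assumed for illustration)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def build_table(key: bytes, files: dict) -> dict:
    """Compute a MAC for every file and store it under the file's name."""
    return {name: mac(key, content) for name, content in files.items()}

def modified_files(key: bytes, files: dict, table: dict) -> list:
    """Return the names of files whose current MAC does not match the table."""
    return [name for name, content in files.items()
            if not hmac.compare_digest(mac(key, content), table.get(name, b""))]

key = b"known only to the owner"
files = {"login": b"original program", "notes.txt": b"hello"}
table = build_table(key, files)

# A virus changes a file and rewrites the table entry, but without the real key.
files["login"] = b"infected program"
table["login"] = mac(b"guessed key", files["login"])

print(modified_files(key, files, table))  # the forged entry is still detected
```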
Implementing a MAC
Can use a block cipher algorithm:
o Pad the message (if necessary) so that its length is a multiple of the cipher’s block size
o Divide the message into n blocks equal in length to the cipher’s block size:
    m1, m2, . . ., mn
o Choose a key, k
o Encrypt m1 with k
o XOR the result with m2
o Encrypt the result with k
o XOR the result with m3
…
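The chaining pattern above can be sketched in Python. Because the standard library has no block cipher, the sketch uses a toy 4-round Feistel cipher built from SHA-256 with an 8-byte block; this cipher is an assumption made purely for illustration and is NOT secure. Zero-byte padding and taking the final encrypted block as the MAC (as in the usual CBC-MAC construction) are likewise assumptions, since the slides do not spell out either detail.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (assumed)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher for illustration only; NOT secure."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        round_out = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor(left, round_out)
    return left + right

def cbc_mac(key: bytes, message: bytes) -> bytes:
    # Pad with zero bytes to a multiple of the block size (assumed padding rule).
    if not message or len(message) % BLOCK:
        message += b"\x00" * (BLOCK - len(message) % BLOCK)
    state = bytes(BLOCK)  # all zeros, so the first step is just Encrypt(m1)
    for i in range(0, len(message), BLOCK):
        # XOR the previous result with the next block, then encrypt with k.
        state = toy_encrypt(key, xor(state, message[i:i + BLOCK]))
    return state  # assumed: the last encrypted block serves as the MAC

tag = cbc_mac(b"shared key", b"transfer 100 to account 42")
print(tag.hex())
```

With zero-byte padding, two messages that differ only in trailing zero bytes produce the same MAC, so a real design would use an unambiguous padding rule.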
Implementing a MAC (cont)
Summary
Message digests
o Message digest functions are collision-resistant, one-way hash functions
    Collision-resistant: hard to find two values that produce the same output
    One-way: hard to determine what input produced a given output
o Protects the integrity of a digital document
MAC
o A message authentication code is a key-dependent message digest function
    The output is a function of both the hash function and a secret key
    The MAC can only be created or verified by someone who knows the key
o Protects the integrity and authenticity of a digital document