Page 1: people.uwplatt.edupeople.uwplatt.edu/~yangq/CSSE411/csse411-materials/s12... · Web viewCryptanalysis Aaron D. Willett Department of Computer Science University of Wisconsin-Platteville

Cryptanalysis

Aaron D. Willett
Department of Computer Science
University of Wisconsin-Platteville
E-mail: [email protected]

Abstract

Man has always had the need to hide and keep certain information secret. With that need came those who wish to know and obtain such secrets. Cryptography (data encryption) was created to hide information, but with encryption came its counterpart, cryptanalysis (data decryption without the key). This paper examines the history of cryptanalysis, from its early beginnings through World War II and the Enigma, and then surveys the present-day types of cryptanalytic attack and the methods used in conjunction with them. The paper primarily discusses cryptanalysis, the study of cipher decryption; however, without its counterpart, cryptography, cryptanalysis would not exist. Together, cryptanalysis and cryptography comprise the field called cryptology. It should be noted that some people use the word cryptography to mean all of cryptology, rather than only the study of encryption.

Terminology

Listed below are terms and definitions which will be used frequently throughout this discussion.

Cryptology - [krip-tol-uh-jee] the science and study of cryptanalysis and cryptography.

Cryptanalysis - [krip-tuh-nal-uh-sis] the procedures, processes, methods, etc., used to translate or interpret secret writings, as codes and ciphers, for which the key is unknown.

Cryptography - [krip-tog-ruh-fee] the procedures, processes, methods, etc., of making and using secret writing, as codes or ciphers.

Cryptosystem - [krip-toh-sys-tuhm] a system for encoding and decoding secret messages.

Cipher - [sahy-fer] a method of secret writing using substitution or transposition of letters according to a key.

Code - [kohd] a system used for brevity or secrecy of communication, in which arbitrarily chosen words, letters, or symbols are assigned definite meanings.

Plaintext - [pleyn-tekst] the intelligible original message of a cryptogram, as opposed to the coded or enciphered version.

Encryption - the process of transforming plaintext using a cipher to make it unreadable to anyone except those possessing special knowledge, usually a key.

Decryption - the reverse process of encryption, i.e. making the encrypted information readable again.

Crib - [krib] a known sample of plaintext.

Monoalphabetic Substitution - An encryption method in which every occurrence of a given character is replaced by the same substitute character.

Polyalphabetic Substitution - An encryption method in which different occurrences of the same character can be replaced by different substitutes.

Alice, Bob, and Mallory - Placeholder names used commonly in the field of cryptology. Alice refers to "Party A", Bob refers to "Party B", and Mallory refers to a "Malicious Attacker."

History

Early Cryptology

Cryptography and cryptanalysis can be traced back as far as the ancient Greeks, in particular the Spartans. One of the first recorded devices used to encrypt plaintext is called a “scytale” (rhymes with Italy). Essentially, a scytale is a pair of rods of equal diameter, one held by the sender and one by the receiver; the diameter acts as a crude private key. To encrypt a message, the sender wound a thin strip of parchment around the length of the scytale and wrote the message across the wrapped parchment. When the parchment was unwound, the message looked like a jumbled mess. The strip was then sent, without the rod, to the receiving party, who decoded the message by wrapping it around their own scytale. Both scytales had to be the same diameter; otherwise the message would still look like garbage. This method is a transposition cipher: the letters themselves are unchanged and only their order is scrambled. While the scytale had its early uses, it was an incredibly weak system of encryption and could be quickly decrypted by an intercepting party.

Centuries later, Julius Caesar introduced a new method of encryption. This method is still used today and is called a Caesar cipher; it is the simplest kind of substitution cipher. To obtain the ciphertext, Caesar took his plaintext and shifted each letter three positions to the right in the standard alphabet, so that 'A' became 'D'. Many centuries after that, the concept of a substitution cipher was made more sophisticated. The Vigenère square is a polyalphabetic cipher: it uses a keyword to select a different shifted alphabet for each letter of the plaintext, so the same plaintext letter can be encrypted to different ciphertext letters.
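The scytale described above can be modeled as a simple grid transposition. The following is a minimal sketch, not a historical reconstruction: the parameter `faces` (how many letters fit around the rod, standing in for its diameter), the padding character, and the sample message are all assumptions made for illustration.

```python
def scytale_encrypt(plaintext: str, faces: int) -> str:
    """Write the message in rows of `faces` letters, then read it column by column.

    `faces` stands in for the rod diameter (an assumption of this model).
    """
    # Pad so the text fills a complete rectangle.
    padded = plaintext + "_" * (-len(plaintext) % faces)
    # Taking every `faces`-th letter models unwinding the strip from the rod.
    return "".join(padded[i::faces] for i in range(faces))


def scytale_decrypt(ciphertext: str, faces: int) -> str:
    """Re-wrap the strip around a rod with the given number of faces."""
    rows = len(ciphertext) // faces
    return "".join(ciphertext[i::rows] for i in range(rows))


message = "HELPMEIAMUNDERATTACK"
strip = scytale_encrypt(message, faces=4)
same_rod = scytale_decrypt(strip, faces=4)   # receiver with a matching rod
wrong_rod = scytale_decrypt(strip, faces=5)  # interceptor with the wrong diameter
```

Only a rod with the matching value of `faces` restores the message, which is also why the cipher is weak: an interceptor needs to try only a handful of diameters.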

Cryptology Circa World War II

Cryptography has always been an important tool for commanders in wartime. The ability to send messages to allies without the enemy being able to read them has been the Holy Grail for many countries. For centuries, cryptanalysts have worked to break whatever coded messages they were given. Cryptanalysis was severely challenged by the Enigma machine, which Germany used throughout World War II. The Enigma used multiple rotors to produce a polyalphabetic substitution ciphertext, which proved extremely difficult for the Allied powers to decrypt. Despite the overwhelming task of decrypting Enigma traffic, the Polish and the British developed a machine called a Bombe which was up to the challenge.

Some of the greatest achievements in cryptanalysis came from the Polish and the British breaking the Enigma and from US intelligence breaking the Japanese Red, Orange, and Purple ciphers.

Types

There are four primary types of cryptanalytic attack. Two are defined by the attacker's access to plaintext and the other two by access to ciphertext. The two ciphertext methods are the ciphertext-only and chosen-ciphertext attacks; likewise, the two plaintext methods are the known-plaintext and chosen-plaintext attacks. Each is discussed in further detail below.

Ciphertext-Only Attack

A ciphertext-only attack is an attack model of cryptanalysis in which the attacker has access only to a set of ciphertexts. Since only the ciphertext is available, the attacker must guess segments of plaintext that are likely to appear in the message [5]. A common method used in ciphertext-only attacks is frequency analysis. Frequency analysis counts the occurrences of each letter in the ciphertext, computes each letter's relative frequency, and compares the results to the known letter frequencies of the plaintext's language.

Figure 1: English Language Letter Frequency

Figure 1 shows the frequency of the letters in the English alphabet. It is evident that the letter 'e' is the most commonly used letter, having a frequency of 12 percent. Using this information, we can analyze the ciphertext, find the most frequent letter, and make an educated guess that the given letter could represent an 'e' in the plaintext.

Example 1

Plaintext: HEREISANEXAMPLEOFAFREQUENCYANALYSIS
Ciphertext: XTJTWICNTDCQLSTMOCOJTKGTNYBCNCSBIWI

Figure 2: Frequency Analysis of Ciphertext

Figure 2 displays the frequency of the letters that are reflected in the ciphertext. In this illustration, the letter 'T' occurs in 17.14 percent of the full ciphertext. By analyzing this graph, it is possible to make an educated guess that the letter 'T' translates to the letter 'E' in the plaintext. These frequencies don't always match up to the common frequencies of the English language, so some trial-and-error is required to solve for the exact plaintext.
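The count behind Figure 2 can be reproduced in a few lines. This is a small sketch using the ciphertext from Example 1; the function name `letter_frequencies` is made up for illustration.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, percent) pairs, most frequent letter first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, 100 * n / total) for letter, n in counts.most_common()]


ciphertext = "XTJTWICNTDCQLSTMOCOJTKGTNYBCNCSBIWI"
frequencies = letter_frequencies(ciphertext)
top_letter, top_percent = frequencies[0]
print(top_letter, round(top_percent, 2))  # T 17.14
```

The letter 'T' appears 6 times in 35 characters, which is where the 17.14 percent figure comes from. With a text this short, the ranking below the top letter is not reliable, which is why trial and error is still needed.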

Appendices 1, 2, and 3 illustrate my attempt at a ciphertext-only attack on a ciphertext. I did not know what kind of encryption was used, only that it was classical (substitution, Caesar, transposition, scytale, etc.) and not modern (AES, DES, RC2, RC4, etc.). A frequency analysis told me that the ciphertext was not produced by any sort of transposition cipher: transposition only reorders letters, so 'e' would still have been the most frequent letter, and it was not. That led me to believe the encryption was either a Caesar or a general substitution cipher. Appendix 3 shows the ciphertext decrypted under every possible Caesar shift. No word or partial word appeared, which meant the plaintext was encrypted using a general substitution cipher. By looking for patterns and comparing them against the frequency analysis, I was able to decrypt the word 'the' and so identify the encrypted letters associated with 't', 'h', and 'e'. At that point, decryption became more difficult, but I got lucky when I discovered that the word 'substitution' fit. From there I was easily able to decode the rest of the ciphertext. If I had known in advance that the plaintext contained the word 'substitution', I would have been able to decipher the text faster. This example leads to an explanation of the known-plaintext attack.

Known-Plaintext Attack

The known-plaintext attack is an attack model for cryptanalysis where the attacker has samples of both the plaintext and its encrypted ciphertext [9]. This increases the probability of success significantly. Applying a known-plaintext word or phrase to the ciphertext alphabet helps the attacker decode the rest. It allows for patterns to be revealed in the ciphertext which leads to the decryption of common words.

Example 2

Plaintext: This is encrypted using a ROT3 Caesar Cipher
Ciphertext: Wklv lv hqfubswhg xvlqj d URW3 Fdhvdu Flskhu
Known-Plaintext: Caesar

By examining the ciphertext in Example 2, we can deduce that our known-plaintext word 'Caesar' corresponds to the ciphertext 'Fdhvdu'. From that, we can relate the letters 'C' to 'F', 'A' to 'D', 'E' to 'H', and so forth. Every pair is three letters apart, so with a little guess-work and some trial and error, you can infer that a Caesar cipher with a rotation of 3 was used to encrypt the plaintext. It is easy to see why this method is preferable to a ciphertext-only attack.
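The deduction in Example 2 can be written out as a short sketch: compare the crib with its ciphertext, check that every letter pair implies the same shift, and undo that shift. The helper names `caesar_shift` and `shift_from_crib` are made up for illustration.

```python
from typing import Optional


def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave everything else alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def shift_from_crib(crib: str, cipher_word: str) -> Optional[int]:
    """Return the single shift that maps the crib to the cipher word, if any."""
    shifts = {
        (ord(c) - ord(p)) % 26
        for p, c in zip(crib.lower(), cipher_word.lower())
    }
    return shifts.pop() if len(shifts) == 1 else None


ciphertext = "Wklv lv hqfubswhg xvlqj d URW3 Fdhvdu Flskhu"
shift = shift_from_crib("Caesar", "Fdhvdu")
recovered = caesar_shift(ciphertext, -shift)
print(shift, recovered)
```

If the letter pairs disagreed about the shift, `shift_from_crib` would return `None`, signaling that the cipher is not a plain Caesar shift.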

Chosen-plaintext Attack

A chosen-plaintext attack presumes that the attacker has the capability to choose arbitrary plaintexts to be encrypted and obtain the corresponding ciphertexts [4]. In other words, the attacker chooses whatever plaintext they want, feeds it into the cryptosystem, and receives the result. Once they have the resulting ciphertext, they can begin to figure out how the cryptosystem encrypts the plaintext. This method is highly dependent on whether or not the attacker can input their own plaintext and receive the results. For example, in World War II, the Allies were not able to use their chosen-plaintext method on the Enigma until they captured an Enigma of their own. In the case of classical cryptography, this method would make decryption of the ciphertext extremely easy due to the simplicity of most ciphers. There are two ways to do a chosen-plaintext attack:

A batch chosen-plaintext attack is where the cryptanalyst chooses a "batch" of plaintexts before any of them are encrypted and then encrypts them all at once.

An adaptive chosen-plaintext attack is where the cryptanalyst makes a series of interactive queries and chooses each new plaintext based on the results of the previous queries.
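To illustrate why a chosen-plaintext attack makes classical ciphers so easy to break, here is a sketch of a batch attack against a monoalphabetic substitution cipher. The encryption oracle, its seed, and the secret message are all hypothetical stand-ins for a cryptosystem the attacker can query but not inspect.

```python
import random
import string


def make_oracle(seed: int):
    """Build a hypothetical encryption oracle with a hidden substitution key."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_uppercase)
    rng.shuffle(shuffled)
    table = dict(zip(string.ascii_uppercase, shuffled))

    def encrypt(plaintext: str) -> str:
        return "".join(table.get(ch, ch) for ch in plaintext.upper())

    return encrypt


encrypt = make_oracle(seed=2012)

# Batch query: the attacker submits the whole alphabet in one chosen plaintext.
chosen_plaintext = string.ascii_uppercase
response = encrypt(chosen_plaintext)

# The response spells out the key, so the decryption table follows directly.
decrypt_table = dict(zip(response, chosen_plaintext))

intercepted = encrypt("ATTACK AT DAWN")
recovered = "".join(decrypt_table.get(ch, ch) for ch in intercepted)
```

A single well-chosen query recovers the entire key here. Modern ciphers are designed so that no such shortcut exists even when the attacker can make many queries.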

Chosen-ciphertext Attack

A chosen-ciphertext attack is a model in which the cryptanalyst gathers information, at least in part, by choosing a ciphertext and obtaining its decryption under an unknown key [3]. In other words, a cryptanalyst using this method feeds their chosen ciphertext into the system and receives the resulting plaintext. This method is the mirror image of a chosen-plaintext attack. Both rely on the attacker being able to obtain the output once their chosen texts are fed into the cryptosystem. The purpose of these attacks is to identify the key that is used to encrypt the data. There are two variations of chosen-ciphertext attacks:

The first variation is the "lunchtime attack." This term is based on the idea that the computer, with the ability to decrypt, is available to the cryptanalyst while the user is "away on lunch." This means the attacker can make queries only until the results become useless or the cryptanalyst gets locked out of the system.

The second variation is an adaptive chosen-ciphertext attack. Similar to the adaptive chosen-plaintext attack, the cryptanalyst chooses each ciphertext based on the resulting query of the previous ciphertext.

Methods

Classical Cryptanalysis

Cryptanalysts use three main methods to break classical ciphers: the index of coincidence, the Kasiski examination, and frequency analysis. When decrypting ciphertext, a good starting point is to find the length of the key used for encryption. The index of coincidence is a useful method for finding the key length of ciphers such as the Vigenère or Beaufort ciphers, which use repeating keys. The cryptanalyst repeatedly rotates the ciphertext one position to the left or right until it has returned to its original position. At each shift, the cryptanalyst compares the shifted ciphertext with the original, locates the positions where the two have identical letters, and records the number of such coincidences. Shifts that are a multiple of the key length typically yield a high number of coincidences. For example, if there are high numbers of coincidences at shifts of two and six, the key length could be either two or six; however, the lack of a high number of coincidences at a shift of four suggests the latter.
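The shift-and-compare procedure can be sketched as follows. The function names are made up for illustration, and the demonstration string is an artificial one with an exact period of three, chosen so the result can be checked by hand rather than a real Vigenère ciphertext.

```python
def coincidences(ciphertext: str, shift: int) -> int:
    """Count positions where the text matches a copy of itself rotated by `shift`."""
    rotated = ciphertext[shift:] + ciphertext[:shift]
    return sum(a == b for a, b in zip(ciphertext, rotated))


def coincidence_profile(ciphertext: str, max_shift: int) -> dict:
    """Map each shift from 1 to max_shift to its coincidence count."""
    return {s: coincidences(ciphertext, s) for s in range(1, max_shift + 1)}


# Toy input with an exact period of 3: every position matches at shift 3.
profile = coincidence_profile("ABCABC", max_shift=3)
print(profile)  # {1: 0, 2: 0, 3: 6}
```

On real ciphertext the peaks are far less clean, so the analyst looks for shifts whose counts stand well above the rest and checks whether they share a common factor.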

The Kasiski examination is similar to finding the index of coincidence. For the Kasiski examination, the cryptanalyst finds repeated sequences of letters in the ciphertext that are three or more characters in length, and then records the distance between the occurrences. Once the distances between all repeated sequences are recorded, the cryptanalyst can look for a common factor among them, which should lead to the length of the key.
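A minimal sketch of the Kasiski examination follows. It records the distance between consecutive occurrences of each repeated three-letter sequence and takes the greatest common divisor of those distances. The function names and the toy input are assumptions for illustration, and using a plain GCD is a simplification: on real ciphertext, accidental repeats can spoil it, so an analyst would normally look for the most frequent factor instead.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_distances(ciphertext: str, length: int = 3) -> list:
    """Distances between consecutive occurrences of each repeated sequence."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for starts in positions.values():
        distances.extend(b - a for a, b in zip(starts, starts[1:]))
    return distances


def likely_key_length(ciphertext: str, length: int = 3) -> int:
    """GCD of all repeat distances, or 0 when nothing repeats."""
    distances = repeat_distances(ciphertext, length)
    return reduce(gcd, distances) if distances else 0


# Toy input: "ABC" occurs at positions 0, 6, and 12.
toy = "ABCDEFABCXYZABC"
print(repeat_distances(toy), likely_key_length(toy))  # [6, 6] 6
```

A result of 6 suggests a key length of 6 or one of its factors, which the cryptanalyst would then test with frequency analysis on each key position.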

As discussed above, frequency analysis is the comparison of the frequency of letters occurring in the ciphertext to the frequency of the letters in the English language.

Symmetric Algorithms

A variety of attacks are available against symmetric algorithms. The list below identifies several of them; some of the more popular ones are discussed afterward.

Boomerang Attack
Brute Force Attack
Davies' Attack
Differential Cryptanalysis
Integral Cryptanalysis
Linear Cryptanalysis
Meet-in-the-middle Attack
Mod-n Cryptanalysis
Slide Attack
XSL Attack

A brute force attack is a method which tries every possible key against the cryptosystem. This method is usually used when the cryptosystem does not present any other weaknesses. Each additional key bit doubles the number of keys to try, so the time required grows exponentially with the size of the key and the attack quickly becomes infeasible.

Table 1 displays the time to run a brute force attack in relation to the length of the key.

Table 1: Brute Force [2]

Symmetric key length vs. brute-force combinations

| Key size in bits | Permutations | Brute-force time for a device checking 2^56 permutations per second |
|---|---|---|
| 8 | 2^8 | <1 nanosecond |
| 40 | 2^40 | 0.015 milliseconds |
| 56 | 2^56 | 1 second |
| 64 | 2^64 | 4 minutes 16 seconds |
| 128 | 2^128 | 149,745,258,842,898 years |
| 256 | 2^256 | 50,955,671,114,250,072,156,962,268,275,658,377,807,020,642,877,435,085 years |
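The times in Table 1 can be checked with a short calculation. This sketch assumes a device that tests 2^56 keys per second, the rate implied by the 56-bit row taking one second, and a 365-day year; both constants are assumptions of the example.

```python
from fractions import Fraction

KEYS_PER_SECOND = 2 ** 56              # assumed rate, implied by the 56-bit row
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def brute_force_seconds(key_bits: int) -> Fraction:
    """Exact time in seconds to try every key of the given size."""
    return Fraction(2 ** key_bits, KEYS_PER_SECOND)


def brute_force_years(key_bits: int) -> int:
    """Whole years needed to try every key of the given size."""
    return int(brute_force_seconds(key_bits) // SECONDS_PER_YEAR)


for bits in (40, 56, 64, 128):
    print(bits, float(brute_force_seconds(bits)), brute_force_years(bits))
```

Going from 56 to 64 bits multiplies the time by 2^8 = 256, which turns one second into 4 minutes 16 seconds. These are worst-case figures for exhausting the whole key space; on average the key is found after about half that time.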

Differential, integral, and linear cryptanalysis are all useful against block and stream ciphers. Differential cryptanalysis is a chosen-plaintext attack. This means the attacker must be able to feed their own plaintexts into the cryptosystem and receive the corresponding ciphertexts. To perform a differential attack, the cryptanalyst feeds pairs of plaintexts with a chosen difference into the cryptosystem and analyzes the difference between the resulting ciphertexts, where the difference is usually defined by the XOR (exclusive OR) operation. Ideally, the ciphertext differences will reveal a statistical pattern in their distribution. Specialized types of differential cryptanalysis are higher-order differential cryptanalysis, truncated differential cryptanalysis, impossible differential cryptanalysis, and the boomerang attack.
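The statistical pattern a differential attack looks for can be seen on a single small component. The sketch below tabulates, for a 4-bit substitution box, how often each input XOR difference leads to each output XOR difference. The S-box values are an illustrative choice for this example and are not taken from the paper; a real attack would chain such patterns through the rounds of a full cipher.

```python
# Illustrative 4-bit S-box (an assumption for this sketch).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def difference_distribution(sbox):
    """table[dx][dy] counts inputs x with sbox[x] XOR sbox[x XOR dx] == dy."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for x in range(size):
        for dx in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            table[dx][dy] += 1
    return table


ddt = difference_distribution(SBOX)

# The most likely output difference for any nonzero input difference.
best_count, best_dx, best_dy = max(
    (ddt[dx][dy], dx, dy) for dx in range(1, 16) for dy in range(16)
)
print(f"input difference {best_dx:#x} gives output difference {best_dy:#x} "
      f"for {best_count} of 16 inputs")
```

For an ideal random mapping every entry would be close to 1; entries well above that are the uneven distribution that the cryptanalyst exploits.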

An integral cryptanalysis attack is very similar to a differential attack, but differs in the plaintexts sent to the cryptosystem. While differential attacks use pairs of plaintexts, integral attacks use sets or multisets of plaintexts in which one portion of each plaintext is held constant while the remaining portion varies through all possible values. The attack then tracks the XOR sum of the set through the cipher; for such a set, this sum is typically zero.
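The zero XOR sum can be demonstrated on a toy example. In this sketch the plaintexts are 8-bit values whose high four bits are fixed while the low four bits run through all 16 possibilities, and the "cipher" is a single made-up step that applies an illustrative 4-bit S-box to the low bits. The S-box, the fixed value, and the toy step are all assumptions for illustration.

```python
from functools import reduce
from operator import xor

# Illustrative 4-bit S-box (an assumption for this sketch).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def toy_round(value: int) -> int:
    """Substitute the low 4 bits and leave the high 4 bits untouched."""
    return (value & 0xF0) | SBOX[value & 0x0F]


# High nibble held constant, low nibble takes every possible value once.
plaintexts = [(0xA << 4) | low for low in range(16)]
outputs = [toy_round(p) for p in plaintexts]

xor_sum = reduce(xor, outputs, 0)
print(xor_sum)  # 0
```

The sum is zero because the constant part appears an even number of times and the varying part still takes every value exactly once after substitution. An attacker checks how many rounds of a real cipher preserve this property.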

Linear cryptanalysis is based on finding affine approximations to the action of a cipher [10]. Linear cryptanalysis develops equations that relate bits of the plaintext and the ciphertext to certain key bits. These equations, together with known plaintext-ciphertext pairs, are used to derive other key bits. The procedure is repeated until the number of unknown key bits is low enough that the cryptanalyst can use a brute force attack on the rest of the cryptosystem.
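The search for a good approximation can be sketched for one small component. For every pair of bit masks, the code below counts how often the XOR of the selected input bits equals the XOR of the selected output bits of a 4-bit S-box, and reports how far each count deviates from the 8 out of 16 expected by chance. The S-box is an illustrative choice and the names are hypothetical.

```python
# Illustrative 4-bit S-box (an assumption for this sketch).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def parity(value: int) -> int:
    """XOR of all bits of value."""
    return bin(value).count("1") % 2


def linear_approximation_table(sbox):
    """table[a][b] = (matches of parity(x & a) == parity(sbox[x] & b)) - 8."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for in_mask in range(size):
        for out_mask in range(size):
            matches = sum(
                parity(x & in_mask) == parity(sbox[x] & out_mask)
                for x in range(size)
            )
            table[in_mask][out_mask] = matches - size // 2
    return table


lat = linear_approximation_table(SBOX)

# The approximation that deviates most from chance, ignoring the trivial masks.
deviation, in_mask, out_mask = max(
    (abs(lat[a][b]), a, b) for a in range(1, 16) for b in range(1, 16)
)
print(f"masks {in_mask:#x} -> {out_mask:#x} deviate from chance by {deviation}/16")
```

An entry of zero means the approximation holds exactly half the time and is useless; the larger the deviation, the fewer known plaintext-ciphertext pairs the attacker needs.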

Hash Functions

Hash functions are widely used in computer science. A hash function is simply a function which maps data of arbitrary size, typically called keys, to values of a fixed length. For example, the letter 'a' in the alphabet could be mapped to the integer 1, 'b' to 2, etc. There are two primary forms of attack used to crack hash functions: the birthday attack and the rainbow table.

A birthday attack exploits the birthday paradox in probability theory. "As an example, consider the scenario in which a teacher with a class of 30 students asks for everybody's birthday, to determine whether any two students have the same birthday (corresponding to a hash collision as described below; for simplicity, ignore February 29). Intuitively, this chance may seem small. If the teacher picked a specific day (say September 16), then the chance that at least one student was born on that specific day is $1 - (364/365)^{30}$, about 7.9%. However, the probability that at least one student has the same birthday as any other student is nearly 70% (using the formula $1 - \frac{365!}{(365-n)! \cdot 365^{n}}$ for $n = 30$)" [1].

The goal of a birthday attack is to find two inputs that yield the same hash value (also known as a hash collision). This is particularly useful when an attacker wants to modify a document that has a digital signature.
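The two probabilities in the classroom scenario can be checked numerically. This is a small sketch; the function names are made up, and like the quoted example it ignores February 29.

```python
def p_specific_day(n: int, days: int = 365) -> float:
    """Chance that at least one of n people was born on one given day."""
    return 1 - ((days - 1) / days) ** n


def p_shared_birthday(n: int, days: int = 365) -> float:
    """Chance that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct


print(round(100 * p_specific_day(30), 1))     # 7.9
print(round(100 * p_shared_birthday(30), 1))  # 70.6
```

The gap between the two numbers is the point of the attack: finding any two inputs that collide is far easier than finding an input that matches one particular hash value.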

Example 3

Mallory creates two documents, X and Y, which have the same hash value. Mallory sends document X to Alice, who verifies the document, signs it, and returns it to Mallory. Mallory then copies the digital signature from document X to document Y and sends document Y to Bob. The signature on document Y matches the document hash, so Bob's software cannot detect any modifications done to the document.

Another method is the rainbow table. This is a precomputed table for reversing hash functions. A rainbow table is typically used for cracking password hashes to recover the plaintext password. The precomputed table covers common or short passwords. Once a password hash has been obtained, it can be looked up in the table to find a matching password. This method becomes infeasible as the length of the password increases or if a "salt" is used. A salt consists of randomly generated bits that are concatenated onto the end of the password before it is hashed. Adding a salt allows two users to have the same plaintext password but completely different hash values.
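The effect of a salt can be shown with a simplified sketch. Note that the table built here is a plain precomputed dictionary from hash to password, which is a stand-in: a true rainbow table stores chains of hashes to save space. The password list, the choice of SHA-256, and the 16-byte salt length are assumptions for illustration.

```python
import hashlib
import os


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Precomputed lookup of common passwords (simplified stand-in for a rainbow table).
common_passwords = ["123456", "password", "letmein", "qwerty"]
lookup = {sha256_hex(p.encode()): p for p in common_passwords}


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the password with the salt appended to its end."""
    return sha256_hex(password.encode() + salt)


# An unsalted hash of a common password is found immediately.
cracked = lookup.get(sha256_hex(b"letmein"))

# Two users with the same password but different random salts.
salt_alice, salt_bob = os.urandom(16), os.urandom(16)
hash_alice = hash_with_salt("letmein", salt_alice)
hash_bob = hash_with_salt("letmein", salt_bob)
found_salted = lookup.get(hash_alice)  # None: the table no longer applies
```

Because the attacker would need a separate table for every possible salt value, precomputation stops paying off, even though the salt itself is stored alongside the hash.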

External Attacks

In cryptanalysis, there are some more callous methods of obtaining information.* Two of these are black-bag cryptanalysis and rubber-hose cryptanalysis. Black-bag cryptanalysis is a euphemism for the acquisition of cryptographic secrets through theft. This method can be carried out with key-logging software, trojan horses, or any other sort of malicious software. Even the act of breaking into the victim's house and copying down a password recklessly left on a desk is considered a method of black-bag cryptanalysis. In rubber-hose cryptanalysis, the cryptographic secrets are obtained from the person by use of coercion or torture. Many countries routinely use both of these methods to obtain information because, in most cases, the weakest link is human. Due to the lack of subtlety, attackers may forgo such tactics in favor of more covert, mathematical counterparts.

*The author of this paper does not condone any of these external methods of cryptanalysis

Conclusion

Cryptography has been around for centuries and as a result, so has cryptanalysis. Essentially, they are two sides of the same coin and will be in a perpetual state of competition. When cryptographers develop more secure ciphers, cryptanalysts will be on the other side, developing stronger algorithms to decrypt the ciphers. Already, there are many forms of cryptanalysis to use for the increasing number of situations. Some may be old, but are still widely used. From classical ciphers like the Scytale to the modern ciphers like RSA, cryptanalysts will continually try to find ways to crack the code. Through methods like rubber-hose cryptanalysis, not even your mind is safe.

References

[1] Birthday Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Birthday_attack

[2] Brute Force Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Brute_force_attack

[3] Chosen-Ciphertext Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Chosen-ciphertext_attack

[4] Chosen-Plaintext Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Chosen-plaintext_attack

[5] Ciphertext-Only Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Ciphertext-only_attack

[6] Cryptanalysis (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Cryptanalysis#Quantum_computing_applications_for_cryptanalysis

[7] Cypher Research Laboratories. (2006, January 24). History of Cryptography. Retrieved from http://www.cypher.com.au/crypto_history.htm

[8] Gaines, Helen Fouche. (1956). Cryptanalysis - A Study of Ciphers and Their Solutions. New York, NY: Dover Publications, Inc.

[9] Known-Plaintext Attack (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Known-plaintext_attack

[10] Linear Cryptanalysis (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Linear_cryptanalysis

[11] Murk. (2004, October 16). Index of Coincidences - A Worked Example. Retrieved from http://www.murky.org/blg/2004/10/index-of-coincidences-a-worked-example/

[12] Pincock, Stephen. (2006). Codebreaker. New York, NY: Walker Publishing Company, Inc.

[13] Rainbow Tables (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Rainbow_table#Rainbow_tables

[14] Scytale (n.d.). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Scytale

Appendix

Appendix 1

Appendix 2

Appendix 3