
CSCI 654 - Foundations of Parallel Programming  

Team ThreadRippers  

Final Project Report  

Password Cracking using Rainbow Tables   

Omkar Kakade Indrajeet Vidhate  <[email protected]> <[email protected]>


Index

Overview
Computational Problem
Analysis of Research Paper - 1
Analysis of Research Paper - 2
Analysis of Research Paper - 3
Design and Operation of Sequential Program
Design and Operation of Parallel Program
Developer’s Manual ( Compile )
User’s Manual
Strong Scaling Performance Data
Weak Scaling Performance Data
Possible Future Work
Learnings
Individual Contribution
References


1. Overview

Passwords have long been used to provide authenticated access to a system. Because the security of the passwords themselves is critical to their usefulness, they cannot be stored as plain text; systems store them as cryptographic hashes instead. The goal of this project is to crack these hashes. Traditionally, a password cracking program uses brute force, which tries every possible combination of the character set, or a dictionary-based attack, which tries every entry of a wordlist. An alternative approach is a cracking technique that employs Rainbow Tables. A Rainbow Table is a precomputed table for reversing cryptographic hashes. It is a practical example of a space-time tradeoff: we save computation time during cracking at the cost of more storage.

2. Computational Problem

Brute force and similar techniques take time because of the large search space they explore to crack a password. In a brute force attack, the attack program tries all possible combinations of a character set over a fixed length. The alternative is a dictionary-based attack, which uses a wordlist containing words, phrases, common passwords, and other strings that can be used as a password. Every word in the wordlist is hashed and compared to the password hash until a match is found. Although dictionary attacks are fairly fast, they are ineffective against passwords that are not based on a dictionary word. These techniques are computationally expensive: the brute force search space grows exponentially with the password length, with the size of the character set as the base.
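To make the size of the brute force search space concrete, the following is a minimal sketch that computes the number of candidates as (character set size) raised to the password length. The class name Keyspace and its method are hypothetical and are not part of the project's code; the 7-digit case matches the numeric passwords used later in this report.

```java
import java.math.BigInteger;

class Keyspace {
    // Number of candidate passwords of exactly `length` symbols drawn from
    // a character set of `charsetSize` symbols: charsetSize ^ length.
    static BigInteger size(int charsetSize, int length) {
        return BigInteger.valueOf(charsetSize).pow(length);
    }

    public static void main(String[] args) {
        // 7-digit numeric codes, as targeted in this project: 10^7 candidates.
        System.out.println("7 digits:            " + size(10, 7));
        // Larger character sets and lengths grow the space very quickly.
        System.out.println("7 lowercase letters: " + size(26, 7));
        System.out.println("8 printable ASCII:   " + size(95, 8));
    }
}
```

Adding one more character to the password multiplies the space by the character set size, which is why exhaustive search quickly becomes impractical.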


A Rainbow Table is a precomputed table for reversing cryptographic hash functions. A single chain in a rainbow table is represented by two plaintexts: one that marks the start of the chain and one that marks its end. The main computational problem is to generate these chains by performing multiple rounds of hashing and reduction, and then to crack passwords by looking up the hash to be cracked in the rainbow table. If the lookup matches the end point of a chain, that chain is regenerated from its start point and the plaintext of the hash to be cracked is found within it. The main elements of the program, chain generation and hash cracking, require computational optimization to yield a significant improvement over the other techniques, which is the main challenge of this approach.


3. Analysis of Research Paper - 1

Title - An improved parallel implementation of Rainbow Crack using MPI [1]

This paper looked for bottlenecks in a password cracking program that could be improved with a parallel implementation. The biggest bottleneck was found to be the precomputation part of the program, i.e. Rainbow Table generation. This process took the longest time and was worth improving. In the parallel implementation, the computation of individual chains (the outermost loop) could be parallelized, since this computation has no sequential dependency.

Another part of this bottleneck was writing the Rainbow Table to a file. Traditionally, the first value of a chain was written to the file as soon as it was generated, then the transformations were calculated, and finally the last value of the chain was written to disk. The authors changed this so that the first and last values are written in one operation, and evaluated three models:

1- Multiple processors, each process writing to a separate file (using the native file library).

2- Multiple processors, each process writing to a separate file (using MPI file library functions).

3- Multiple processors, one file.

The paper concluded that the first model was the best choice, as the second model performed poorly (probably due to the MPI_File_write() library function).

For our project, this means that we can parallelize individual chain generation, since each chain can be computed independently. Chain writing can also be handled with a per-thread reduction variable, which should yield better performance by eliminating communication overhead.


4. Analysis of Research Paper - 2

Title - Rainbow Table optimization for Password recovery [2]

This paper proposes a way to save storage compared with the original Rainbow Table scheme. Both approaches use the same number of reduction functions, but the paper introduces one-to-many mappings in the hash-reduction chains. The approach computes the hash of a starting plaintext and takes its integer representation H. It then computes (H+1) mod 2^j up to (H+K) mod 2^j, where j is the number of bits in the hash (e.g. j = 128 for MD5). Normal hash-reduce chains are then computed starting from each of these branches. All the plaintexts obtained this way are represented by a single stored plaintext.

In our project, we can assign chunks of starting plaintexts to different threads in a parallel for loop, since there is no sequential dependency between chains.

The chain generation technique suggested in the paper reduces the probability of collisions and the subsequent merging of chains.
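The branching step described in this section can be illustrated with a short sketch. It only shows the arithmetic (H+1) mod 2^j up to (H+K) mod 2^j on the integer value of a hash; the class and method names are hypothetical, and how the paper turns each branch into a chain is not reproduced here.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

class HashBranches {
    // Returns (H+1) mod 2^j, ..., (H+K) mod 2^j, where H is the unsigned
    // integer value of `hash` and j is the hash width in bits.
    static List<BigInteger> branches(byte[] hash, int k) {
        int j = hash.length * 8;                           // e.g. 128 for MD5, 256 for SHA-256
        BigInteger modulus = BigInteger.ONE.shiftLeft(j);  // 2^j
        BigInteger h = new BigInteger(1, hash);            // signum 1: treat the bytes as unsigned
        List<BigInteger> result = new ArrayList<>();
        for (int i = 1; i <= k; i++) {
            result.add(h.add(BigInteger.valueOf(i)).mod(modulus));
        }
        return result;
    }

    public static void main(String[] args) {
        // A one-byte stand-in "hash" (j = 8) makes the wrap-around visible:
        // H = 255, so the branches are 0, 1, 2.
        System.out.println(branches(new byte[] {(byte) 0xFF}, 3));
    }
}
```

The modulus keeps every branch inside the j-bit range of the hash function, so a branch can be fed to the same reduction functions as an ordinary hash value.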

5. Analysis of Research Paper - 3

Title - Heterogeneous Rainbow Table Widths Provide Faster Cryptanalyses [3]

Rainbow tables are not well utilized if all tables have the same chain width, i.e. an equal number of chains with an equal number of columns. This can introduce performance problems when a sequential search is applied.

Heterogeneous tables use a new table structure compared to the traditional one: the structure permits multiple rainbow tables of different chain lengths (widths). According to the paper, heterogeneous tables are 40% faster than homogeneous tables.

For rainbow tables of heterogeneous length, the traditional sequential table search is not optimal for performance compared to a parallel search over the tables. Since our algorithm searches the tables in parallel, it can benefit from the heterogeneous table structure.


6. Design and Operation of Sequential Program 

 

To develop a sequential algorithm for cracking passwords using Rainbow Tables, the major steps in the program’s design are as follows. These steps are also common to the parallel design, where each step is executed in a multi-threaded way instead of sequentially.

1. Generate the Rainbow Table.

● Identify the best size for the Rainbow Table. A general guideline from our research is to cover double the possible input space.

● Assign a random plaintext from the possible passwords to the start of each row of the Rainbow Table. An observation from our research is to generate the random plaintexts using a normal (Gaussian) distribution, so that more of the likely passwords are covered by the table.

● Select a hash function, which is an essential component for generating individual chains. In our project we used SHA-256.

● Design good reduction functions R_1 to R_TABLE_WIDTH, which convert the hash output of SHA-256 back to a plaintext in the possible password domain.

● In the sequential program, a RainbowChainGenerator object generates each chain, taking the chain's index within the Rainbow Table as input.

● Once these factors are decided, we can proceed to Rainbow Table generation. A valuable insight for the parallel design is that generating one chain has no sequential dependency on any other chain.

2. Generate the hashes to crack.

● Generate SHA-256 hashes of randomly generated 7-digit integers, drawn according to the distribution supported by ‘java.util.Random’. The output of the SHA-256 message digest is a byte array of size 32.

● Use ‘edu.rit.util.Hex’ to convert the byte array output to a hex string representation, and store the strings in a list, toCrack.

3. Crack the hashes.

● Iterate over the toCrack list of pregenerated hashes and crack each password sequentially.

● The password cracking functionality is provided by the PCracker class, which provides a method that takes one hash as input and returns the plaintext password if found.
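The hashing and reduction building blocks from the steps above can be sketched with the JDK alone. This is an illustration under stated assumptions, not the project's code: the reduction function and its ROUND_MULTIPLIER constant are hypothetical, and a small toHex helper stands in for edu.rit.util.Hex so that the example has no external dependency.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashAndReduce {
    static final BigInteger DOMAIN = BigInteger.valueOf(10_000_000L);          // all 7-digit codes
    static final BigInteger ROUND_MULTIPLIER = BigInteger.valueOf(1_000_003L); // hypothetical mixing constant

    // SHA-256 digest of a plaintext; the result is always 32 bytes long.
    static byte[] sha256(String plaintext) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(plaintext.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Lowercase hex string of a byte array (stand-in for edu.rit.util.Hex).
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Assumed reduction R_round: maps a hash back into the 7-digit password
    // domain. Mixing in the round number makes each round's mapping different.
    static String reduce(byte[] hash, int round) {
        BigInteger mixed = new BigInteger(1, hash)
                .add(ROUND_MULTIPLIER.multiply(BigInteger.valueOf(round)));
        return String.format("%07d", mixed.mod(DOMAIN).longValue());
    }

    public static void main(String[] args) {
        byte[] hash = sha256("1234567");
        System.out.println("hash     = " + toHex(hash));
        System.out.println("R_1(hash) = " + reduce(hash, 1));
        System.out.println("R_2(hash) = " + reduce(hash, 2));
    }
}
```

Using a different mapping in every round is what distinguishes a rainbow table from a classic hash chain table: two chains that collide in different columns do not merge.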


Following is a high-level design of our sequential program:

    Generate N hashes to crack and store them in a linked list ‘toCrack’
    Create 1 object of RainbowChainGenerator
    Create 1 object of PCracker
    For (i = 0 to Height - 1):
        rainbowTable[i][0] = randomly generated 7 digit number
    For (i = 0 to Height - 1):
        rainbowTable[i][1] = rainbowChainGenerator.generateChain(rainbowTable[i][0])
    For (Hash h in toCrack):
        // Pass each hash to the cracker
        password = pCracker.crack(h)
        If (password != empty)
            Print(h -> password)
        Else
            Print(h -> “”)

Algorithm for individual chain generation in rainbowTable:

    chainGenerator(String start):
        startPoint = start
        for (roundNo = 1 to chainWidth - 1) {
            hash = SHA-256 of startPoint
            reduced = reduce(hash, roundNo)
            startPoint = reduced
        }
        // The last calculated string is the end point of the chain
        return startPoint

Algorithm for cracking a hash given the precomputed rainbow table:

    crack(String hexRepresentation of SHA-256):
        Convert the given hex representation of SHA-256 to a byte array of size 32
            using edu.rit.util.Hex.toByteArray()
        password = “”
        cracked = false
        // Start from the last round's reduction function and move backwards
        for (i = chainWidth - 1 down to 1 AND not cracked) {
            candidate = reduce(given hash, i)
            for (j = i + 1 to chainWidth - 1) {
                // Repeat the hash -> reduction pattern in exactly the same way
                // as the chains were calculated while generating the rainbow table
                candidate = reduce(SHA-256 of candidate, j)
            }
            for (k = 0 to rainbowTableHeight - 1 AND not cracked) {
                if (rainbowTable[k][chain end] == candidate) {
                    Regenerate chain k from rainbowTable[k][0]
                    While regenerating that chain, if (hash of a plaintext == given hash) {
                        password = plaintext
                        cracked = true
                    }
                }
            }
        }
        return password

If the password returned is “”, the password was not cracked; otherwise we print the hash to plaintext mapping.
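The chain generation and cracking algorithms above can be put together in a compact, runnable sketch. It is an illustration under assumptions rather than the project's RainbowChainGenerator and PCracker classes: the class name RainbowSketch, the reduction function, and its ROUND_MULTIPLIER constant are hypothetical, rounds are numbered 1 to WIDTH - 1, and the table is a plain array searched linearly.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class RainbowSketch {
    static final int WIDTH = 20;                                               // chain width
    static final BigInteger DOMAIN = BigInteger.valueOf(10_000_000L);          // 7-digit codes
    static final BigInteger ROUND_MULTIPLIER = BigInteger.valueOf(1_000_003L); // hypothetical

    static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed round-dependent reduction: hash -> 7-digit plaintext.
    static String reduce(byte[] hash, int round) {
        BigInteger mixed = new BigInteger(1, hash)
                .add(ROUND_MULTIPLIER.multiply(BigInteger.valueOf(round)));
        return String.format("%07d", mixed.mod(DOMAIN).longValue());
    }

    // Applies hash -> reduce for rounds 1 .. WIDTH-1 and returns the end point.
    static String generateChain(String start) {
        String current = start;
        for (int round = 1; round < WIDTH; round++) {
            current = reduce(sha256(current), round);
        }
        return current;
    }

    // table[i][0] is the start point of chain i, table[i][1] its end point.
    static String[][] buildTable(String... starts) {
        String[][] table = new String[starts.length][2];
        for (int i = 0; i < starts.length; i++) {
            table[i][0] = starts[i];
            table[i][1] = generateChain(starts[i]);
        }
        return table;
    }

    // Regenerates one chain and returns the plaintext whose hash equals
    // the target, or null if the end point match was a false alarm.
    static String searchChain(String start, byte[] target) {
        String current = start;
        for (int round = 1; round < WIDTH; round++) {
            byte[] hash = sha256(current);
            if (Arrays.equals(hash, target)) {
                return current;
            }
            current = reduce(hash, round);
        }
        return null;
    }

    // Returns the cracked plaintext, or "" if the table does not cover it.
    static String crack(String[][] table, byte[] target) {
        for (int pos = WIDTH - 1; pos >= 1; pos--) {
            // Assume the target hash sat just before reduction round `pos`,
            // then finish the chain exactly as during table generation.
            String candidate = reduce(target, pos);
            for (int round = pos + 1; round < WIDTH; round++) {
                candidate = reduce(sha256(candidate), round);
            }
            for (String[] row : table) {
                if (row[1].equals(candidate)) {
                    String found = searchChain(row[0], target);
                    if (found != null) {
                        return found;
                    }
                }
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String[][] table = buildTable("0000001", "1234567", "7654321");
        // A password known to be covered: the third plaintext of the second chain.
        String covered = reduce(sha256(reduce(sha256("1234567"), 1)), 2);
        System.out.println(covered + " cracked as \"" + crack(table, sha256(covered)) + "\"");
    }
}
```

An end point match does not guarantee success, because two different chains can share an end point; this is why searchChain may return null and the search then continues with the remaining rows and positions.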

7. Design and Operation of Parallel Program 

The task flow is mainly analogous to the sequential task flow, but each task is parallelized in the parallel design as follows.

Description:

1] Because each chain in the rainbow table can be generated independently of the chains at other indexes, we designed a pj2 parallel for loop in which a parallel thread team works on chunks of indexes within the height of the rainbow table, under a dynamic schedule.

2] A parallel overlapping pattern based on the pj2 parallel work queue is implemented. One thread generates work items (hex representations of the SHA-256 hashes of randomly generated 7-digit numeric passwords) and puts them in the queue. At the same time, thanks to the overlapping, a team of threads running a parallel for loop takes work items from the queue, tries to crack each password, and prints the result on the console.

    Declare a parallel work queue to process hashes in parallel
    Declare a ‘volatile’ List to store cracked passwords in parallel
    Initialize the random starting plaintext of each chain

    parallelFor (i = 0 to rainbowTable[height] - 1) {
        start() { Create a separate RainbowGenerator object for each thread }
        run(i) {
            // Each thread executes some chunk of the indexes
            rainbowTable[i][1] = generateChain(rainbowTable[i][0])
        }
    }

    parallelDo (
        new Section {
            run() { Generate each hash and put it in the parallel work queue }
        },
        new Section {
            run() {
                // Taking items from the queue provides inherent dynamic scheduling
                parallelFor (parallel work queue) {
                    start() { Create a new PasswordCracker object for each thread }
                    run(item) { Crack the password of the item taken from the queue }
                }
            }
        }
    )
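The two parallel patterns above can be sketched with standard JDK classes. This is not the pj2 API and not the project's code: a parallel stream stands in for the pj2 parallel for loop, a thread pool with a blocking queue stands in for the parallelDo sections and the parallel work queue, and the class OverlapSketch and its methods are hypothetical. Scheduling details therefore differ from pj2's dynamic schedule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.UnaryOperator;
import java.util.stream.IntStream;

class OverlapSketch {
    static final String END = ""; // end-of-input marker; a real hex hash is never empty

    // Pattern 1: chains are independent, so a parallel loop over the row
    // indexes is enough. Each iteration writes only its own array slot.
    static String[] generateEndpoints(String[] starts, UnaryOperator<String> generateChain) {
        String[] ends = new String[starts.length];
        IntStream.range(0, starts.length).parallel()
                .forEach(i -> ends[i] = generateChain.apply(starts[i]));
        return ends;
    }

    // Pattern 2: one producer puts hashes in a queue while `workers`
    // consumers take and crack them at the same time (overlapping).
    static List<String> crackAll(List<String> hashes, UnaryOperator<String> crack, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Queue<String> results = new ConcurrentLinkedQueue<>(); // thread-safe result store
        ExecutorService pool = Executors.newFixedThreadPool(workers + 1);

        pool.execute(() -> {                       // producer section
            for (String hash : hashes) {
                queue.add(hash);
            }
            for (int w = 0; w < workers; w++) {
                queue.add(END);                    // one marker per consumer
            }
        });
        for (int w = 0; w < workers; w++) {        // consumer section
            pool.execute(() -> {
                try {
                    // A free thread takes the next item, which balances uneven work.
                    for (String h = queue.take(); !h.equals(END); h = queue.take()) {
                        results.add(h + " -> " + crack.apply(h));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        // Dummy functions stand in for chain generation and hash cracking.
        System.out.println(Arrays.toString(generateEndpoints(new String[] {"a", "b"}, s -> s + "!")));
        System.out.println(crackAll(Arrays.asList("h1", "h2", "h3"), s -> s.toUpperCase(), 2));
    }
}
```

The order of the results depends on thread timing, so callers that need the original order would have to carry an index with each work item.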


8. Developer’s Manual ( Compile )

1. Download Java JDK version 1.7 or higher to compile the source code.

2. Download the Parallel Java 2 Library from https://www.cs.rit.edu/~ark/pj2.shtml

3. Set the classpath to the pj2 library using the command:

export CLASSPATH=.:<PATH_TO_PJ2_JAR>

4. Change the directory to folder titled “source” from the main folder titled “ThreadRippers”.

5. Compile all the files using Java compiler:

javac *.java

6. This step is needed only to execute the program on the cluster: create a jar file with the generated classes. To create the jar file, execute:

jar cf proj.jar *.class

7. Use the commands from user’s manual to run the program.


9. User’s Manual

1. After performing the above steps, execute a program (for example RainbowHashCrackerSmp) using the following generic command:

java pj2 jar=<jar> cores=<K> RainbowHashCrackerSmp <N>

<K> = number of cores

<N> = number of passwords to be cracked

Example of running the sequential version <Seq> with console output written to a file:

java pj2 debug=makespan jar=proj.jar RainbowHashCrackerSeq 50 > seq_output.txt

Example of running the parallel version <Smp> with console output written to a file:

java pj2 cores=12 jar=proj.jar RainbowHashCrackerSmp 100 > smp_output.txt

 

 

 


 

10. Strong Scaling Performance Data  

   

    

   


Values for strong scaling: problem sizes ranging from N = 1000 to N = 5000, tested on 1, 2, 3, 4, 5, 6, 9, 10, and 12 cores.

 


Possible reasons for non-ideal strong scaling

1. The computation time varies widely from hash to hash. For example, one hash may lie in the first chain and be found in O(20) steps (the length of one chain), whereas another may require searching the whole table, with complexity O(9000000 * 20) (the total area of the rainbow table). Because we generate the hashes randomly, the workload is exposed to this variation.

2. The reduction function, which plays a very important part in performance, also requires differing computation time. It combines the current round number with moderately big primes and the SHA-256 hash value of the password. When we measured the time to perform the reduction function for different passwords, it varied greatly, because each computation takes in a very different hash value.

3. Our rainbow tables are not fully optimized compared to professional, industry-level Rainbow Tables. After many trials we were able to reduce chain collisions and redundant chain generation only by a very small amount (around 32% of collision measured). In short, the poorly optimized rainbow table led to redundant computation and hence hampered strong scaling further.

4. Using a hash table instead of an array could be very efficient: the end point of each chain would be the key and the index at which that chain starts would be the value. Instead of comparing the given hash with each chain’s end point sequentially, each lookup would then take a roughly constant amount of time. However, because of chain collisions, thousands of chains in our rainbow table end with the same plaintext while containing possibly different password values. Representing all of them with a single key would mean discarding the values of all previous chains ending with that plaintext. This inability to use a hash table as an efficient data structure (as our rainbow tables are not clean) is also a potential cause of the low strong scaling efficiency.
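One possible way around the problem in item 4, which was not implemented or measured in this project, is to map each end point to a list of all chain indexes that share it, so that colliding chains are kept rather than overwritten. The sketch below assumes the table layout used in this report (column 0 holds the start point, column 1 the end point); the class EndpointIndex is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EndpointIndex {
    // End point -> indexes of every chain that ends with it.
    private final Map<String, List<Integer>> rowsByEndpoint = new HashMap<>();

    // table[i][0] is the start point of chain i, table[i][1] its end point.
    EndpointIndex(String[][] table) {
        for (int row = 0; row < table.length; row++) {
            rowsByEndpoint
                    .computeIfAbsent(table[row][1], key -> new ArrayList<>())
                    .add(row);
        }
    }

    // All chains to regenerate for a candidate end point; empty if none match.
    List<Integer> rowsEndingWith(String endpoint) {
        return rowsByEndpoint.getOrDefault(endpoint, Collections.emptyList());
    }

    public static void main(String[] args) {
        String[][] table = {
            {"0000001", "7777777"},
            {"0000002", "7777777"},   // collides with the chain above
            {"0000003", "1111111"},
        };
        EndpointIndex index = new EndpointIndex(table);
        System.out.println(index.rowsEndingWith("7777777")); // both colliding chains
        System.out.println(index.rowsEndingWith("9999999")); // no chain ends here
    }
}
```

The lookup itself becomes roughly constant time, but every chain in the returned list still has to be regenerated, so heavy collisions would continue to cost redundant work.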


11. Weak Scaling Performance Data  

Plots for weak scaling data on 5 problem sizes 

   

   

   


Values for weak scaling: problem sizes ranging from N = 100 to N = 500 on 1 core, increasing in proportion to the number of cores.

 

   


Possible reasons for non-ideal weak scaling

1. The main contributing reason for non-ideal weak scaling is again the highly non-uniform computational needs of different passwords, which depend on characteristics such as the SHA-256 hash value and the round number, together with the vastly differing complexities of rainbow table searches (from O(1) to O(9000000*2)).

2. Again, the poorly optimized rainbow table caused many redundant computations. Using a more sophisticated rainbow table that is cleaned and optimized, as well as using hash tables (which is currently not possible for us due to the large number of chain collisions), could improve weak scaling performance.

12. Possible Future Work 

- Research the parameters of the Rainbow table (height and width) to find the optimal trade-off between the accuracy (number of passwords cracked correctly) and the efficiency (scaling performance) of the program. With a larger rainbow table, higher accuracy is obtained, but at the cost of the efficiency of the program.

- Our current approach was implemented to crack 7-digit numeric codes. The next step would be to incorporate passwords of different character sets and lengths to enlarge the space that this program can crack.

- We can design a cluster parallel program in which each password is cracked on a different node, and on each node we can parallelize the inner k loop so that an individual password is also cracked in parallel.


13. Learnings

The main takeaway from this project is that we learnt how Rainbow tables work for password cracking, covering both their generation and the cracking part. We also learnt that if different parts of the same data structure (in our case the rainbow table) can be generated independently of each other, a parallelFor loop is an efficient design choice. When sequential data generation or sequential reading of data from a file is required, the best practice is to implement the overlapping pattern. We learnt how a parallel work queue can be used as a bridge in parallel overlap, so that processing can run in parallel with file reading or input generation. Although not very successfully, we also learnt how to apply different rounds of the reduction function both to generate Rainbow tables and in the password cracking phase.

14. Individual Contribution

Indrajeet: Core framework and design of the sequential and parallel implementations.

Omkar: Object-oriented design and implementation, and the weak and strong scaling performance data.

Both contributed to the presentations and the final report. We split up the different sections that needed to be done and combined them towards the end. Overall, we feel that the workload was quite evenly distributed between us throughout the research investigation.


15. References

[1] Edward R. Sykes, Wesley Skoczen, “An improved parallel implementation of Rainbow Crack using MPI” - https://www.sciencedirect.com/science/article/pii/S1877750313001233

[2] Vrizlynn L.L. Thing, Hwei-Ming Ying, “Rainbow Table optimization for Password recovery” - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.674.1355&rep=rep1&type=pdf

[3] Gildas Avoine, Xavier Carpent, “Heterogeneous Rainbow Table Widths Provide Faster Cryptanalyses” - http://dx.doi.org/10.1145/3052973.3053030

[4] Philippe Oechslin, “Making a Faster Cryptanalytic Time-Memory Trade-Off” - http://dx.doi.org/10.1145/3052973.3053030
