![Page 1: RM World 2014: Similarity assessment and resume analysis](https://reader034.vdocument.in/reader034/viewer/2022050707/54814947b37959442b8b5dd4/html5/thumbnails/1.jpg)
RapidMiner World 2014
Similarity Assessment and
Resume Analysis using Clustering
and Cosine Similarity Measures in
RapidMiner
Surabhi Lodha
Santosh Vishwakarma
PROBLEM STATEMENT
• Hiring new people is a major challenge for every company.
• The pool of resumes a company receives for a job opening is far larger than the number of people assigned to review it.
SOLUTION
• NEED FOR A TEXT MINING MODEL
• SORTING AND FILTERING OF KEYWORDS
• CATEGORISATION OF RESUMES FOR BETTER PROCESSING
WHY RAPIDMINER
• RapidMiner is an open-source software package for predictive analytics.
• It is a solid and complete package with flexible and affordable support options.
• It offers enterprise-ready performance and scalability for big data analytics, along with innovative analyst support.
• We can program by piping components together in graphical ETL workflows.
• RapidMiner is very powerful thanks to its learning operators and operator framework, which allow nearly arbitrary processes to be built.
DATASET
• RESUMES OF GRADUATE STUDENTS OF VARIOUS STREAMS
– CSE: 300
– CIVIL ENGG: 225
– ELECTRICAL: 200
– MECHANICAL: 250
OUR APPROACH
PREPROCESSING OF RESUME DATASET
• TOKENISING
• STEMMING
• REMOVAL OF STOP WORDS
• INVERTED INDEX
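The four preprocessing steps above can be sketched in plain Python. This is only an illustration, not the RapidMiner process used in the talk: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer such as Porter), and the two sample resumes are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real process would use a much larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Strip a few common suffixes (illustrative stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenise, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenise(text) if t not in STOP_WORDS]

def build_inverted_index(docs):
    """Map each stemmed term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in preprocess(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Made-up sample resumes, for illustration only.
resumes = {
    "r1": "Designed and tested embedded controllers",
    "r2": "Testing of bridges and designing foundations",
}
index = build_inverted_index(resumes)
```

Looking up a stemmed keyword in `index` returns the ids of the resumes that mention it, which is the basis for keyword sorting and filtering.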
PERFORM CLUSTERING USING K-MEANS
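As a rough illustration of the clustering step, here is a minimal k-means (Lloyd's algorithm) in plain Python. It is a sketch under stated assumptions, not RapidMiner's operator: the term-count vectors and their vocabulary are invented, Euclidean distance is assumed, and centroids are seeded with the first k points only to keep the example deterministic.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def k_means(points, k, max_iter=100):
    """Run Lloyd's algorithm; return (cluster label per point, centroids)."""
    # Assumption: seed with the first k points so the sketch is deterministic.
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return assignments, centroids

# Hypothetical term counts over the vocabulary ["java", "sql", "concrete", "beam"].
vectors = [
    [3, 2, 0, 0],
    [0, 0, 2, 3],
    [2, 3, 0, 0],
    [0, 1, 3, 2],
]
labels, centres = k_means(vectors, k=2)
```

With these toy vectors the software-flavoured resumes and the civil-flavoured resumes end up in separate clusters.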
RESULT ANALYSIS
COMPARISONS BETWEEN CLUSTERS
COMPARISONS AMONG CLUSTERS
DATA SIMILARITY BETWEEN RESUMES
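The cosine similarity measure named in the title can be sketched as follows. The example assumes raw term counts as vector weights and uses made-up token lists; the weighting actually used in the RapidMiner process is not stated on the slides.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up, already preprocessed resumes, for illustration only.
resume_1 = ["java", "sql", "java"]
resume_2 = ["java", "python"]
similarity = cosine_similarity(resume_1, resume_2)
```

A score near 1 means two resumes use almost the same vocabulary in similar proportions, and a score of 0 means they share no terms.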
CONCLUSIONS
• Reduces the workload of HR
• The project focuses on resume analysis by applying a clustering algorithm to a resume dataset using the RapidMiner tool
• Enables selection of the best resume in minimum time
THANKS