eRecruiter Expert System
Presenters:
Date:
Agenda
• Review (Wei, 2 mins)
– Problem domain
– Overview of the system
• Milestones (Jon S., 2 mins)
– Timeboxes
– Deliverables
• Meetings with experts (Max and/or Jon M., 2 mins)
– With Steve Saunder
• Nuts and Bolts (all, 8 mins)
– Work division
– Implementation of each part of the system
• Demo and discussion (Jon S., 6 mins)
Introduction and Overview
eRecruiter
• Problem domain:
– eRecruiter is an expert system that helps judge a resume according to knowledge extracted from a human expert.
• As an expert system:
– Facts from resumes.
– Templates to define the structure of facts and knowledge.
– Inference rules for scoring and weighting facts and making decisions.
– Explanation for explaining the results of judgments.
• Use cases of the system:
– Quickly create a pool of qualified resumes.
– Rank resumes.
– Judge an individual resume.
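As a rough illustration of how scoring, weighting, a decision, and an explanation could fit together, here is a minimal Python sketch. It is not the eRecruiter rule base: the quality names, scores, weights, and the 0.6 threshold are all hypothetical, and the real system expresses this logic as CLIPS inference rules.

```python
from dataclasses import dataclass


@dataclass
class Quality:
    name: str      # hypothetical quality name, e.g. "leadership"
    score: float   # assumed to be normalised to the range 0..1
    weight: float  # assumed relative importance set by the expert


def judge(qualities, threshold=0.6):
    """Return (weighted score, decision, explanation lines)."""
    total_weight = sum(q.weight for q in qualities)
    total = sum(q.score * q.weight for q in qualities) / total_weight
    explanation = [
        f"{q.name}: score {q.score:.2f} x weight {q.weight:.1f}"
        for q in qualities
    ]
    return total, total >= threshold, explanation


def rank(resumes, threshold=0.6):
    """Rank resumes (a dict of id -> qualities) by weighted score, best first."""
    scored = [(judge(qs, threshold)[0], rid) for rid, qs in resumes.items()]
    return [rid for _, rid in sorted(scored, reverse=True)]


resumes = {
    "A": [Quality("leadership", 0.5, 2.0), Quality("skills", 1.0, 2.0)],
    "B": [Quality("leadership", 0.25, 2.0), Quality("skills", 0.5, 2.0)],
}
score_a, qualified_a, why_a = judge(resumes["A"])
ranking = rank(resumes)
```

The same `judge` result serves all three use cases: the boolean decision filters a pool, the score orders a ranking, and the explanation lines justify an individual judgment.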
System design: components
1. Facts generation
2. Run CLIPS
3. Explanation
Step 3-1 Generate facts
wxPython and Python
Beautifulsoup, NLTK and Python
Step 3-2 Run CLIPS
Python and PyCLIPS
Step 3-3 Explanation
Python and wxPython
Milestones
• Jon S. part goes from here
Meetings with experts
• Max and Jon M. part goes here
Work divisions (pls edit based on your needs:))
• Individual accomplishments:
– Max and Jon M.:
– Jon S.:
– Wei: resume formatting, resume parsing, resume CLIPS facts generation.
• Shared accomplishments:
– Discussion on the overall design of the system.
– Preparation of knowledge base.
– Discussion on facts structure and inference rules.
– Discussion on scoring strategy and explanation system.
– Timebox, deliverables, expert contact and group meetings.
Nuts and Bolts Part 3-1
Resume parsing and facts generation
NLTK and Beautifulsoup
• NLTK (Natural Language Toolkit) is used to extract resume facts based on linguistic patterns.
– “(I) Worked on Ruby on Rails application creating matching algorithms and UPC database.”
– I/PRP worked/VBD on/IN Ruby/NNP on/IN Rails/JJ application/NN creating/VBG matching/VBG algorithms/NNS and/CC UPC/NN database/NN ./.
• BeautifulSoup is a Python library for handling DOM objects.
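The tagged sentence above can be turned into candidate facts by filtering on part-of-speech tags. The sketch below works directly on the tagged string from the slide so that it runs without NLTK or its models; in practice the (word, tag) pairs would typically come from `nltk.pos_tag(nltk.word_tokenize(sentence))`. The choice of noun tags as "work area" candidates is an assumption for illustration, not the project's actual pattern.

```python
# Tagged sentence copied from the slide (word/TAG pairs separated by spaces).
TAGGED = (
    "I/PRP worked/VBD on/IN Ruby/NNP on/IN Rails/JJ application/NN "
    "creating/VBG matching/VBG algorithms/NNS and/CC UPC/NN database/NN ./."
)


def parse_tagged(text):
    """Split 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        word, _, tag = token.rpartition("/")  # split on the last slash
        pairs.append((word, tag))
    return pairs


def words_with_tags(pairs, prefixes):
    """Keep the words whose tag starts with one of the given prefixes."""
    return [word for word, tag in pairs if tag.startswith(tuple(prefixes))]


pairs = parse_tagged(TAGGED)
nouns = words_with_tags(pairs, ["NN"])   # NN, NNS, NNP: candidate work-area terms
verbs = words_with_tags(pairs, ["VB"])   # VBD, VBG: candidate action terms
```

Note that the tagger labelled "Rails" as an adjective (JJ), so a noun-only filter misses it; patterns over real resumes would likely need a skills vocabulary alongside the tags.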
HTML resume to CLIPS facts
HTML resume
• Experience
– Position → Leadership quality
– Experience description → Work area quality
– Duration → Loyalty quality
• Skills → Skill qualities
• Certifications → Certification qualities
• Education
– Degree → Degree quality
– School → School rank quality
– Major → Major quality
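As one example of deriving a quality input from a resume field, the duration of a position can be converted to a number of months before any loyalty rule is applied. This is a minimal sketch that assumes durations are written as "Month YYYY-Month YYYY", as in the sample resume; the function name and the month-difference convention are assumptions, not the project's code.

```python
from datetime import datetime


def months_between(duration, fmt="%B %Y"):
    """Return the month difference for a 'Month YYYY-Month YYYY' string."""
    start_text, end_text = (part.strip() for part in duration.split("-", 1))
    start = datetime.strptime(start_text, fmt)
    end = datetime.strptime(end_text, fmt)
    return (end.year - start.year) * 12 + (end.month - start.month)


short_stay = months_between("January 2010-April 2010")
long_stay = months_between("March 2008 - January 2010")
```

How many months count as "loyal" is a knowledge-base decision for the human expert, so no threshold is assumed here.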
DOM root
DOM objects
Text area and attributes of objects
HTML structure
……
<div id="company1" title="ClearNet Security">
  <div id="position11">Consultant</div>
  <div id="exp_time11">January 2010-April 2010</div>
  <div id="experience11">Worked on Ruby on Rails application creating matching algorithms and UPC database.</div>
</div>
……
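The walk from DOM objects to their text and attributes can be sketched as follows. The project uses BeautifulSoup for this step; the example below swaps in Python's standard-library `html.parser` so that it runs without extra installs, and the class name `ResumeDivParser` is hypothetical.

```python
from html.parser import HTMLParser

RESUME_HTML = """
<div id="company1" title="ClearNet Security">
  <div id="position11">Consultant</div>
  <div id="exp_time11">January 2010-April 2010</div>
  <div id="experience11">Worked on Ruby on Rails
application creating matching algorithms and UPC database.</div>
</div>
"""


class ResumeDivParser(HTMLParser):
    """Collect the text and title attribute of each <div>, keyed by its id."""

    def __init__(self):
        super().__init__()
        self.stack = []    # ids of the currently open <div> elements
        self.text = {}     # div id -> text found directly inside that div
        self.titles = {}   # div id -> title attribute, when present

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            attributes = dict(attrs)
            div_id = attributes.get("id")
            self.stack.append(div_id)
            if "title" in attributes:
                self.titles[div_id] = attributes["title"]

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        chunk = " ".join(data.split())  # collapse line breaks and extra spaces
        if chunk and self.stack and self.stack[-1]:
            key = self.stack[-1]
            self.text[key] = (self.text.get(key, "") + " " + chunk).strip()


parser = ResumeDivParser()
parser.feed(RESUME_HTML)
```

After parsing, `parser.titles` holds the company name from the `title` attribute and `parser.text` holds the position, duration, and experience description, which are the values the fact generator needs.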
Deftemplates for these facts are predefined.
Coding convention
• Each resume facts CLIPS file is given a unique name of the form ID_Name.clp.
• Each fact in a deffacts has an ID slot that uniquely identifies the candidate.
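The convention can be illustrated with a small generator that builds the file name and the deffacts text. This is a sketch under stated assumptions: the `experience` template, its `company` and `position` slots, the `candidate-<ID>` deffacts name, and the removal of spaces from the candidate name are all hypothetical; only the ID_Name.clp pattern and the ID slot come from the slides.

```python
def clips_string(value):
    """Quote a Python string as a CLIPS string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def facts_filename(candidate_id, name):
    """Build the ID_Name.clp file name (spaces in the name are dropped, by assumption)."""
    return f"{candidate_id}_{name.replace(' ', '')}.clp"


def build_deffacts(candidate_id, experiences):
    """Render one deffacts construct; every fact carries the candidate's ID slot."""
    lines = [f"(deffacts candidate-{candidate_id}"]
    for experience in experiences:
        slots = " ".join(
            f"({slot} {clips_string(value)})" for slot, value in experience.items()
        )
        lines.append(f"  (experience (ID {candidate_id}) {slots})")
    lines.append(")")
    return "\n".join(lines)


filename = facts_filename(42, "Jane Doe")
deffacts_text = build_deffacts(
    42, [{"company": "ClearNet Security", "position": "Consultant"}]
)
```

Writing `deffacts_text` to `filename` would give one file per candidate, and the matching deftemplate must already be loaded before CLIPS reads it.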