A machine learning based Resume Matcher that compares resumes with job descriptions and creates a score for how well a resume matches a particular job description. Documents are sorted based on their TF-IDF (Term Frequency-Inverse Document Frequency) scores.
Check the live version here. The instance might sleep if it has not been used in a long time; in that case, drop me a mail, or fork this repo and launch your own instance on Streamlit Cloud.
Matching algorithms used are listed below (a short scoring sketch follows the list) :-
- String Matching
  - Monge-Elkan
- Token Based
  - Jaccard
  - Cosine
  - Sorensen-Dice
  - Overlap Coefficient
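A minimal sketch of how these measures can be computed with the Textdistance package (listed under the packages below); the token lists here are illustrative placeholders, not taken from the actual app:

```python
import textdistance

# Illustrative token lists; in the real app these come from the parsed documents.
resume = "experienced python developer with machine learning background".split()
jd = "looking for a python machine learning engineer".split()

algorithms = {
    "Monge-Elkan": textdistance.monge_elkan,
    "Jaccard": textdistance.jaccard,
    "Cosine": textdistance.cosine,
    "Sorensen-Dice": textdistance.sorensen_dice,
    "Overlap Coefficient": textdistance.overlap,
}

for name, algorithm in algorithms.items():
    # normalized_similarity returns a score in [0, 1]; higher means more similar
    print(f"{name}: {algorithm.normalized_similarity(resume, jd):.3f}")
```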
Topic modelling of the resumes is done to provide additional information about the resumes and what clusters/topics they belong to. For this (a Gensim sketch follows the list) :-
- TF-IDF of the resumes is computed to improve the sentence similarities, as it helps reduce redundant terms and bring out the important ones.
- The id2word dictionary and doc2bow representations (from the Gensim library) are built for the documents.
- LDA (Latent Dirichlet Allocation) is run to extract the topics from the document set (in this case, the resumes).
- Additional plots are generated to gain more insight into the documents.
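A minimal sketch of the Gensim pipeline described above, assuming pre-tokenized resumes; the documents and `num_topics` here are placeholders:

```python
from gensim import corpora, models

# Placeholder pre-tokenized resumes; the real app builds these from the parsed documents.
docs = [
    ["python", "pandas", "machine", "learning", "model"],
    ["sales", "marketing", "client", "revenue", "growth"],
    ["python", "deep", "learning", "neural", "network"],
]

id2word = corpora.Dictionary(docs)               # maps each token to an integer id
corpus = [id2word.doc2bow(doc) for doc in docs]  # bag-of-words vector per document

lda = models.LdaModel(corpus=corpus, id2word=id2word,
                      num_topics=2, passes=10, random_state=42)

for topic_id, terms in lda.print_topics():
    print(topic_id, terms)
```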
- The inputs are resumes and job descriptions; the current code is capable of comparing resumes against multiple job descriptions.
- The job descriptions and resumes are parsed with the help of the Textract library in Python and then converted into two CSV files, namely `Resume_Data.csv` and `Job_Data.csv`.
- While reading the files, the Python script named fileReader.py also cleans the text and does the TF-IDF based filtering (a sketch of this step follows the list). This might take some time to process, so please be patient while executing the script.
- For any further comparisons, the prepared CSV files are used.
- app.py contains the code for running the Streamlit server and performing the tasks. Use `streamlit run app.py` to execute it.
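The internals of fileReader.py are not reproduced here; the sketch below only illustrates the kind of TF-IDF based scoring the steps above describe, using Scikit-Learn and the two generated CSVs. The column name `Context` is an assumption, so check the actual files:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The column name "Context" is an assumption; inspect the CSVs from fileReader.py.
resumes = pd.read_csv("Resume_Data.csv")
jobs = pd.read_csv("Job_Data.csv")

vectorizer = TfidfVectorizer(stop_words="english")
resume_vectors = vectorizer.fit_transform(resumes["Context"])
jd_vector = vectorizer.transform(jobs["Context"].iloc[[0]])

# One cosine similarity score per resume against the first job description
scores = cosine_similarity(resume_vectors, jd_vector).ravel()
print(scores)
```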
The Data folder contains the two folders that the app reads data from. When adding your own documents, `Data\Resumes` and `Data\JobDesc` are the targets for the resumes and the job descriptions respectively.
Due to the flexibility of Textract, we do not need to specify the type of document to scan; it detects the type automatically. The job descriptions, however, need to be in DOCX format (this can be changed as well).
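A minimal sketch of reading documents with Textract; the file names are hypothetical, and `textract.process` returns bytes, hence the decode:

```python
import textract

# Hypothetical file names; Textract infers the document type automatically.
resume_text = textract.process("Data/Resumes/sample_resume.pdf").decode("utf-8")
jd_text = textract.process("Data/JobDesc/sample_jd.docx").decode("utf-8")

print(resume_text[:200])
```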
A Python virtual environment is required for this. Please read this page for more information.
A pip requirements.txt file is provided, but since it may contain some unnecessary additional dependencies, it is advised to install the packages listed below manually with `pip install <package_name>` (a combined command follows the list).
Popular packages used are :-
- Spacy
- Plotly
- Streamlit
- Gensim
- Scikit Learn
- Pandas
- Wordcloud
- Matplotlib
- Numpy
- Scipy
- NLTK
- Textract
- Textdistance
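For reference, the list above can be installed in one command (package names as published on PyPI):

```bash
pip install spacy plotly streamlit gensim scikit-learn pandas wordcloud matplotlib numpy scipy nltk textract textdistance
```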
Furthermore, packages like NLTK and Spacy require additional data to be downloaded. After installing them, please perform :-
```bash
# For Spacy's English package
python -m spacy download en_core_web_sm
```

```python
# For NLTK data (run in a Python shell)
import nltk
nltk.download('popular')  # downloads the popular packages from NLTK_DATA
```
Please check the How To file for execution instructions.