The Resume Matcher takes your resume and job descriptions as input, parses them using Python, and mimics the functionalities of an ATS, providing you with insights and suggestions to make your resume ATS-friendly.
The process is as follows:
- Parsing: The system uses Python to parse both your resume and the provided job description, just like an ATS would. Parsing is critical because it transforms your documents into a format the system can readily analyze.
- Keyword Extraction: The tool uses machine learning algorithms to extract the most relevant keywords from the job description. These keywords represent the skills, qualifications, and experience the employer seeks.
- Key Terms Extraction: Beyond keyword extraction, the tool uses textacy to identify the main key terms or themes in the job description. This step helps in understanding the broader context of what the job description is about.
- Vector Similarity Using Qdrant: The tool uses Qdrant, a highly efficient vector similarity search tool, to measure how closely your resume matches the job description. Both documents are represented as vectors in a high-dimensional space, and their cosine similarity is calculated. The more similar they are, the higher the likelihood that your resume will pass the ATS screening.
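The keyword-extraction step above can be sketched in miniature. The real tool relies on machine learning algorithms and textacy; the simplified stand-in below just ranks terms by frequency after dropping stop words. The stop-word list and the sample job description are purely illustrative:

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; the real tool uses ML/textacy instead.
STOP_WORDS = {"a", "an", "the", "and", "or", "to", "of", "in",
              "with", "for", "we", "you", "are", "is", "looking"}

def extract_keywords(text, top_n=5):
    """Rank the most frequent non-stop-word terms in a job description."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]

job_description = (
    "We are looking for a Python developer with experience in Python, "
    "machine learning, and data pipelines. Experience with SQL is a plus."
)
print(extract_keywords(job_description))
# → ['python', 'experience', 'developer', 'machine', 'learning']
```

Frequency ranking is deliberately naive; it only serves to show where the extracted keywords fit into the matching pipeline.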
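The similarity step can likewise be sketched without Qdrant or a GPU-backed sentence encoder: represent each document as a vector and compute the cosine of the angle between them. This minimal stand-in uses sparse term-frequency vectors instead of dense embeddings; the actual tool computes the same cosine measure over encoder embeddings stored in Qdrant:

```python
import math
import re
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector (a stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

resume = "Python developer skilled in machine learning and SQL"
job = "Looking for a Python developer with machine learning experience"
score = cosine_similarity(tf_vector(resume), tf_vector(job))
print(f"match score: {score:.2f}")  # closer to 1.0 means a closer match
```

A score near 1.0 means the two documents point in nearly the same direction in vector space, which is exactly the signal the ATS-style screening uses.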
On top of that, there are various data visualizations that I've added to help you get started.
🧪 Please check the Landing Page. PRs are welcome there, too.
- Clone the project.
- Create a Python virtual environment.
- Activate the virtual environment.
- Run `pip install -r requirements.txt` to install all dependencies.
- Put your resumes in PDF format in the `Data/Resumes` folder (delete the existing contents).
- Put your job descriptions in PDF format in the `Data/JobDescription` folder (delete the existing contents).
- Run `python run_first.py` to parse all the resumes to JSON.
- Run `streamlit run streamlit_app.py`.
Note: For local use, don't run `streamlit_second.app`; it's for deploying to Streamlit.
Note: The vector similarity part is precomputed here, since sentence encoders require significant GPU and memory (RAM). I am working on a blog post that will show how you can leverage that in a Google Colab environment for free.
Thanks for the support 💙. This is an ongoing project that I want to build with the open-source community. There are many ways in which this tool can be upgraded, including (but not limited to):
- Create a better dashboard instead of Streamlit.
- Add more features, like uploading and parsing of resumes.
- Add a Docker image for easy usage.
- Contribute a better parsing algorithm.
- Contribute to a blog post on how to make this work.
- Contribute to the landing page, maybe re-creating it in React/Vue/etc.