sani-1023/CV_analysis_ML

CV_analysis_ML

Introduction

We built a CV analysis system that helps an organization streamline its recruitment.

  • The organization provides a dataset of requirement entities on which our machine learning model is trained.
  • A user interface accepts a CV for evaluation.
  • The interface takes a CV in PDF format and predicts which position the candidate is suitable for.
  • Overall, the system is built from a recruiter’s perspective.

Project Description

Our project has two parts:

  • The Python implementation of the system, written in Google Colab.
  • The user interface, developed with Streamlit.

Dataset

We used “UpdatedResumeDataSet.csv” from Kaggle. It has two features: Category (job position) and Resume.

Data Cleaning, Label Encoding & Vectorization

  • The dataset was cleaned using Python's regular-expression (regex) module. The cleaned data was stored in a new 'clean text' feature.
  • The job position categories were encoded as numeric labels with LabelEncoder.
  • The 'clean text' feature was vectorized using TfidfVectorizer with a maximum of 2000 features.
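
The three preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the specific regex rules in `clean_resume` and the helper name `encode_and_vectorize` are assumptions made for the example. Only the use of regex cleaning, a label encoder, and TfidfVectorizer with at most 2000 features comes from the description above.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder


def clean_resume(text: str) -> str:
    """Clean raw resume text. The exact rules here are assumed for illustration."""
    text = re.sub(r"http\S+", " ", text)        # drop URLs
    text = re.sub(r"[@#]\S+", " ", text)        # drop mentions and hashtags
    text = re.sub(r"[^\x00-\x7f]", " ", text)   # drop non-ASCII characters
    text = re.sub(r"[^\w\s]", " ", text)        # drop punctuation
    return re.sub(r"\s+", " ", text).strip().lower()  # collapse whitespace


def encode_and_vectorize(categories, clean_texts, max_features=2000):
    """Hypothetical helper: encode categories as integers and vectorize the text."""
    encoder = LabelEncoder()
    y = encoder.fit_transform(categories)       # e.g. "HR" -> integer label
    vectorizer = TfidfVectorizer(max_features=max_features)
    X = vectorizer.fit_transform(clean_texts)   # sparse TF-IDF matrix
    return X, y, encoder, vectorizer
```

Keeping the fitted `encoder` and `vectorizer` around matters later: a single uploaded CV has to go through the same vocabulary, and the predicted integer has to be mapped back to a category name.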

Model Selection, Training & Testing

  • We wrapped KNeighborsClassifier in OneVsRestClassifier, trained it on 769 samples, and tested it on 193 samples.
  • This approach reached an accuracy of 98%, so we use this model to take a single CV as input and predict which job position the candidate is suitable for.
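
A minimal sketch of this training and prediction flow is shown below. It runs on a tiny toy corpus, not the Kaggle dataset. The helper names `train_knn_ovr` and `predict_position`, the `test_size=0.2` split, the `random_state`, and the default KNN hyperparameters are all assumptions for illustration. The 769/193 split reported above would be consistent with an 80/20 split, but the source does not state the split parameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier


def train_knn_ovr(X, y, test_size=0.2, random_state=42):
    """Fit a one-vs-rest KNN classifier and return it with its held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = OneVsRestClassifier(KNeighborsClassifier())
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


def predict_position(cv_text, vectorizer, model, label_names):
    """Map the text of one CV to a job position name."""
    features = vectorizer.transform([cv_text])   # reuse the fitted vocabulary
    return label_names[model.predict(features)[0]]


# Toy data (made up for this example): three labels, ten copies of each sample.
label_names = ["Data Science", "HR", "Web Designing"]
samples = [
    "python machine learning data analysis statistics",
    "recruitment payroll employee onboarding policy",
    "html css javascript responsive layout design",
]
texts = [samples[i % 3] for i in range(30)]
y = [i % 3 for i in range(30)]

vectorizer = TfidfVectorizer(max_features=2000)
X = vectorizer.fit_transform(texts)
model, accuracy = train_knn_ovr(X, y)
print(f"Toy accuracy: {accuracy:.2f}")
print(predict_position("python statistics", vectorizer, model, label_names))
```

On real data, the CV text should pass through the same cleaning step as the training resumes before it reaches `predict_position`.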

Integration with Frontend

  • We served the Streamlit app from Colab using pyngrok.
  • The frontend has an upload field that accepts a CV as a PDF file and passes it through the system to generate the result.
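
One way to wire these pieces together is sketched below. It makes several assumptions the source does not confirm: the PDF text is read with `pypdf`, the app file is called `app.py`, the port is 8501, and the function names are hypothetical. Third-party imports are done inside the functions so that each piece can be used on its own.

```python
import subprocess


def extract_pdf_text(uploaded_file):
    """Read all page text from an uploaded PDF (pypdf is an assumed choice)."""
    from pypdf import PdfReader

    reader = PdfReader(uploaded_file)
    return " ".join(page.extract_text() or "" for page in reader.pages)


def run_app(predict):
    """Streamlit page body; `predict` maps CV text to a position name."""
    import streamlit as st

    st.title("CV Analysis")
    uploaded = st.file_uploader("Upload a CV (PDF)", type=["pdf"])
    if uploaded is not None:
        st.write("Predicted position:", predict(extract_pdf_text(uploaded)))


def streamlit_command(app_path="app.py", port=8501):
    """Build the command line that starts the Streamlit server."""
    return ["streamlit", "run", app_path, "--server.port", str(port)]


def launch_with_tunnel(app_path="app.py", port=8501, auth_token=None):
    """Start Streamlit in the background and open an ngrok tunnel to it."""
    from pyngrok import ngrok

    if auth_token:
        ngrok.set_auth_token(auth_token)
    process = subprocess.Popen(streamlit_command(app_path, port))
    tunnel = ngrok.connect(str(port))
    return process, tunnel.public_url
```

In a Colab cell, `launch_with_tunnel()` would return the public URL to open in a browser, while `app.py` itself would call `run_app` with a prediction function built from the trained model.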
