ShubhamMore4 / Project-Resume_Classification

Resume classification is the task of automatically categorizing resumes or CVs into predefined domain categories or classes based on their content. This task is essential for the job recruitment process, particularly when organizations receive a large number of applications for various positions.

Project-Resume_Classification

Business Objective:

• The document classification solution should significantly reduce manual effort in human resource management (HRM). It should achieve higher accuracy and a higher degree of automation with minimal human intervention.

• Resume classification is the task of automatically categorizing resumes or CVs into predefined domain categories or classes based on their content. This task is essential for the job recruitment process, particularly when organizations receive a large number of applications for various positions.

• Resumes are an ideal example of unstructured data. Since there is no widely accepted resume layout, each resume may have its own formatting style, different text blocks, and different section titles. Building a resume classifier and extracting text from resumes is not easy, because the number of possible layouts is practically unlimited. Some example layouts are given below for better understanding.

Text Cleaning:

• Text cleaning, also known as text preprocessing, transforms raw text data into a clean and consistent format for further analysis. It typically includes tasks such as lowercasing, tokenization, stop word removal, punctuation removal, spell correction, lemmatization or stemming, and removal of special characters or numerical values. The goal is to standardize the text and remove elements that do not contribute to the analysis.
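A minimal sketch of some of the cleaning steps listed above, using only the Python standard library. The function name `clean_text` and the tiny `STOP_WORDS` set are assumptions made for illustration and are not taken from the project's notebook; spell correction and lemmatization or stemming are left out because they would normally rely on an NLP library.

```python
import re
import string

# Tiny illustrative stop word list (an assumption); a real project would
# more likely use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "for", "in",
              "is", "of", "on", "the", "to", "with"}

# Translation table that maps every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(raw: str) -> str:
    """Lowercase, strip URLs, punctuation and digits, then drop stop words."""
    text = raw.lower()                                   # lowercasing
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove URLs
    text = text.translate(PUNCT_TO_SPACE)                # punctuation removal
    text = re.sub(r"\d+", " ", text)                     # numerical values
    text = re.sub(r"[^a-z\s]", " ", text)                # other special characters
    tokens = text.split()                                # whitespace tokenization
    # Drop stop words and single-letter leftovers (e.g. the "s" from "bachelor's").
    tokens = [t for t in tokens if t not in STOP_WORDS and len(t) > 1]
    return " ".join(tokens)


sample = "Experienced Python Developer with 5+ years in ML & NLP! Visit https://example.com"
cleaned = clean_text(sample)
print(cleaned)
```

Note that stripping all punctuation also removes it from skill names such as "C++" or "C#", so a real pipeline may want to protect such terms before cleaning.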

NLP Pipeline:

• The NLP resume classification pipeline involves steps such as data acquisition, data preprocessing, text tokenization, stop word removal, text normalization, feature extraction, model training, model evaluation, model deployment, and iterative improvement. This pipeline leverages NLP techniques to process and classify resumes based on their textual content, enabling automated and efficient resume screening and candidate selection.
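The feature extraction, model training, and model evaluation steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's actual notebook code: the toy resumes, the two category labels, the TF-IDF features, and all hyperparameters are invented for the example. Only the choice of a Random Forest classifier comes from this README.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Toy data invented for illustration; real input would be text extracted
# from resume files and cleaned beforehand.
resumes = [
    "python machine learning pandas statistics regression",
    "deep learning tensorflow data analysis python",
    "data visualization sql statistics modelling",
    "nlp text mining python scikit learn",
    "html css javascript react frontend",
    "node express javascript rest api backend",
    "angular typescript html responsive design",
    "php laravel mysql web application",
]
labels = ["Data Science"] * 4 + ["Web Development"] * 4  # hypothetical categories

# Hold out a stratified test split for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    resumes, labels, test_size=0.25, stratify=labels, random_state=42
)

# Feature extraction (TF-IDF) and the classifier chained in one pipeline,
# so the same transformation is applied at training and prediction time.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])
model.fit(X_train, y_train)

# Model evaluation on the held-out resumes.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")

# Classifying a new, unseen resume.
prediction = model.predict(["javascript react frontend developer"])[0]
print(prediction)
```

Wrapping the vectorizer and classifier in a single `Pipeline` also means one object can be saved and loaded for deployment. With only eight toy samples, the printed accuracy is not meaningful.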

Model Accuracy:

• A Random Forest classifier was selected for deployment.

Deployment:

• Deployment using Streamlit.


Languages

• Jupyter Notebook: 99.8%

• Python: 0.2%