corpus count-vectorizer label-encoding lemmitization machine-learning nltk part-of-speech-tagging resume-classification spacy stemming text-mining text-preprocessing textract tfidf-vectorizer tokenization wordcloud

Project-Resume-Classification

PPT Presentation:

View presentation

Business Problem:

The document classification solution should significantly reduce the manual human effort in the HRM. It should achieve a higher level of accuracy and automation with minimal human.

Objective:

The objective of a resume classification project typically involves automating the process of categorizing resumes into different job roles, industries, or skill sets. This can streamline the hiring process for recruiters and HR departments by enabling them to quickly identify the most relevant resumes from a large pool of applicants.

Abstract:

A resume is a brief summary of your skills and experience. Companies recruiters and HR teams have a tough time scanning thousands of qualified resumes. Spending too many labor hours segregating candidates resume's manually is a waste of a company's time, money, and productivity. Recruiters, therefore, use resume classification in order to streamline the resume and applicant screening process. NLP technology allows recruiters to electronically gather, store, and organize large quantities of resumes. Once acquired, the resume data can be easily searched through and analyzed.

Resumes are an ideal example of unstructured data. Since there is no widely accepted resume layout, each resume may have its own style of formatting, different text blocks and different category titles. Building a resume classification and gathering text from it is no easy task as there are so many kinds of layouts of resumes that you could imagine.

Architecture

🔹The basic data analysis process performed such as data collection, text mining, data cleaning, exploratory data analysis, data visualization.

🔹Building a Machine learning model for Resume Classification using Python and basic Natural language processing techniques.

🔹Used Python's libraries to implement various NLP techniques like tokenization, lemmatization, parts of speech tagging, etc.

🔹A resume classification analyzes resume data and extracts the information into the machine-readable output. It helps automatically store, organize, and analyze the resume data to find out the candidate for the particular job position and requirements.

🔹The aim of this project is achieved by performing the various data analysis methods and using the Machine Learning models and Natural Language Processing which will help in classifying the categories of the resume and building the Resume Classification Model.

Different ML Algorithm used:

Logistic Regression
DecisionTree Classifier
KNN Classifier
SVM Classifier
NaiveBayes Classifier
RandomForest Classifier
Bagging Classifier
AdaBoost Classifier
Gradient Boosting Classifier
Voting Classifier

Final Model:

Model Selected: Random Forest Classifier

Random Forest Classifier : RFC works by creating multiple decision trees during the training phase and outputting the mode of the classes (classification) or mean prediction (regression) of the individual trees.

Deployment :

For Deployment Streamlit App is Used

Thank You For Visiting.....

About

The document classification solution should significantly reduce the manual human effort in the HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.

corpus count-vectorizer label-encoding lemmitization machine-learning nltk part-of-speech-tagging resume-classification spacy stemming text-mining text-preprocessing textract tfidf-vectorizer tokenization wordcloud

Languages

Language:Jupyter Notebook 99.9%Language:Python 0.1%