JahanAjani / wns-analytics-hackathon-2018

My solution for predicting employee promotion, which ranked 333 on the private leaderboard in the final round.


Employee_promotion_Prediction_using_CatBoost


1. Business Problem

1.1. Description

Your client is a large MNC with 9 broad verticals across the organisation. One of the problems your client faces is identifying the right people for promotion (only for manager position and below) and preparing them in time.

The process they currently follow is:

- They first identify a set of employees based on recommendations and past performance.
- Selected employees go through a separate training and evaluation program for each vertical. These programs are based on the skills required in each vertical.
- At the end of the program, an employee is promoted based on various factors such as training performance and KPI completion (only employees with KPI completion above 60% are considered).

In this process, the final promotions are announced only after the evaluation, which delays the transition to new roles. Hence, the company needs your help in identifying the eligible candidates at a particular checkpoint so that it can expedite the entire promotion cycle.

They have provided multiple attributes covering each employee's past and current performance along with demographics. The task is to predict whether a potential promotee at the checkpoint in the test set will be promoted after the evaluation process.

1.2. Source/Useful Links

https://datahack.analyticsvidhya.com/contest/wns-analytics-hackathon-2018/

2. Machine Learning Problem Formulation

2.1. Data

2.1.1. Data Overview

This project uses the dataset from the WNS Analytics Wizard 2018 DataHack competition on analyticsvidhya.com.

Training Data: 54808 records and 14 columns

Test Data: 23490 records and 13 columns

2.2. Mapping the real-world problem to an ML problem

2.2.1. Type of Machine Learning Problem

Binary Classification:

- Based on an employee's past and current performance along with demographics, predict whether a potential promotee at the checkpoint in the test set will be promoted after the evaluation process.

2.2.2. Performance Metric

Metric(s): F1 score

The dataset is imbalanced (roughly 9:1), so F1 score is a better choice than accuracy.
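To illustrate why accuracy can mislead at a 9:1 class ratio, here is a minimal sketch on toy labels. The labels, the two example predictors, and the helper names `f1_score_binary` and `accuracy` are all made up for illustration and are not taken from the notebook; the sketch also assumes promoted employees (label 1) are the minority class.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels with a 9:1 ratio (assumption): 90 not promoted (0), 10 promoted (1).
y_true = [0] * 90 + [1] * 10

# A trivial predictor that never predicts a promotion.
always_negative = [0] * 100

# A predictor that finds 6 of the 10 promotions and raises 4 false alarms.
useful = [0] * 86 + [1] * 4 + [1] * 6 + [0] * 4

print("always negative:", accuracy(y_true, always_negative), f1_score_binary(y_true, always_negative))
print("useful:         ", accuracy(y_true, useful), f1_score_binary(y_true, useful))
```

The trivial predictor reaches 90% accuracy while its F1 score is 0, whereas the second predictor is only slightly more accurate but has a clearly higher F1. That gap is the reason F1 is the more informative metric here.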

2.3. Train, CV and Test Datasets

Split the training dataset into two parts, train and cross-validation, with 70% and 30% of the data respectively.
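A 70/30 split like the one described above might be sketched as follows. This uses only the standard library on toy labels; the helper name `stratified_split`, the seed, and the choice to stratify by class (so both parts keep the same class ratio) are assumptions for illustration, not something the notebook is confirmed to do.

```python
import random


def stratified_split(labels, cv_fraction=0.3, seed=42):
    """Return (train_indices, cv_indices), sampling cv_fraction from each class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, cv_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_cv = round(len(indices) * cv_fraction)
        cv_idx.extend(indices[:n_cv])
        train_idx.extend(indices[n_cv:])
    return sorted(train_idx), sorted(cv_idx)


# Toy labels with a 9:1 class ratio (assumption): 900 zeros, 100 ones.
labels = [0] * 900 + [1] * 100
train_idx, cv_idx = stratified_split(labels)

print(len(train_idx), len(cv_idx))          # sizes of the two parts
print(sum(labels[i] for i in cv_idx))       # positives in the cross-validation part
```

With pandas and scikit-learn available, `sklearn.model_selection.train_test_split` with `test_size=0.3` and `stratify=y` achieves the same kind of split in one call.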

3. Code

Main code file is

wns_predicitng_potential_employee.ipynb

4. Final submission file

The final submission file is generated by wns_predicitng_potential_employee.ipynb, with a final score of 0.5066991474.


License: MIT License

Languages

Jupyter Notebook 71.6%, Python 28.4%