Aadya178 / UNICEF-Global-School-Dropouts

Determining the total dropout rate of pupils at different levels of education based on various factors globally.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

UNICEF Global School Dropouts

forthebadge

UNICEF says, on any given school day, over 1 billion children around the world head to class. More children and adolescents today are enrolled in pre-primary, primary and secondary education than ever before.

Yet an estimated 617 million children and adolescents around the world are unable to reach minimum proficiency levels in reading and mathematics, even though two thirds of them are in school. 11 per cent of primary-school-aged children and 20 per cent of lower-secondary-aged children are not in school at all!

Overview

Contents

Problem Statement

πŸ€”

UNICEF has collected the pupil dropout rate at different educational levels (primary, lower secondary and upper secondary) through various sources over the course of 10 years. This dataset contains dropout rates at different educational levels based on several parameters (gender, living conditions and financial conditions) in different regions around the globe. It finally mentions the total pupil dropout rate in these regions. The task is to analyze the dataset attributes and thus understand the dropout trends globally.

Objective

🎯

The aim is to build a model that shows the significance of each factor in predicting the total dropout rate. This can be further used to identify the key areas to work on to bring down the Total Dropout rate in different regions.

Dataset

πŸ“Š

The dataset contains the following attributes:

  • Region
  • UNICEF Sub-region 1
  • Development Regions
  • Female Dropout Rate
  • Male Dropout Rate
  • Rural Dropout Rate
  • Urban Dropout Rate
  • Poorest Dropout Rate
  • Poor Dropout Rate
  • Middle Segment Dropout Rate
  • Rich Dropout Rate
  • Richest Dropout Rate
  • Data Source
  • Time period
  • Educational level
  • Total Dropout Rate

Process Flow

🎞️

The dataset has been analyzed in the following manner.

Libraries Used

πŸ“š

numpy adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits.

scipy is used for scientific computing and technical computing. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks.

pandas is for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

sklearn or scikit-learn features various classification, regression and clustering algorithms and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Data Analysis and Visualization

πŸ‘€

The data has been analyzed using Jupyter Notebook.

Model Used

πŸ€–

The target variable is Total Dropout Rate.

Since the target variable's range can be any value from 0-100% hence it is a Regression problem.

Thus, the model has been trained using Multiple Linear Regression algorithm.

Model Evaluation

πŸ’―

The accuracy of the output prediction is greater than 99%.

The model's accuracy is being evaluated using K-fold cross-validation.

The evaluation metrics include RMSE(Root Mean Square Error) and R2 score.

Result

🏁

The most important predictors for the Total Dropout Rate of pupils in different regions globally are:

  • Female dropout rate

  • Male dropout rate

  • Rural dropout rate

  • Urban dropout rate

  • Poorest dropout rate

  • Poor dropout rate

  • Middle Segment dropout rate

  • Rich dropout rate

  • Richest dropout rate

While the other factors:

  • Region

  • UNICEF Sub-region 1

  • Development Regions

  • Data Source

  • Time Period

  • Educational Level don't have a significant impact on the Total dropout rate.

The analysis shows that the students' gender, living conditions and financial wellness are crucial to determine the total dropout rates and are hence the key areas for development. While the global regions and the educational level along with other factors are not very important.

Thus, the government authorities and related organizations should strive to educate children from different backgrounds alike.

Since the model has an extremely high accuracy rate of more than 99%, it can be used to predict the total dropout rates in the future too and thus assist in bringing them down.

πŸ‘‹πŸ» Hi! Thanks for stopping by. Give it a ⭐ if you like it!

About

Determining the total dropout rate of pupils at different levels of education based on various factors globally.

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:Jupyter Notebook 100.0%