ShaddAhmed14 / PhD-students-Data-Analysis

Analysis of the Number of PhD Students in Germany Using Python

RSE Summer 2024 Self-Project #1.

This project tackles the following questions about the number of PhD students in Germany.

  • How has the number of students changed over the last 4 years?
  • How does this change vary by nationality, gender and type of course?

Problem Description

As a Master's student in Data Science myself, I was interested in statistics on students in higher education. Analysing this data provides several insights that can be used to dig deeper into research questions such as:

  • Which courses do foreign students usually opt for, and how can these courses be marketed better?
  • Which courses are male- or female-dominated?
  • Why are some courses more popular than others? Do they need better marketing, or is it a language-barrier issue?

How to get started and Requirements

Download the repository and make sure IPython, matplotlib and pandas are installed. The project was tested with Python 3.11.3; I do not guarantee that it works with older versions.
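A quick way to check the requirements before opening the notebook is sketched below. It is not part of the repository: the helper names `missing_packages` and `version_ok` are hypothetical, and the minimum version of 3.11 is an assumption taken from the tested version above.

```python
import importlib.util
import sys

# Import names of the packages listed in the requirements.
REQUIRED = ("IPython", "matplotlib", "pandas")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def version_ok(minimum=(3, 11)):
    """Check the running interpreter against an assumed minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    if not version_ok():
        print("Warning: only Python 3.11.3 was tested; found", sys.version.split()[0])
    for name in missing_packages():
        print("Missing package:", name)
```

If nothing is printed, the interpreter is at least 3.11 and all three packages can be imported.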

Running the Program

Copy the IPython/Jupyter notebook and open it in your preferred editor. Run the notebook to view the results.

Program files are stored in the src folder. Further documentation can be found in description.txt.

Dataset

The dataset is from GENESIS, a statistical data service provided by the German government. The table used is GENESIS-Tabelle 21352-0003, Statistics of doctoral students. I created two CSV files from it: the first contains course groups only, while the second contains all courses.

  • Data is recorded for 2019 to 2022.
  • The main header has the columns Year, Course Group/Course, German, Foreigner and Total, where German, Foreigner and Total refer to the number of students.
  • The second header splits each of the German, Foreigner and Total columns into Male and Female columns.
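A two-row header like this can be read with pandas by passing `header=[0, 1]`. The sketch below is not the notebook's actual code. It assumes the top header repeats the group name above each Male/Female column, and it uses made-up placeholder numbers, not figures from the GENESIS table. The function name `load_phd_table` and the flattened column names such as `Total_Male` are hypothetical.

```python
import io

import pandas as pd

# Placeholder rows in the assumed layout. The numbers are invented for
# illustration and are NOT taken from the GENESIS table.
SAMPLE_CSV = """\
Year,Course Group,German,German,Foreigner,Foreigner,Total,Total
,,Male,Female,Male,Female,Male,Female
2019,Group A,10,20,3,4,13,24
2020,Group A,12,21,5,6,17,27
"""


def load_phd_table(source):
    """Read a CSV with two header rows and flatten the column names."""
    df = pd.read_csv(source, header=[0, 1])
    # Columns with an empty second header cell (Year, Course Group) get an
    # auto-generated "Unnamed: ..." label; keep only the top label for those.
    df.columns = [
        top if str(sub).startswith("Unnamed") else f"{top}_{sub}"
        for top, sub in df.columns
    ]
    return df


df = load_phd_table(io.StringIO(SAMPLE_CSV))

# Total students per year, then the change from one year to the next.
yearly_total = (df["Total_Male"] + df["Total_Female"]).groupby(df["Year"]).sum()
print(yearly_total.diff())
```

To use it on one of the two CSV files, pass the file path instead of the `io.StringIO` object. If the real files leave the top header cell blank above the Female columns, the group names would need to be filled forward before flattening.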

About

License: MIT License