PranayMalhotra / Colleges-for-Maria

Data Cleaning - A project which takes all colleges in the US, and narrows down the suitable colleges by slicing, dicing and concatenating startup activity data and crime statistics.

FINDING COLLEGES FOR MARIA

Introduction

This is a data wrangling and data cleaning project that combines three disparate datasets (colleges, startup activity, and crime statistics) to narrow the full list of US colleges down to those that meet a set of conditions.

Problem Statement

Maria, a 25-year-old US Army veteran, has just returned to the civilian workforce and is looking for suitable colleges to pursue a degree in Management Systems and Information Technology.

The task is to create a ranked dataset of colleges that satisfy the following criteria. Each college must:

  • be in an urban/metropolitan area.
  • be in a city that ranks at or above the 75th percentile in Kauffman's start-up rankings.
  • be below the 50th percentile in overall crime.
  • offer a 2-year or 4-year degree in Information Technology/Science.
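
The four criteria above can be sketched as a merge followed by a boolean filter in pandas. This is a minimal illustration on toy data, not the notebook's actual code: every column name (`is_urban`, `offers_it_degree`, `startup_index`, `crime_rate`), every value, and the helper `shortlist` are hypothetical, and the real datasets will need their own cleaning before a join like this works. It also assumes that crime is measured per city and that percentiles are computed with `rank(pct=True)`, which is only one of several reasonable definitions.

```python
import pandas as pd

# Toy stand-ins for the three datasets. All names and values are hypothetical.
colleges = pd.DataFrame({
    "college": ["Alpha College", "Beta University", "Gamma Institute", "Delta State"],
    "city": ["Austin", "Miami", "Denver", "Tulsa"],
    "is_urban": [True, True, False, True],
    "offers_it_degree": [True, True, True, False],
})
startup = pd.DataFrame({
    "city": ["Austin", "Miami", "Denver", "Tulsa"],
    "startup_index": [9.0, 8.5, 4.0, 2.0],
})
crime = pd.DataFrame({
    "city": ["Austin", "Miami", "Denver", "Tulsa"],
    "crime_rate": [300.0, 700.0, 350.0, 500.0],
})


def shortlist(colleges, startup, crime):
    """Return colleges meeting all four criteria, best startup percentile first."""
    # Percentile rank (0 to 1) of each city within its own dataset.
    startup = startup.assign(startup_pct=startup["startup_index"].rank(pct=True))
    crime = crime.assign(crime_pct=crime["crime_rate"].rank(pct=True))

    # Inner joins keep only colleges whose city appears in all three datasets.
    merged = (
        colleges
        .merge(startup, on="city", how="inner")
        .merge(crime, on="city", how="inner")
    )

    # One boolean condition per criterion in the problem statement.
    mask = (
        merged["is_urban"]
        & merged["offers_it_degree"]
        & (merged["startup_pct"] >= 0.75)
        & (merged["crime_pct"] < 0.50)
    )
    result = merged[mask].sort_values("startup_pct", ascending=False)
    return result.reset_index(drop=True)


print(shortlist(colleges, startup, crime)[["college", "city"]])
```

On this toy data only one college survives all four filters. In practice most of the effort goes into making the city keys comparable across datasets before the merge, since inner joins silently drop rows whose keys do not match.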

Data and other Files

  • Kauffman's start-up rankings are provided in a CSV file named Startup Activity.csv.
  • All other data can be found here.
  • The files LocalCrimeTrends.pdf and FullDataDocumentation.pdf explain the data in the crime and college datasets.

Requirements

The project was developed in a Jupyter Notebook using Python 3.
