hechmik / foundations_of_cs

Final project for the "Foundations of Computer Science" course (M.Sc. Data Science)

Final Project - Foundations of Computer Science

Introduction

The goal of this project is to answer a set of questions about the Kiva dataset. In particular, I addressed the following tasks:

  1. Normalize the loan_lenders table. In the normalized table, each row must have one loan_id and one lender.
  2. For each loan, add a column duration corresponding to the number of days between the disburse time and the planned expiration time. If either of the two dates is missing, the duration must be missing as well.
  3. Find the lenders who have funded at least twice.
  4. For each country, compute how many loans involved that country as the borrower.
  5. For each country, compute the overall amount of money borrowed.
  6. Like the previous point, but expressed as a percentage of the overall amount lent.
  7. Like the three previous points, but split for each year (with respect to disburse time).
  8. For each lender, compute the overall amount of money lent.
  9. For each country, compute the difference between the overall amount of money lent and the overall amount of money borrowed.
  10. Which country has the highest ratio between the difference computed at the previous point and the population?
  11. Which country has the highest ratio between the difference computed at point 9 and the population that is not below the poverty line?
  12. For each year, compute the total amount of loans: each loan that has planned expiration time and disburse time in different years must have its amount distributed proportionally to the number of days in each year.
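
Points 2 and 12 both come down to date arithmetic, so a small illustration may help. The following is only a minimal sketch in plain Python, not the code used in the notebook: the function names `duration_days` and `split_amount_by_year` are hypothetical, and it assumes each loan covers the half-open interval from the disburse date (included) to the planned expiration date (excluded).

```python
from datetime import date
from typing import Dict, Optional


def duration_days(disbursed: Optional[date], planned_end: Optional[date]) -> Optional[int]:
    """Point 2: days between the two dates, or None if either date is missing."""
    if disbursed is None or planned_end is None:
        return None
    return (planned_end - disbursed).days


def split_amount_by_year(amount: float, disbursed: date, planned_end: date) -> Dict[int, float]:
    """Point 12: share `amount` across calendar years in proportion to days.

    Assumption: the loan covers the half-open interval [disbursed, planned_end).
    """
    total_days = (planned_end - disbursed).days
    if total_days <= 0:
        # Degenerate or inverted interval: attribute everything to the disburse year.
        return {disbursed.year: amount}

    shares: Dict[int, float] = {}
    for year in range(disbursed.year, planned_end.year + 1):
        # Clip the loan interval to the current calendar year.
        start = max(disbursed, date(year, 1, 1))
        end = min(planned_end, date(year + 1, 1, 1))
        days = (end - start).days
        if days > 0:
            shares[year] = amount * days / total_days
    return shares


if __name__ == "__main__":
    # Hypothetical loan: 365 units lent from 2016-12-01 to 2017-12-01.
    print(duration_days(date(2016, 12, 1), date(2017, 12, 1)))
    print(split_amount_by_year(365.0, date(2016, 12, 1), date(2017, 12, 1)))
```

Summing the per-year shares of all loans then gives the yearly totals asked for in point 12. A loan whose two dates fall in the same year is assigned entirely to that year, so it needs no special handling.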

Languages

Jupyter Notebook: 100.0%