junzhin / Mast30034_2021_s2_project_1-junzhin

First Project on ADS at UOM

MAST30034 Project 1 - Quantitative Analysis

Dependencies

  • Language: Python 3.8.3

  • Packages / Libraries:

    • pandas
    • sklearn
    • seaborn
    • folium
    • numpy
    • math
    • geopandas
    • bokeh
    • matplotlib
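
Before opening the notebooks, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the list uses the import names given above (`sklearn` is the import name of the scikit-learn package, and `math` ships with Python).

```python
import importlib.util

# Import names taken from the dependency list above.
REQUIRED = [
    "pandas", "sklearn", "seaborn", "folium", "numpy",
    "math", "geopandas", "bokeh", "matplotlib",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any name reported as missing needs to be installed before the notebooks are run.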

Datasets

Some datasets are not used in the report.

Directory

  • raw_data: Original datasets, for yellow taxi trip data only.
  • data: Contains all preprocessed files and the small external datasets supporting the analysis.
  • plots: All plots and Map.html are saved here, both for data exploration and for report writing.
  • code:
    • Notebook 0 for "Download_data.ipynb".
    • Notebook 1 for "1. Preprocessing_2020_whole_year.ipynb".
    • Notebook 2 for "1. Preprocessing-2019.ipynb".
    • Notebook 3 for "1. Preprocessing-2020.ipynb".
    • Notebook 4 for "2. Visual and Exploratory analysis part1.ipynb".
    • Notebook 5 for "2.Visual and Exploratory analysis part2.ipynb".
    • Notebook 6 for "3. Statistical Modelling.ipynb".
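
The notebooks listed above are meant to be run in order. As a rough sketch of how that order could be scripted rather than clicked through by hand, the snippet below builds one `jupyter nbconvert --execute` command per notebook. It assumes the notebooks sit directly in the `code` directory and that Jupyter is installed; `build_commands` and `NOTEBOOKS_DIR` are hypothetical names, not part of the repository. The file paths inside the notebooks still have to be adjusted first.

```python
import subprocess
from pathlib import Path

# Assumption: the notebooks live directly in the "code" directory.
NOTEBOOKS_DIR = Path("code")

# File names copied from the directory listing, in run order.
NOTEBOOKS = [
    "Download_data.ipynb",
    "1. Preprocessing_2020_whole_year.ipynb",
    "1. Preprocessing-2019.ipynb",
    "1. Preprocessing-2020.ipynb",
    "2. Visual and Exploratory analysis part1.ipynb",
    "2.Visual and Exploratory analysis part2.ipynb",
    "3. Statistical Modelling.ipynb",
]


def build_commands(notebooks=NOTEBOOKS, directory=NOTEBOOKS_DIR):
    """Return one nbconvert command (as an argument list) per notebook."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute",
         "--inplace", str(Path(directory) / name)]
        for name in notebooks
    ]


if __name__ == "__main__":
    for command in build_commands():
        # check=True stops at the first notebook that fails to execute.
        subprocess.run(command, check=True)
```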

Other

  • Run the notebooks in the order listed above, and make sure Notebook 0 has been run first to download the raw data.
  • If you intend to run the code in a different environment, change the file paths to match your local machine.
  • Changing a file path usually means changing only the leading drive name; the rest of the path stays the same.
  • To run the notebooks successfully, you must change all file paths in the notebooks.
  • Some plots are not saved automatically by the notebooks and were captured manually as screenshots instead. They are all included; check the plots directory if in doubt.
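
One way to make the path changes described above less error-prone is to define the project root once and derive every other path from it. This is only a sketch under stated assumptions: `PROJECT_ROOT` and `project_path` are hypothetical names, the root shown is a placeholder, and the notebooks themselves may hard-code their paths differently.

```python
from pathlib import Path

# Placeholder root: replace with the location of the repository on your machine.
PROJECT_ROOT = Path("D:/Mast30034_2021_s2_project_1-junzhin")


def project_path(*parts, root=PROJECT_ROOT):
    """Join path components onto the project root."""
    return Path(root).joinpath(*parts)


# Directories named in the Directory section above.
RAW_DATA_DIR = project_path("raw_data")
DATA_DIR = project_path("data")
PLOTS_DIR = project_path("plots")
```

With this pattern, moving the project to another drive or machine means editing only `PROJECT_ROOT`.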

License: MIT License