louisdubaere / adv-python-ind

Repository for the individual assignment for Advanced Python

Individual Python Assignment using Dask

This project predicts the number of shared bikes used per hour in Washington D.C., using Dask. In-depth exploratory data analysis (EDA) is omitted because it was already covered in another assignment; this task focuses on the use of Dask.

Getting Started

This section goes over the packages you need to install in order to run the code.

Prerequisites

The notebook uses three packages:

  • Dask: A package that provides advanced parallelism for analytics, enabling performance at scale.
  • Dask ML: A package that provides scalable machine learning in Python using Dask.
  • Scikit Learn: A package for machine learning in Python that provides simple and efficient tools for data mining and data analysis.

These packages must be installed in the environment where the notebook is run.

This can be done using Anaconda-Navigator or by running the following commands in the terminal:

conda install dask
conda install dask-ml
conda install scikit-learn
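
After installing, it can help to confirm that the three packages are visible from the environment the notebook will use. The following is a minimal sketch using only the standard library; the helper name missing_packages and the REQUIRED mapping are illustrative and not part of the repository. It assumes the usual import names (dask, dask_ml, sklearn) for the conda packages listed above.

```python
from importlib.util import find_spec

# Maps the conda package name to the module name used in "import ...".
# The import names are assumed to be the usual ones for these packages.
REQUIRED = {
    "dask": "dask",
    "dask-ml": "dask_ml",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

Running this with the same Python interpreter that Jupyter uses prints the packages that still need to be installed, if any.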

Running the notebook

The notebook can be opened and run with Jupyter Notebook in an environment with the required packages installed.

The Data

The data can be found on the UCI Machine Learning Repository website.
