This project uses Dask to predict the number of shared bikes used per hour in Washington D.C. An in-depth EDA is left out because it was already done in another assignment, and this task focuses on the use of Dask.
This section goes over the packages you need to install in order to run the code.
The Python notebook uses three packages:
- Dask: A package that provides advanced parallelism for analytics, enabling performance at scale.
- Dask ML: A package that provides scalable machine learning in Python using Dask.
- scikit-learn: A package for machine learning in Python that provides simple and efficient tools for data mining and data analysis.
These packages must be installed in the environment where the Python notebook is run.
This can be done using Anaconda-Navigator or by running the following commands in the terminal:
conda install dask
conda install dask-ml
conda install scikit-learn
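To confirm that the installation worked before opening the notebook, a small check along these lines may help. It is only a sketch and not part of the project: the helper name `missing_modules` and the constant `REQUIRED_MODULES` are made up for illustration. It relies on the fact that the import names differ from the conda package names (`dask-ml` is imported as `dask_ml`, `scikit-learn` as `sklearn`).

```python
from importlib.util import find_spec

# Import names differ from the conda package names:
# dask-ml is imported as dask_ml, scikit-learn as sklearn.
# REQUIRED_MODULES and missing_modules are hypothetical names for this sketch.
REQUIRED_MODULES = ("dask", "dask_ml", "sklearn")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed,
    # without actually importing the module.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Run it with the Python interpreter of the same environment that Jupyter will use. If it reports a missing module, the corresponding `conda install` command above was probably run in a different environment.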
The notebook can be opened and run with Jupyter Notebook in an environment that has the required packages installed.
The data can be found on the UCI Machine Learning Repository website.