ohjho / recommendation_system

Xccelerate Data Science Bootcamp Collaborative Project: 4 flavours of recommendation systems

recommendation_system

Xccelerate Data Science Bootcamp Collaborative Project: 4 flavours of recommendation systems built on the Book-Crossing Dataset, which is also included in this repo.

See the project's details here

How to Use this repo

  1. Clone this repo:
$ git clone https://github.com/ohjho/recommendation_system.git
$ cd recommendation_system
  2. Install the requirements. We highly recommend doing this inside a virtualenv to avoid dependency hell.
#---------------- optional ------------------
$ mkvirtualenv --python=`which python3` NameOfYourEnv
$ workon NameOfYourEnv
#--------------------------------------------

(NameOfYourEnv) $ pip install -r requirements.txt

Then run pip check and resolve any package dependency issues it reports. When nothing is broken, it prints No broken requirements found.

  3. Start Jupyter Notebook:
$ jupyter notebook
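
The dependency check from the install step can also be scripted, for example in the first cell of a notebook. The following is only a sketch: the function names summarize_pip_check and run_pip_check are hypothetical and not part of this repo, and it assumes pip check exits with a non-zero status when it finds a problem.

```python
import subprocess
import sys


def summarize_pip_check(returncode, output):
    """Turn a pip check result into (ok, list of reported problem lines)."""
    if returncode == 0:
        return True, []
    problems = [line.strip() for line in output.splitlines() if line.strip()]
    return False, problems


def run_pip_check():
    """Run `pip check` with the current interpreter and summarize the result."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "check"],
        capture_output=True,
        text=True,
    )
    return summarize_pip_check(result.returncode, result.stdout)


if __name__ == "__main__":
    ok, problems = run_pip_check()
    print("No broken requirements found." if ok else "\n".join(problems))
```

Using sys.executable keeps the check tied to the interpreter of the active virtualenv rather than whichever pip happens to be first on the PATH.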

Data Cleaning

How to use data_cleaning.py

The script data_cleaning.py will import the datasets and clean the data.

To get 3 separate dataframes, do this:

from data_cleaning import get_clean_data
df_books, df_users, df_ratings = get_clean_data()

If the CSV files are not under data/, use the path argument to point the function at their location.
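
Before pointing the path argument at another folder, it can help to confirm that the folder actually holds the three CSV files. The helper below is a sketch under assumptions, not part of data_cleaning.py: the function name missing_csv_files is hypothetical, and the file names are those of the public Book-Crossing dump, which may differ from the names this repo expects.

```python
from pathlib import Path

# Assumed file names (from the public Book-Crossing dump); adjust to match the repo.
EXPECTED_FILES = ("BX-Books.csv", "BX-Users.csv", "BX-Book-Ratings.csv")


def missing_csv_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in folder."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_csv_files("data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected CSV files found.")
```

An empty list means the folder looks complete. How the path argument expects the folder to be written (for example, with or without a trailing slash) is not stated here, so check data_cleaning.py before passing it.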

To get one merged dataframe, do this:

from data_cleaning import get_merged_data_frame
df_merged = get_merged_data_frame(user_argv=user_threshold, isbn_argv=book_threshold)

where user_threshold is the minimum number of books a user must have rated; users with fewer ratings are filtered out. book_threshold is the counterpart for books. If the CSV files are not under data/, use the path argument.
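
To make the two thresholds concrete, here is a minimal pandas sketch of this kind of filtering on a toy ratings table. It is not the code in data_cleaning.py: the function name filter_by_activity is hypothetical, the column names User-ID, ISBN and Book-Rating are assumed from the Book-Crossing ratings file, and whether the real script counts once or re-checks counts after filtering is not confirmed by this README.

```python
import pandas as pd


def filter_by_activity(df_ratings, user_threshold, book_threshold,
                       user_col="User-ID", book_col="ISBN"):
    """Hypothetical helper: keep rows whose user and book both meet a minimum count.

    Counts are computed once on the input frame, so a kept user may end up with
    fewer than user_threshold rows after sparsely rated books are dropped.
    """
    # Number of ratings made by each row's user, and received by each row's book.
    user_counts = df_ratings[user_col].map(df_ratings[user_col].value_counts())
    book_counts = df_ratings[book_col].map(df_ratings[book_col].value_counts())
    keep = (user_counts >= user_threshold) & (book_counts >= book_threshold)
    return df_ratings[keep].reset_index(drop=True)


# Toy data: user 3 rated only one book, and book "C" has only one rating.
toy_ratings = pd.DataFrame({
    "User-ID": [1, 1, 1, 2, 2, 3],
    "ISBN": ["A", "B", "C", "A", "B", "A"],
    "Book-Rating": [5, 7, 0, 8, 6, 9],
})
filtered = filter_by_activity(toy_ratings, user_threshold=2, book_threshold=2)
print(filtered)
```

With both thresholds set to 2, the rows for user 3 and for book "C" are dropped, leaving the four ratings that users 1 and 2 gave to books "A" and "B".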

Modeling

Presentation

Google Slides

About

License: MIT License


Languages

Jupyter Notebook 99.4%, Python 0.6%