recommendation_system

Xccelerate Data Science Bootcamp Collaborative Project: 4 flavours of recommendation systems built on the Book-Crossing Dataset, which is included in this repo.

See the project's details here

How to Use this repo

Clone this repo:

$ git clone https://github.com/ohjho/recommendation_system.git
$ cd recommendation_system

Install the requirements. We highly recommend doing this inside a virtualenv to avoid dependency hell.

#---------------- optional ------------------
$ mkvirtualenv --python=`which python3` NameOfYourEnv
$ workon NameOfYourEnv
#--------------------------------------------

(NameOfYourEnv) $ pip install -r requirements.txt

Then run pip check and resolve any package dependency issues it reports. Once everything is consistent, it should say No broken requirements found.

Start Jupyter notebook

$ jupyter notebook

Data Cleaning

How to use data_cleaning.py

The script data_cleaning.py will import the datasets and clean the data.

To get 3 separate dataframes, do this:

from data_cleaning import get_clean_data
df_books, df_users, df_ratings = get_clean_data()

If the csv files are not under data/, use the path argument.
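As a rough illustration of the path argument, the sketch below looks for a directory that actually contains csv files before calling the cleaning function. The helper find_csv_dir is hypothetical and not part of data_cleaning.py, and the keyword name path is an assumption based on the sentence above; check the function signature in data_cleaning.py before relying on it.

```python
from pathlib import Path


def find_csv_dir(candidates):
    """Return the first directory in `candidates` holding at least one .csv file.

    Hypothetical helper for illustration only; returns None if nothing matches.
    """
    for candidate in candidates:
        directory = Path(candidate).expanduser()
        if directory.is_dir() and any(directory.glob("*.csv")):
            return directory
    return None


# Try the default location first, then a fallback of your choosing.
data_dir = find_csv_dir(["data", "../data"])

if data_dir is None:
    print("No directory with csv files found; pass the location explicitly.")
else:
    # Assumption: the argument is accepted as a keyword named `path`.
    # from data_cleaning import get_clean_data
    # df_books, df_users, df_ratings = get_clean_data(path=str(data_dir))
    print(f"csv files found under {data_dir}")
```

The import and the call are left commented out so the snippet runs on its own; uncomment them inside the repo once the directory is confirmed.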

To get one merged dataframe, do this:

from data_cleaning import get_merged_data_frame
df_merged = get_merged_data_frame(user_argv=user_threshold, isbn_argv=book_threshold)

where user_threshold filters out users who have rated fewer than this number of books, and book_threshold is the counterpart for books (books with fewer than this number of ratings are filtered out). If the csv files are not under data/, use the path argument.
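To make the two thresholds concrete, here is a minimal sketch of the filtering idea in plain Python. It is not the code in data_cleaning.py, which works on dataframes. The function filter_ratings, the tuple layout, and the toy ratings are all made up for illustration, and the sketch assumes both counts are taken on the unfiltered data; the actual script may apply the filters in sequence.

```python
from collections import Counter


def filter_ratings(ratings, user_threshold, book_threshold):
    """Keep only ratings whose user and book both meet their thresholds.

    `ratings` is a list of (user_id, isbn, rating) tuples (hypothetical layout).
    Users with fewer than `user_threshold` ratings are dropped, as are books
    with fewer than `book_threshold` ratings.
    """
    # Assumption: counts are computed once, on the unfiltered ratings.
    user_counts = Counter(user for user, _, _ in ratings)
    book_counts = Counter(isbn for _, isbn, _ in ratings)
    return [
        (user, isbn, rating)
        for user, isbn, rating in ratings
        if user_counts[user] >= user_threshold
        and book_counts[isbn] >= book_threshold
    ]


# Toy data, invented for this example.
ratings = [
    ("u1", "b1", 8),
    ("u1", "b2", 5),
    ("u2", "b1", 7),
    ("u3", "b3", 9),
]

kept = filter_ratings(ratings, user_threshold=2, book_threshold=2)
print(kept)  # [('u1', 'b1', 8)]
```

Only u1 has rated two books and only b1 has two ratings, so a single row survives. Raising either threshold shrinks the merged dataframe in the same way, which is worth keeping in mind when a model suddenly has very little data to train on.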

Presentation

Google Slides

About

MIT License
