junolee / joke-recommender

Built a recommendation system on a dataset of over 4 million joke ratings using GraphLab

Joke Recommender System

A recommendation system built on a dataset of over 4 million joke ratings. It implements an item similarity recommender and two factorization recommenders using GraphLab.

  • Task: Build a recommendation system for jokes
  • Data: User ratings ranging from -10 to 10
  • Scoring: Mean rating of the top 5% of jokes, as ranked by the recommender's predictions
  • Tools: GraphLab, Pandas

Exploratory Data Analysis

Distribution of Ratings, Average Joke Rating, Number of Ratings per Joke: eda.png

Choosing a Model

  • To accurately predict the rating a user would give a specific joke: Factorization Recommender
  • To optimize ranking quality rather than rating accuracy: Item Similarity Recommender or Ranking Factorization Recommender

More details here.
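
As a rough sketch of how the three model families could be fit with GraphLab Create, the helper below calls each family's `create` function with default settings. The helper name and the column names (`user_id`, `joke_id`, `rating`) are assumptions for illustration and are not taken from the repository. The GraphLab module is passed in as an argument so the sketch can be read and imported without GraphLab installed.

```python
def build_recommenders(gl, train, user_id="user_id", item_id="joke_id", target="rating"):
    """Fit one model per family and return them keyed by name.

    `gl` is the imported GraphLab Create module (`import graphlab as gl`) and
    `train` is a `gl.SFrame` of observed ratings. The column names are
    placeholders, not taken from the repository.
    """
    columns = dict(user_id=user_id, item_id=item_id, target=target)
    rec = gl.recommender
    return {
        # Fits latent factors to predict the rating in the target column.
        "factorization": rec.factorization_recommender.create(train, **columns),
        # Also fits latent factors, but with a ranking-oriented objective.
        "ranking_factorization": rec.ranking_factorization_recommender.create(train, **columns),
        # Scores jokes by their similarity to jokes the user already rated.
        "item_similarity": rec.item_similarity_recommender.create(train, **columns),
    }
```

With GraphLab available, usage would look like `models = build_recommenders(gl, ratings_sframe)`, where `ratings_sframe` is a hypothetical SFrame holding the rating data.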

Tuning Parameters for Factorization Recommenders

Number of Latent Features:

  • Factorization Recommender:
    • Default: 8
    • Optimal: 2
  • Ranking Factorization Recommender:
    • Default: 32
    • Optimal: 8

Regularization: left at the default value
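
The README reports the default and chosen values but not how the search was run. One plausible approach is a simple sweep over `num_factors`, sketched below. The function name, the candidate grid, and the `evaluate` callable are all hypothetical; `create` stands for a model constructor such as `graphlab.recommender.factorization_recommender.create`, and `evaluate` is assumed to return a validation score where higher is better.

```python
def sweep_num_factors(create, train, evaluate, candidates=(2, 4, 8, 16, 32), **create_kwargs):
    """Fit one model per candidate value and return (best_num_factors, scores).

    `create` is called as create(train, num_factors=n, **create_kwargs).
    `evaluate` maps a fitted model to a score; higher is treated as better.
    Regularization is not passed, so the library default applies.
    """
    scores = {}
    for n in candidates:
        model = create(train, num_factors=n, **create_kwargs)
        scores[n] = evaluate(model)
    # Pick the candidate with the highest validation score.
    best = max(scores, key=scores.get)
    return best, scores
```

Scoring each candidate on held-out ratings, rather than on the training data, helps keep the sweep from simply favoring the largest number of latent features.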

Tuning Parameters

Model Results

  • Factorization Recommender (num_factors=2)
    • Score: 2.5001
  • Ranking Factorization Recommender (num_factors=8)
    • Score: 2.1726
  • Item Similarity Recommender (similarity_type=pearson)
    • Score: 2.5301
  • Item Similarity Recommender (similarity_type=cosine)
    • Score: 2.3010
  • Item Similarity Recommender (similarity_type=jaccard)
    • Score: 1.5739
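
To make the three `similarity_type` options concrete, here is a minimal plain Python 3 sketch of textbook Jaccard, cosine, and Pearson similarity between two jokes. Each joke is represented as a hypothetical dict mapping user ID to rating. These are standard definitions, and GraphLab's internal implementation may differ in details such as how missing ratings are handled.

```python
import math


def jaccard(a, b):
    """Overlap of the sets of users who rated each joke; ignores rating values."""
    users_a, users_b = set(a), set(b)
    union = users_a | users_b
    return len(users_a & users_b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine of the angle between rating vectors, treating missing ratings as 0."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pearson(a, b):
    """Correlation of ratings over the users who rated both jokes."""
    common = sorted(a.keys() & b.keys())
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[u] for u in common) / len(common)
    mean_b = sum(b[u] for u in common) / len(common)
    cov = sum((a[u] - mean_a) * (b[u] - mean_b) for u in common)
    var_a = sum((a[u] - mean_a) ** 2 for u in common)
    var_b = sum((b[u] - mean_b) ** 2 for u in common)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0
```

Jaccard looks only at which users rated a joke and discards the rating values themselves, which may be one reason it scores lowest here on data where ratings range from -10 to 10.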

Languages

Language:Python 100.0%