dszeto / predictionio-engine-ur

Universal Recommender optimized for deployment to Heroku

Home Page: http://actionml.com/universal-recommender

PredictionIO Universal Recommender for Heroku

A fork of the Universal Recommender version 0.5.0 deployable with the PredictionIO buildpack for Heroku. Due to substantial revisions to support Elasticsearch on Heroku, this fork lags behind the main UR; differences are listed in the UR release log.

The Universal Recommender (UR) is a new type of collaborative filtering recommender based on an algorithm that can use data from a wide variety of user taste indicators—it is called the Correlated Cross-Occurrence algorithm. …CCO is able to ingest any number of user actions, events, profile data, and contextual information. It then serves results in a fast and scalable way. It also supports item properties for filtering and boosting recommendations and can therefore be considered a hybrid collaborative filtering and content-based recommender.

upstream docs

The Heroku app depends on:

Demo Story 🐸

This engine demonstrates recommendation of items for a mobile phone user based on their purchase history. The model is trained with a small example data set.

The users and items are tagged with platform (ios and/or android), and the IDs are partitioned logically to make results easier to interpret:

  • 0xx => ios & android
  • 1xx => android-only
  • 2xx => ios-only
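The ID scheme above can be sketched as a small shell helper for reading query results. `platform_for_id` is a hypothetical name used only for illustration; it is not part of this repo, and it assumes three-digit IDs as in the list above.

```shell
# Classify a demo user or item ID by the platform its leading digit encodes.
# platform_for_id is a hypothetical helper, not something the repo provides.
platform_for_id() {
  case "$1" in
    0[0-9][0-9]) echo "ios & android" ;;
    1[0-9][0-9]) echo "android-only" ;;
    2[0-9][0-9]) echo "ios-only" ;;
    *)           echo "unknown" ;;
  esac
}

platform_for_id 100   # prints "android-only"
platform_for_id 200   # prints "ios-only"
```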

How To 📚

✏️ Throughout this document, code terms that start with $ represent a value (shell variable) that should be replaced with a customized value, e.g. $ENGINE_NAME

  1. ⚠️ Requirements
  2. 🚀 Demo Deployment
    1. Create the app
    2. Configure the app
    3. Provision Elasticsearch
    4. Provision Postgres
    5. Import data
    6. Deploy the app
    7. Scale up
    8. Retry release
    9. Diagnostics
  3. 🎯 Query for predictions
  4. 🛠 Local development
    1. Import sample data
    2. Run pio
    3. Query the local engine
  5. 🎛 Configuration options

Requirements

Demo Deployment

This is an adaptation of the normal PIO engine deployment.

Create the app

git clone \
  https://github.com/heroku/predictionio-engine-ur.git \
  pio-engine-ur

cd pio-engine-ur

heroku create $ENGINE_NAME
heroku buildpacks:add https://github.com/heroku/predictionio-buildpack.git

Configure the app

heroku config:set \
  PIO_EVENTSERVER_APP_NAME=ur \
  PIO_EVENTSERVER_ACCESS_KEY=$RANDOM-$RANDOM-$RANDOM-$RANDOM-$RANDOM-$RANDOM \
  PIO_UR_ELASTICSEARCH_CONCURRENCY=1
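`$RANDOM` is only available in some shells (such as bash). As an optional alternative, the sketch below draws the access key from /dev/urandom and keeps it in a variable so the same value can be reused elsewhere. `generate_access_key` is a hypothetical helper, and the 32-character hex format is an assumption of this example, not a requirement stated by the engine.

```shell
# Print 16 random bytes from /dev/urandom as 32 lowercase hex characters.
# generate_access_key is a hypothetical helper name.
generate_access_key() {
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

access_key=$(generate_access_key)
echo "${#access_key}"   # prints 32

# To apply it (not run here):
# heroku config:set PIO_EVENTSERVER_ACCESS_KEY="$access_key"
```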

Provision Elasticsearch

heroku addons:create bonsai --as PIO_ELASTICSEARCH --version 5.1
  • Verify that Elasticsearch is really version 5.x.
  • Some regions provide newer versions, like --version 5.3.
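One way to verify the version, sketched under assumptions: an Elasticsearch cluster's root endpoint returns JSON with a `version.number` field, and the helper below extracts the major version from it. `es_major_version` is a hypothetical name, and the config var `PIO_ELASTICSEARCH_URL` is assumed from the `--as PIO_ELASTICSEARCH` attachment name.

```shell
# es_major_version: read an Elasticsearch root-endpoint JSON response on stdin
# and print the major version number (hypothetical helper).
es_major_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Offline check against a trimmed sample response:
printf '{"version": {"number": "5.1.1"}}\n' | es_major_version   # prints 5

# Against the provisioned add-on (assumes the config var name below; not run here):
# curl -s "$(heroku config:get PIO_ELASTICSEARCH_URL)" | es_major_version
```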

Provision Postgres

heroku addons:create heroku-postgresql:hobby-dev
  • Use a higher-level, paid plan for anything but a small demo.
  • hobby-basic is the smallest paid heroku-postgresql plan

Import data

Initial training data is automatically imported from data/initial-events.json.

👓 When you're ready to begin working with your own data, read about strategies for importing data.

Deploy the app

git push heroku master

# Follow the logs to see training & web start-up
#
heroku logs -t

⚠️ Initial deploy will probably fail due to memory constraints. Proceed to scale up.

Scale up

Once deployed, scale up the processes to avoid memory issues:

heroku ps:scale \
  web=1:Standard-2X \
  release=0:Performance-L \
  train=0:Performance-L

💵 These are paid, professional dyno types

Retry release

When the release (pio train) fails due to memory constraints or another transient error, you may use the Heroku CLI releases:retry plugin to rerun the release without pushing a new deployment:

# First time, install it.
heroku plugins:install heroku-releases-retry

# Re-run the release & watch the logs
heroku releases:retry
heroku logs -t

Query for predictions

Once deployment completes, the engine is ready to recommend items for a mobile phone user based on their purchase history.

Get all recommendations for a user:

# an Android user
curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{"user": "100"}'
# an iPhone user
curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{"user": "200"}'

Get recommendations for a user, excluding phones:

curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{
            "user": "100",
            "fields": [{
              "name": "category",
              "values": ["phone"],
              "bias": 0
            }]
          }'

Get accessory recommendations for a user excluding phones & boosting power-related items:

curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{
            "user": "100",
            "fields": [{
              "name": "category",
              "values": ["phone"],
              "bias": 0
            },{
              "name": "category",
              "values": ["power"],
              "bias": 1.5
            }]
          }'
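Hand-written JSON bodies like the ones above are easy to leave unbalanced. As a minimal sketch, the two hypothetical helpers below (`ur_field` and `ur_query`, not part of this repo) assemble the same query shape from arguments. They do no JSON escaping, so names and values must not contain quotes or backslashes.

```shell
# ur_field NAME VALUE BIAS: print one entry for the query's "fields" array.
# In the examples above, a bias of 0 excludes matches and 1.5 boosts them.
ur_field() {
  printf '{"name": "%s", "values": ["%s"], "bias": %s}' "$1" "$2" "$3"
}

# ur_query USER [FIELD...]: print a query body for USER with the given fields.
ur_query() {
  user=$1
  shift
  fields=""
  for f in "$@"; do
    fields="${fields:+$fields, }$f"   # comma-separate after the first field
  done
  printf '{"user": "%s", "fields": [%s]}\n' "$user" "$fields"
}

body=$(ur_query 100 "$(ur_field category phone 0)" "$(ur_field category power 1.5)")
echo "$body"

# To send it (not run here):
# curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
#      -H "Content-Type: application/json" -d "$body"
```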

For a user with no purchase history, the recommendations will be based on popularity:

curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{"user": "000"}'

Get recommendations based on similarity with an item:

curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{"item": "101"}'

Get recommendations for a user boosting on similarity with an item:

curl -X "POST" "http://$ENGINE_NAME.herokuapp.com/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{
            "user": "100",
            "item": "101"
          }'

👓 See the main Universal Recommender query docs for more parameters. Please note those docs have been updated for the newest version 0.6.0, but this repo provides version 0.5.0. Differences are listed in the UR release log.

Local Development

Start in this repo's working directory. If you don't already have it cloned, then do it now:

git clone \
  https://github.com/heroku/predictionio-engine-ur.git \
  pio-engine-ur

cd pio-engine-ur

Next, set up local development including Elasticsearch using the buildpack.

Import sample data

bin/pio app new ur
PIO_EVENTSERVER_APP_NAME=ur data/import-events -f data/initial-events.json

Run pio

bin/pio build
bin/pio train -- --driver-memory 2500m
bin/pio deploy

Query the local engine

curl -X "POST" "http://127.0.0.1:8000/queries.json" \
     -H "Content-Type: application/json" \
     -d $'{
            "user": "100",
            "fields": [{
              "name": "category",
              "values": ["phone"],
              "bias": 0
            }]
          }'

Configuration

  • PIO_UR_ELASTICSEARCH_CONCURRENCY
    • defaults to 1
    • may be increased in line with the Bonsai Add-on plan's value for Concurrent Indexing
    • the max for a dedicated Elasticsearch cluster is "unlimited", but in reality set it to match the number of Spark executor cores
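The guidance above can be read as "use the plan's Concurrent Indexing limit, but no more than the Spark executor core count". That reading is an assumption of this sketch, and `es_concurrency` is a hypothetical helper, not something the buildpack provides.

```shell
# es_concurrency PLAN_LIMIT EXECUTOR_CORES
# Hypothetical helper: print the smaller of the plan's Concurrent Indexing
# limit and the Spark executor core count; "unlimited" means no plan cap.
es_concurrency() {
  plan_limit=$1
  cores=$2
  if [ "$plan_limit" = "unlimited" ] || [ "$plan_limit" -gt "$cores" ]; then
    echo "$cores"
  else
    echo "$plan_limit"
  fi
}

es_concurrency unlimited 4   # prints 4
es_concurrency 2 4           # prints 2

# To apply the value (not run here):
# heroku config:set PIO_UR_ELASTICSEARCH_CONCURRENCY="$(es_concurrency 2 4)"
```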

About

License: Apache License 2.0

