zilliz-bootcamp / personalized_recommender_system

Build a personalized movie recommendation system based on PaddlePaddle and Milvus

❗❗ This repo will no longer be maintained. Please visit https://github.com/milvus-io/bootcamp instead. ❗❗

Personalized Recommender System Based on Milvus

Prerequisites

Environment requirements

The following table lists recommended configurations, which have been tested:

Component   Recommended Configuration
CPU         Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
GPU         GeForce GTX 1050 Ti 4GB
Memory      32GB
OS          Ubuntu 18.04
Software    Milvus 0.10.0
            pymilvus 0.2.13
            PaddlePaddle 1.6.1

Data source

The data source is the MovieLens million-scale dataset (ml-1m), created by GroupLens Research. Refer to ml-1m-README for more information.
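
The age, gender, and job codes that infer_milvus.py accepts (see step 3 below) line up with the user fields in ml-1m. As a rough illustration, the sketch below converts one record in the users.dat format (UserID::Gender::Age::Occupation::Zip-code) into those codes. The helper names encode_user and to_command are hypothetical and not part of this repo, and the mapping of raw age codes to the 0-6 index is an assumption based on the order of the age groups.

```python
# Sketch only: helper names are hypothetical and not part of the repo.

# Raw age codes in ml-1m users.dat, in ascending order. The position of a
# code in this list is assumed to be the 0-6 age index used by -a/--age.
ML1M_AGE_CODES = [1, 18, 25, 35, 45, 50, 56]
GENDER_CODES = {"M": 0, "F": 1}  # 0: male, 1: female


def encode_user(line):
    """Parse 'UserID::Gender::Age::Occupation::Zip-code' into index codes."""
    user_id, gender, age, occupation, _zip_code = line.strip().split("::")
    job = int(occupation)
    if not 0 <= job <= 20:
        raise ValueError("occupation out of range: %d" % job)
    return {
        "user_id": int(user_id),
        "age": ML1M_AGE_CODES.index(int(age)),  # ValueError on unknown age code
        "gender": GENDER_CODES[gender],         # KeyError on unknown gender
        "job": job,
    }


def to_command(user):
    """Build the matching infer_milvus.py invocation (without -i)."""
    return "python3 infer_milvus.py -a {age} -g {gender} -j {job}".format(**user)


if __name__ == "__main__":
    sample = "1::F::1::10::48067"  # a record in users.dat format
    print(to_command(encode_user(sample)))
```

For the sample record (female, raw age code 1, occupation 10), this yields the same arguments as Example 1 in step 3, minus the optional -i flag.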

Build a personalized recommender system based on Milvus

Follow the steps below to build a recommender system:

  1. Train the model.

    # run train.py
    $ python3 train.py

    This command generates a model file recommender_system.inference.model in the same folder.

  2. Generate test data.

    # Download movie data movies_origin.txt to the same folder
    $ wget https://raw.githubusercontent.com/milvus-io/bootcamp/0.5.3/demo/recommender_system/movies_origin.txt
    # Generate test data. The -f parameter is followed by the movie data filename.
    $ python3 get_movies_data.py -f movies_origin.txt

    The above commands generate movies_data.txt in the same folder.

  3. Use Milvus for personalized recommendation by running the following command:

    # Milvus performs personalized recommendation based on user attributes (age, gender, job)
    $ python3 infer_milvus.py -a <age> -g <gender> -j <job> [-i]
    # Example 1
    $ python3 infer_milvus.py -a 0 -g 1 -j 10 -i
    # Example 2
    $ python3 infer_milvus.py -a 6 -g 0 -j 16

    The following table describes arguments of infer_milvus.py.

    Parameter      Description
    -a/--age       Age group
                     0: "Under 18"
                     1: "18-24"
                     2: "25-34"
                     3: "35-44"
                     4: "45-49"
                     5: "50-55"
                     6: "56+"
    -g/--gender    Gender
                     0: male
                     1: female
    -j/--job       Job
                     0: "other" or not specified
                     1: "academic/educator"
                     2: "artist"
                     3: "clerical/admin"
                     4: "college/grad student"
                     5: "customer service"
                     6: "doctor/health care"
                     7: "executive/managerial"
                     8: "farmer"
                     9: "homemaker"
                     10: "K-12 student"
                     11: "lawyer"
                     12: "programmer"
                     13: "retired"
                     14: "sales/marketing"
                     15: "scientist"
                     16: "self-employed"
                     17: "technician/engineer"
                     18: "tradesman/craftsman"
                     19: "unemployed"
                     20: "writer"
    -i/--infer     (Optional) Converts the test data to vectors and imports them into Milvus.

    Note: -i/--infer is required the first time you use Milvus for personalized recommendation, and again whenever you retrain and regenerate the model.

    The result displays the top 5 movies that the specified user might be interested in:

    get infer vectors finished!
    Server connected.
    Status(code=0, message='Create table successfully!')
    rows in table recommender_demo: 3883
    Top      Ids     Title   Score
    0        3030    Yojimbo         2.9444923996925354
    1        3871    Shane           2.8583481907844543
    2        3467    Hud     2.849525213241577
    3        1809    Hana-bi         2.826111316680908
    4        3184    Montana         2.8119677305221558

    To verify, run python3 infer_paddle.py. You can see that Paddle and Milvus generate the same result.
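
Matching results are what one would expect when both scripts rank the same movie vectors against the same user vector with the same similarity measure. The sketch below shows that kind of exhaustive top-k ranking in plain Python. It assumes inner-product similarity and uses made-up vectors and movie IDs. The metric, vector dimension, and score scaling that infer_milvus.py actually uses are not stated here, so treat this as an illustration of the idea rather than the repo's implementation.

```python
import heapq


def inner_product(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(user_vec, movie_vecs, k=5):
    """Return the k (movie_id, score) pairs with the highest scores.

    movie_vecs maps a movie ID to its feature vector. Every movie is
    scored, so the result is exact rather than approximate.
    """
    scored = (
        (movie_id, inner_product(user_vec, vec))
        for movie_id, vec in movie_vecs.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical 2-dimensional vectors, for illustration only.
    movies = {
        101: [1.0, 0.0],
        102: [0.0, 1.0],
        103: [1.0, 1.0],
        104: [0.5, 0.5],
    }
    user = [1.0, 2.0]
    for rank, (movie_id, score) in enumerate(top_k(user, movies, k=3)):
        print(rank, movie_id, score)
```

A vector database performs the same ranking over a stored collection, which is why the -i step that imports the movie vectors has to run before the first query.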
