haisimao/RecLearn

RecLearn

RecLearn (Recommender Learning) is a recommender-system learning framework for students and beginners, built on Python and TensorFlow 2.x. It summarizes the contents of the master branch of Recommender System with TF2.0. If you are more comfortable with the master branch, you can clone the entire package, run the algorithms in example, and modify the contents of model and layer. The implemented recommendation algorithms are grouped by the two stages in which they are applied in industry:

Matching recommendation stage (Top-k recommendation)
Ranking recommendation stage (CTR prediction model)

Update

04/23/2022: updated all matching models.

Installation

Package

RecLearn is on PyPI, so you can use pip to install it.

pip install reclearn

Dependencies:

Python 3.8+
TensorFlow 2.5+ (GPU or CPU build)
scikit-learn 0.23+
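The version requirements above can be checked before running any example. The following is a minimal sketch, not part of RecLearn: the helper names parse_version and check_environment are hypothetical, and the PyPI distribution names ("tensorflow", "scikit-learn") are assumptions. A GPU or CPU build installed under a different distribution name would be reported as missing.

```python
import sys
from importlib import metadata

# Minimum versions taken from the dependency list above.
# The distribution names are assumptions; adjust them to your installation.
REQUIRED = {"tensorflow": (2, 5), "scikit-learn": (0, 23)}


def parse_version(text):
    """Turn '2.5.0' or '2.5.0rc1' into a tuple of leading integers, e.g. (2, 5, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    for name, minimum in REQUIRED.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


problems = check_environment()
for line in problems or ["environment looks fine"]:
    print(line)
```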

Local

Clone RecLearn to your local machine:

git clone -b reclearn git@github.com:ZiyaoGeng/RecLearn.git

Quick Start

The example directory contains a demo for each recommendation model.

Matching

1. Divide the dataset.

Set the path of the raw dataset:

file_path = 'data/ml-1m/ratings.dat'

Split the dataset into training, validation, and test sets. For movielens-1m, Amazon-Beauty, Amazon-Games, and STEAM, you can call the corresponding method under data/datasets/* in RecLearn directly:

train_path, val_path, test_path, meta_path = ml.split_seq_data(file_path=file_path)

meta_path is the path of the meta file, which stores the maximum user index and the maximum item index.
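To make the splitting step concrete, here is a minimal sketch of a leave-one-out split, a common choice for sequential recommendation. Whether ml.split_seq_data does exactly this is an assumption. The function leave_one_out_split and the toy data are hypothetical, and the sketch works in memory instead of writing files.

```python
from collections import defaultdict


def leave_one_out_split(interactions):
    """Split (user, item, timestamp) tuples into train/val/test per user.

    For each user, the most recent item goes to test, the one before it to
    validation, and the rest to training. Users with fewer than three
    interactions are kept entirely in training.
    """
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))

    train, val, test = {}, {}, {}
    max_user, max_item = 0, 0
    for user, events in by_user.items():
        items = [item for _, item in sorted(events)]  # oldest first
        max_user = max(max_user, user)
        max_item = max(max_item, max(items))
        if len(items) < 3:
            train[user] = items
            continue
        train[user], val[user], test[user] = items[:-2], items[-2], items[-1]
    # Plays the role of the meta file: the largest user and item indexes.
    meta = {"max_user": max_user, "max_item": max_item}
    return train, val, test, meta


# Toy (user, item, timestamp) records, not real MovieLens data.
ratings = [(1, 10, 3), (1, 11, 1), (1, 12, 2), (1, 13, 4), (2, 10, 1), (2, 14, 2)]
train, val, test, meta = leave_one_out_split(ratings)
print(train, val, test, meta)
```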

2. Load the dataset.

Load the training, validation, and test datasets, and generate several negative samples (by random sampling) for each positive sample. The data is stored as a dictionary:

data = {'pos_item':, 'neg_item': , ['user': , 'click_seq': ,...]}

If you're building a sequential recommendation model, you also need to include click sequences. RecLearn provides methods for loading the data for the four datasets above:

# general recommendation model
train_data = ml.load_data(train_path, neg_num, max_item_num)
# sequential recommendation model that also uses the user feature
train_data = ml.load_seq_data(train_path, "train", seq_len, neg_num, max_item_num, contain_user=True)
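The random negative sampling described above can be sketched as follows. This is an illustration under stated assumptions, not RecLearn's loader: build_samples is a hypothetical helper, item ids are assumed to run from 1 to max_item_num, and plain lists stand in for the arrays a real loader would return.

```python
import random


def build_samples(user_items, neg_num, max_item_num, seed=0):
    """Build a {'user', 'pos_item', 'neg_item'} dictionary with random negatives.

    user_items maps a user id to the list of items that user interacted with.
    Assumes every user has left at least one item unseen; otherwise the
    sampling loop would never finish.
    """
    rng = random.Random(seed)
    data = {"user": [], "pos_item": [], "neg_item": []}
    for user, items in user_items.items():
        seen = set(items)
        for pos in items:
            negatives = []
            while len(negatives) < neg_num:
                candidate = rng.randint(1, max_item_num)
                if candidate not in seen:  # never sample an item the user interacted with
                    negatives.append(candidate)
            data["user"].append(user)
            data["pos_item"].append(pos)
            data["neg_item"].append(negatives)
    return data


data = build_samples({1: [11, 12], 2: [10, 14]}, neg_num=3, max_item_num=20)
print(len(data["pos_item"]), len(data["neg_item"][0]))
```

Each positive item gets its own row of neg_num negatives, so 'pos_item' has one entry per interaction and 'neg_item' has one list of length neg_num per interaction.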

3. Set hyperparameters.

Each model needs its hyperparameters specified. Taking the BPR model as an example:

model_params = {
    'user_num': max_user_num + 1,
    'item_num': max_item_num + 1,
    'embed_dim': FLAGS.embed_dim,
    'use_l2norm': FLAGS.use_l2norm,
    'embed_reg': FLAGS.embed_reg
}
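The snippet above relies on FLAGS, max_user_num, and max_item_num being defined elsewhere. As a rough sketch of one way to supply them, the following uses argparse as a stand-in for FLAGS. The functions parse_flags and build_model_params, the default values, and the index values are all hypothetical and are not taken from the RecLearn examples.

```python
import argparse


def parse_flags(argv=None):
    """Hypothetical stand-in for FLAGS; the defaults are illustrative, not tuned."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--embed_dim", type=int, default=64)
    parser.add_argument("--use_l2norm", action="store_true")
    parser.add_argument("--embed_reg", type=float, default=0.0)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    return parser.parse_args(argv)


def build_model_params(flags, max_user_num, max_item_num):
    """Assemble the hyperparameter dictionary in the shape shown above."""
    return {
        # Indexes up to max_*_num must be valid lookups,
        # so each embedding table needs max index + 1 rows.
        "user_num": max_user_num + 1,
        "item_num": max_item_num + 1,
        "embed_dim": flags.embed_dim,
        "use_l2norm": flags.use_l2norm,
        "embed_reg": flags.embed_reg,
    }


FLAGS = parse_flags([])  # empty argv: use the defaults
# Illustrative maximum indexes; in practice they would be read from the meta file.
model_params = build_model_params(FLAGS, max_user_num=6040, max_item_num=3952)
print(model_params)
```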

4. Build and compile the model.

Select or build the model you need and compile it. Take 'BPR' as an example:

model = BPR(**model_params)
model.compile(optimizer=Adam(learning_rate=FLAGS.learning_rate))
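For readers new to BPR, the pairwise objective from the BPR paper can be sketched in a few lines: the model is pushed to score each positive item above its sampled negative. This is a plain-Python illustration of the loss term only. Whether the BPR class in RecLearn computes exactly this (for example with several negatives per positive, or with added regularization) is not confirmed here.

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean BPR loss: -log(sigmoid(pos - neg)), averaged over (pos, neg) pairs."""
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        diff = pos - neg
        # -log(sigmoid(x)) equals log(1 + exp(-x)); this form avoids overflow.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)


# The loss shrinks as the positive item is scored further above the negative one.
print(bpr_loss([0.0], [0.0]))  # equal scores: log(2)
print(bpr_loss([5.0], [0.0]))  # positive clearly ahead: small loss
print(bpr_loss([0.0], [5.0]))  # negative ahead: large loss
```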

If you want to inspect the structure of the model, you can call the summary method after compilation to print it:

model.summary()

5. Train the model and evaluate it on the test dataset.

from time import time  # used below to time training and evaluation

for epoch in range(1, epochs + 1):
    t1 = time()
    model.fit(
        x=train_data,
        epochs=1,
        validation_data=val_data,
        batch_size=batch_size
    )
    t2 = time()
    eval_dict = eval_pos_neg(model, test_data, ['hr', 'mrr', 'ndcg'], k, batch_size)
    print('Iteration %d Fit [%.1f s], Evaluate [%.1f s]: HR = %.4f, MRR = %.4f, NDCG = %.4f'
          % (epoch, t2 - t1, time() - t2, eval_dict['hr'], eval_dict['mrr'], eval_dict['ndcg']))
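The three metrics reported by the loop above can be computed from the score of each test positive and the scores of its sampled negatives. The sketch below shows the standard definitions for the single-positive case. It is not the implementation of eval_pos_neg; the names rank_of_positive and evaluate are hypothetical, and ties are counted against the positive item as a conservative choice.

```python
import math


def rank_of_positive(pos_score, neg_scores):
    """1-based rank of the positive item among itself and its negatives."""
    return 1 + sum(1 for s in neg_scores if s >= pos_score)


def evaluate(pos_scores, neg_scores_list, k):
    """Return HR@k, MRR@k and NDCG@k averaged over all test cases."""
    hr = mrr = ndcg = 0.0
    for pos, negs in zip(pos_scores, neg_scores_list):
        rank = rank_of_positive(pos, negs)
        if rank <= k:
            hr += 1.0                          # hit: positive is in the top k
            mrr += 1.0 / rank                  # reciprocal rank
            ndcg += 1.0 / math.log2(rank + 1)  # ideal DCG is 1 with one positive
    n = len(pos_scores)
    return {"hr": hr / n, "mrr": mrr / n, "ndcg": ndcg / n}


# Two toy test cases: the first positive ranks 1st, the second ranks 3rd.
result = evaluate([0.9, 0.2], [[0.1, 0.5], [0.3, 0.8]], k=2)
print(result)
```

With k=2, only the first case counts as a hit, so all three metrics come out at 0.5.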

Ranking

Coming soon.

Results

RecLearn's experimental setup differs from that of some papers, so the results may deviate from the published numbers. Please refer to Experiment for details.

Matching

Model	ml-1m HR@10	ml-1m MRR@10	ml-1m NDCG@10	Beauty HR@10	Beauty MRR@10	Beauty NDCG@10	STEAM HR@10	STEAM MRR@10	STEAM NDCG@10
BPR	0.5768	0.2392	0.3016	0.3708	0.2108	0.2485	0.7728	0.4220	0.5054
NCF	0.5834	0.2219	0.3060	0.5448	0.2831	0.3451	0.7768	0.4273	0.5103
DSSM	0.5498	0.2148	0.2929	-	-	-	-	-	-
YoutubeDNN	0.6737	0.3414	0.4201	-	-	-	-	-	-
MIND(Error)	0.6366	0.2597	0.3483	-	-	-	-	-	-
GRU4Rec	0.7969	0.4698	0.5483	0.5211	0.2724	0.3312	0.8501	0.5486	0.6209
Caser	0.7916	0.4450	0.5280	0.5487	0.2884	0.3501	0.8275	0.5064	0.5832
SASRec	0.8103	0.4812	0.5605	0.5230	0.2781	0.3355	0.8606	0.5669	0.6374
AttRec	0.7873	0.4578	0.5363	0.4995	0.2695	0.3229	-	-	-
FISSA	0.8106	0.4953	0.5713	0.5431	0.2851	0.3462	0.8635	0.5682	0.6391

Ranking

Model	Criteo (5M samples) Log Loss	Criteo (5M samples) AUC	Criteo Log Loss	Criteo AUC
FM	0.4765	0.7783	0.4762	0.7875
FFM	-	-	-	-
WDL	0.4684	0.7822	0.4692	0.7930
Deep Crossing	0.4670	0.7826	0.4693	0.7935
PNN	-	0.7847	-	-
DCN	-	0.7823	0.4691	0.7929
NFM	0.4773	0.7762	0.4723	0.7889
AFM	0.4819	0.7808	0.4692	0.7871
DeepFM	-	0.7828	0.4650	0.8007
xDeepFM	0.4690	0.7839	0.4696	0.7919

Model List

1. Matching Stage

Paper | Model	Published	Author
BPR: Bayesian Personalized Ranking from Implicit Feedback | MF-BPR	UAI, 2009	Steffen Rendle
Neural Collaborative Filtering | NCF	WWW, 2017	Xiangnan He
Learning Deep Structured Semantic Models for Web Search using Clickthrough Data | DSSM	CIKM, 2013	Po-Sen Huang
Deep Neural Networks for YouTube Recommendations | YoutubeDNN	RecSys, 2016	Paul Covington
Session-based Recommendations with Recurrent Neural Networks | GRU4Rec	ICLR, 2016	Balázs Hidasi
Self-Attentive Sequential Recommendation | SASRec	ICDM, 2018	UCSD
Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding | Caser	WSDM, 2018	Jiaxi Tang
Next Item Recommendation with Self-Attentive Metric Learning | AttRec	AAAI, 2019	Shuai Zhang
FISSA: Fusing Item Similarity Models with Self-Attention Networks for Sequential Recommendation | FISSA	RecSys, 2020	Jing Lin

2. Ranking Stage

Paper | Model	Published	Author
Factorization Machines | FM	ICDM, 2010	Steffen Rendle
Field-aware Factorization Machines for CTR Prediction | FFM	RecSys, 2016	Criteo Research
Wide & Deep Learning for Recommender Systems | WDL	DLRS, 2016	Google Inc.
Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features | Deep Crossing	KDD, 2016	Microsoft Research
Product-based Neural Networks for User Response Prediction | PNN	ICDM, 2016	Shanghai Jiao Tong University
Deep & Cross Network for Ad Click Predictions | DCN	ADKDD, 2017	Stanford University | Google Inc.
Neural Factorization Machines for Sparse Predictive Analytics | NFM	SIGIR, 2017	Xiangnan He
Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | AFM	IJCAI, 2017	Zhejiang University | National University of Singapore
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | DeepFM	IJCAI, 2017	Harbin Institute of Technology | Noah’s Ark Research Lab, Huawei
xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems | xDeepFM	KDD, 2018	University of Science and Technology of China
Deep Interest Network for Click-Through Rate Prediction | DIN	KDD, 2018	Alibaba Group

Discussion

If you have any suggestions or questions about the project, you can raise them in the project's Issues.
