
Scenario-Wise-Rec: Benchmark for Multi-Scenario Recommendation

Introduction

Scenario-Wise-Rec is an open-source benchmark for multi-scenario/multi-domain recommendation. It provides 10 models across 6 datasets.

Dataset information is listed as follows:

Dataset	Domain number	Interaction	User	Item
MovieLens	Domain 0	210,747	1,325	3,429
	Domain 1	395,556	2,096	3,508
	Domain 2	393,906	2,619	3,595
KuaiRand	Domain 0	2,407,352	961	1,596,491
	Domain 1	7,760,237	991	2,741,383
	Domain 2	895,385	171	332,210
	Domain 3	402,366	832	547,908
	Domain 4	183,403	832	43,106
Ali-CCP	Domain 0	32,236,951	89,283	465,870
	Domain 1	639,897	2,561	188,610
	Domain 2	52,439,671	150,471	467,122
Tenrec	Domain 0	64,475,979	997,263	1,365,660
	Domain 1	54,277,815	989,911	791,826
	Domain 2	1,588,512	455,636	152,601
Douban	To be added	-	-	-
Mind	To be added	-	-	-

Model information is listed as follows:

Model	model-name	Link
Shared Bottom	SharedBottom	Link
MMOE	MMOE	Link
PLE	PLE	Link
SAR-Net	sarnet	Link
STAR	star	Link
M2M	m2m	Link
AdaSparse	adasparse	Link
AdaptDHM	adaptdhm	Link
EPNet	epnet	Link
PPNet	ppnet	Link

Installation

Install via `pip`

We provide a Python package scenario_wise_rec for users. Simply run:

pip install -i https://test.pypi.org/simple/ scenario-wise-rec

Note that the pip release may lag behind the most recent updates. If you want to use the latest features or develop on top of our code, install from source instead.

Install via GitHub (Recommended)

First, clone the repo:

git clone https://github.com/Xiaopengli1/Scenario-Wise-Rec.git

Then enter the repository directory:

cd Scenario-Wise-Rec

Finally, use pip to install the package:

pip install .

Usage

We provide examples in /examples/multi_domain_ranking/, with dataset samples in /examples/multi_domain_ranking/data. To test one directly, run:

python run_ali_ccp_ctr_ranking_multi_domain.py --model_name star

To download the full datasets and test on them, follow the steps below.

Step 1: Full Datasets Download

Multi-scenario/multi-domain datasets are available for download; see the following table. Statistics are currently listed for four of them, with Douban and Mind still to be added.

Dataset	Domain Number	Users	Items	Interactions	Download
MovieLens	3	6k	4k	1M	ML_Download
KuaiRand	5	1k	4M	11M	KR_Download
Ali-CCP	3	238k	467k	85M	AC_Download
Tenrec	3	1M	2M	120M	TR_Download
Douban	3	-	-	-	DB_Download
Mind	4	-	-	-	MD_Download

Replace the sampled dataset with the full dataset, and uncomment the corresponding code in the matching Python script.

Step 2: Run the code

python run_movielens_rank_multi_domain.py --model_name star --device "cuda:0" --seed 2022
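
To compare several models under the same settings, the command above can be generated in a loop. The sketch below only builds and prints the command lines. It assumes that the lowercase names from the model table are accepted by --model_name (only star is confirmed by the example command), and the helper build_command is a hypothetical name introduced here for illustration.

```python
import shlex
import sys

SCRIPT = "run_movielens_rank_multi_domain.py"  # script name from the command above


def build_command(model_name, device="cuda:0", seed=2022):
    """Return the argument list for one benchmark run (hypothetical helper)."""
    return [
        sys.executable, SCRIPT,
        "--model_name", model_name,
        "--device", device,
        "--seed", str(seed),
    ]


# Names taken from the model table; assumed (not verified) to be valid --model_name values.
MODELS = ["star", "sarnet", "m2m", "adasparse"]

if __name__ == "__main__":
    for name in MODELS:
        cmd = build_command(name)
        # Printed rather than executed; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
        print(shlex.join(cmd))
```

Keeping the device and seed fixed across runs makes the resulting numbers easier to compare between models.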

Build Your Own Multi-scenario Dataset/Model

We offer two template files, run_example.py and base_example.py, as a pipeline to help you process your own multi-scenario datasets and build your own multi-scenario models.

Instructions on processing your dataset

See run_example.py. Implement your dataset processing in the function get_example_dataset(input_path). Note that the feature "domain_indicator" is the one that indicates which domain each sample belongs to. For other implementation details, refer to the file.
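
As a rough illustration of the kind of preprocessing involved, the sketch below label-encodes categorical columns and derives an integer "domain_indicator" from a raw scenario column. It is not the code in run_example.py: the function names label_encode and add_domain_indicator, the column names (scenario, user_id, item_id, label), and the toy rows are all assumptions made for this example.

```python
def label_encode(values):
    """Map each distinct value to a contiguous integer id, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return mapping


def add_domain_indicator(rows, scenario_col="scenario", sparse_cols=("user_id", "item_id")):
    """Return (processed rows, number of domains).

    rows: list of dicts, one per interaction (hypothetical input format).
    """
    domain_map = label_encode(r[scenario_col] for r in rows)
    encoders = {c: label_encode(r[c] for r in rows) for c in sparse_cols}
    processed = []
    for r in rows:
        out = {c: encoders[c][r[c]] for c in sparse_cols}
        # Integer id of the domain this sample belongs to.
        out["domain_indicator"] = domain_map[r[scenario_col]]
        out["label"] = int(r["label"])
        processed.append(out)
    return processed, len(domain_map)


# Toy interactions (made-up data, for illustration only).
raw = [
    {"user_id": "u1", "item_id": "i9", "scenario": "homepage", "label": 1},
    {"user_id": "u2", "item_id": "i9", "scenario": "search", "label": 0},
    {"user_id": "u1", "item_id": "i3", "scenario": "homepage", "label": 0},
]
data, domain_num = add_domain_indicator(raw)
print(domain_num)                              # 2
print([d["domain_indicator"] for d in data])   # [0, 1, 0]
```

Encoding domains as contiguous integers starting at 0 lets a model use the indicator directly as an index into per-domain components.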

Instructions on building your model

See base_example.py, where you can build your own model. We leave two placeholders for users to implement the scenario-shared and scenario-specific parts of the model, along with comments on how to format the output dimension. Please refer to the file for more details.
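
To make the shared/specific split concrete, here is a minimal PyTorch sketch; it is not the template in base_example.py. A scenario-shared layer feeds one scenario-specific head per domain, and each sample's domain_indicator selects which head's output is kept. The class name SharedSpecificSketch, the layer sizes, and the assumption that the model returns one probability per sample (shape [batch_size]) are illustrative choices, not the benchmark's actual interface.

```python
import torch
import torch.nn as nn


class SharedSpecificSketch(nn.Module):
    """Scenario-shared bottom plus one scenario-specific head per domain (illustrative)."""

    def __init__(self, input_dim, domain_num, hidden_dim=16):
        super().__init__()
        # Scenario-shared part: the same parameters serve every domain.
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Scenario-specific part: an independent output head per domain.
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, 1) for _ in range(domain_num)])

    def forward(self, x, domain_indicator):
        h = self.shared(x)                                            # [batch, hidden_dim]
        logits = torch.cat([head(h) for head in self.heads], dim=1)   # [batch, domain_num]
        # For each sample, keep the logit of the head matching its domain.
        picked = logits.gather(1, domain_indicator.long().unsqueeze(1)).squeeze(1)
        return torch.sigmoid(picked)                                  # [batch]


torch.manual_seed(0)
model = SharedSpecificSketch(input_dim=8, domain_num=3)
x = torch.randn(4, 8)                          # random dense features for 4 samples
domain_indicator = torch.tensor([0, 2, 1, 0])  # domain id of each sample
y = model(x, domain_indicator)
print(y.shape)                                 # torch.Size([4])
```

Evaluating every head for every sample and then gathering is simple but wasteful when there are many domains; masking the batch by domain is a common alternative.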

Contributing

We welcome any contribution that helps improve the benchmark. Please fork the repo and create a pull request, or open an issue if you have any questions. Don't forget to give the project a star. Thanks again!

Credits

Our code draws on Torch-RecHub. Thanks to its authors for their contribution.
