TRD

This repository contains the code for Top-aware Recommender Distillation with Deep Reinforcement Learning (TRD). TRD distills knowledge from state-of-the-art recommenders and uses it to further improve recommendation performance at the top positions of the list.

Modules of TRD

For clarity, we use the MovieLens-100k dataset as an example and treat the BPRMF method as the teacher model in the TRD framework.

Data preprocess (data_generator.py)
- Filter the dataset and split it into training, validation, and test sets.
Training the teacher model (run_pair_mf_train.py)
- The teacher model is trained on the historical user-item interaction data. After training, we obtain the distilled knowledge (i.e., the user and item embeddings as well as the basic recommendation list) from the teacher model.
Training the student model (run_trd.py)
- We treat the distilled knowledge as the input and adopt a Deep Q-Network (DQN) [1] as the student model. The student model aims to reinforce and refine the basic recommendation list.

Example of running the code

First, build the required extensions.
```
python setup.py build_ext --inplace
```
Then run the following script to load the dataset and produce the experiment data. To use another dataset, modify the code in data_generator.py.
```
python data_generator.py
```
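
The filter-and-split step can be pictured with the sketch below. It is an illustration only and does not reproduce the logic in data_generator.py: the function names, the minimum-interaction threshold, and the random 80/10/10 split are all assumptions chosen for the example.

```python
import random
from collections import Counter


def filter_interactions(interactions, min_count=5):
    """Keep (user, item) pairs whose user and item each occur at least
    `min_count` times. Single pass; the threshold is a hypothetical value."""
    user_counts = Counter(u for u, _ in interactions)
    item_counts = Counter(i for _, i in interactions)
    return [(u, i) for u, i in interactions
            if user_counts[u] >= min_count and item_counts[i] >= min_count]


def split_interactions(interactions, val_ratio=0.1, test_ratio=0.1, seed=42):
    """Shuffle with a fixed seed, then cut into train / validation / test."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Toy data: 10 users x 10 items, every pair observed once.
toy = [(u, i) for u in range(10) for i in range(10)]
kept = filter_interactions(toy, min_count=5)
train, val, test = split_interactions(kept)
print(len(train), len(val), len(test))  # prints: 80 10 10
```

A fixed seed keeps the split reproducible between runs, which matters when the teacher and the student are trained in separate steps on the same data.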

Next, run the teacher model to obtain the distilled knowledge. The available arguments are listed in the help message: `python run_pair_mf_train.py --help`

python run_pair_mf_train.py --dataset=ml-100k --prepro=origin
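
To make the "basic recommendation list" concrete: for an embedding-based teacher such as BPRMF, a list of this kind is commonly built by scoring every unseen item with the dot product of the user and item embeddings and keeping the highest-scoring items. The sketch below assumes that scoring rule. The function name, the toy embeddings, and the exclusion of already-seen items are illustrative assumptions, not the output format of run_pair_mf_train.py.

```python
import heapq


def basic_recommendation_list(user_vec, item_vecs, seen_items, k=3):
    """Return the ids of the k highest-scoring unseen items for one user."""
    scored = []
    for item_id, item_vec in enumerate(item_vecs):
        if item_id in seen_items:
            continue  # skip items the user has already interacted with
        score = sum(u * v for u, v in zip(user_vec, item_vec))
        scored.append((score, item_id))
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Toy 2-dimensional embeddings (hypothetical values).
user_vec = [1.0, 0.5]
item_vecs = [
    [0.9, 0.1],  # item 0
    [0.2, 0.8],  # item 1
    [0.5, 0.5],  # item 2
    [1.0, 1.0],  # item 3, already seen by the user
    [0.0, 0.2],  # item 4
]
top_items = basic_recommendation_list(user_vec, item_vecs, seen_items={3}, k=3)
print(top_items)  # prints: [0, 2, 1]
```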

To use a different teacher model, run the corresponding file:

| Teacher Model | Corresponding run file |
| --- | --- |
| MostPop | `run_mostpop_train.py` |
| ItemKNN | `run_itemknncf_train.py` |
| BPRMF | `run_pair_mf_train.py` |
| Item2Vec | `run_item2Vec_train.py` |
| NeuMF | `run_pair_neumf_train.py` |

Each of these scripts also accepts `--help`, which lists its arguments.

Finally, train the student model and generate the refined recommendation lists on the test set. The available arguments are listed in the help message: `python run_trd.py --help`
```
python run_trd.py --dataset=ml-100k --prepro=origin --method=bprmf --n_actions=20 --pred_score=0
```
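
For readers new to DQN [1], the sketch below shows two ingredients the algorithm relies on: epsilon-greedy action selection and the one-step temporal-difference target, reward + gamma * max Q(next state, action). It is a generic, framework-free illustration and does not reproduce the network, state, or reward used in run_trd.py. The function names, the numeric values, and the reading of an action as "pick one candidate item" are assumptions made for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """One-step Q-learning target: reward + gamma * max over next Q-values.

    At the end of an episode (done=True) there is no next state, so the
    target is the reward alone.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


rng = random.Random(0)
q_values = [0.1, 0.7, 0.3]  # hypothetical Q-values for 3 candidate items

# epsilon=0.0 disables exploration, so the highest Q-value (index 1) wins.
action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)

# Hypothetical reward of 1.0, e.g. for placing a relevant item in the list.
target = td_target(reward=1.0, next_q_values=[0.2, 0.5], gamma=0.9)
print(action, target)
```

In a full DQN the Q-values come from a neural network, and the network is trained to reduce the gap between its prediction for the chosen action and this target.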

Prerequisites

Required environment

Python 3.6
Torch (>=1.1.0)
Numpy (>=1.18.0)
Pandas (>=0.24.0)
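
A small helper along the following lines can check an existing environment against the versions listed above. This is a sketch, not part of the repository: the function names are hypothetical, and treating Python 3.6 as a minimum rather than an exact version is an assumption.

```python
import importlib
import re
import sys

# Minimum versions copied from the list above; keys are import names.
MINIMUMS = {"torch": "1.1.0", "numpy": "1.18.0", "pandas": "0.24.0"}


def version_tuple(version):
    """Turn '1.18.0' or '1.1.0+cu101' into a tuple of ints such as (1, 18, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, required):
    """Compare two version strings, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info < (3, 6):  # assumption: 3.6 is treated as a minimum
        problems.append("Python 3.6 or newer is required")
    for name, required in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("%s is not installed" % name)
            continue
        installed = getattr(module, "__version__", "0")
        if not meets_minimum(installed, required):
            problems.append("%s %s found, %s required" % (name, installed, required))
    return problems


print(meets_minimum("1.18.5", "1.18.0"))  # prints: True
```

Calling `check_environment()` returns an empty list when every listed package is importable and recent enough.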

Datasets

Acknowledgements

We refer to the following repositories to improve our code:

- State-of-the-art recommendation algorithms: daisyRec [2]
- DDPG part: RL_DDPG_Recommendation

References

[1] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., Human-level control through deep reinforcement learning, Nature 518 (7540) (2015) 529-533.

[2] Z. Sun, D. Yu, H. Fang, J. Yang, X. Qu, J. Zhang, C. Geng, Are we evaluating rigorously? Benchmarking recommendation for reproducible evaluation and fair comparison, ACM RecSys, 2020.
