magicye / DRL4Recsys

Courses on Deep Reinforcement Learning (DRL) and DRL papers for recommender systems

Deep Reinforcement Learning for Recommender Systems

This repository collects courses and books on Deep Reinforcement Learning (DRL) and DRL papers for recommender systems.

(This is a fork of https://github.com/cszhangzhen/DRL4Recsys. In this fork, we added some recent papers and some comments and summaries.)

Courses

UCL Course on RL

http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

CS 294-112 at UC Berkeley

http://rail.eecs.berkeley.edu/deeprlcourse/

Stanford CS234: Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

Books

  1. Reinforcement Learning: An Introduction (Second Edition). Richard S. Sutton and Andrew G. Barto. book

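  2. Reinforcement Learning in Practice: The Technical Evolution and Business Innovation of Reinforcement Learning at Alibaba (强化学习实战:强化学习在阿里的技术演进和业务创新). ISBN: 9787121338984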

Papers

Search Keywords: Reinforcement, Policy, Reward ...

Survey Papers

  1. A Brief Survey of Deep Reinforcement Learning. Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, Anil Anthony Bharath. 2017. paper

  2. Deep Reinforcement Learning: An Overview. Yuxi Li. 2017. paper

  3. Deep Reinforcement Learning for Search, Recommendation, and Online Advertising: A Survey. SIGWEB 2019. paper

  4. ★★★
    Reinforcement learning based recommender systems: A survey. M. Mehdi Afsar, Trafford Crump, Behrouz Far. ACM Computing Surveys. 2021. paper

  5. A Survey on Reinforcement Learning for Recommender Systems. Yuanguo Lin, Yong Liu, Fan Lin, Pengcheng Wu, Wenhua Zeng, Chunyan Miao. 2021. paper

  6. A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions. X Chen, L Yao, J McAuley, G Zhou, X Wang. 2021. paper

Conference Papers

<2018

  1. An MDP-Based Recommender System. Guy Shani, David Heckerman, Ronen I. Brafman. JMLR 2005. paper
    MDP

  2. Usage-Based Web Recommendations: A Reinforcement Learning Approach. Nima Taghipour, Ahmad Kardan, Saeed Shiry Ghidary. Recsys 2007. paper

  3. A hybrid web recommender system based on q-learning. Nima Taghipour and Ahmad Kardan. 2008. In Proceedings of the 2008 ACM symposium on Applied computing. ACM, 1164–1168. paper
    Q-learning

  4. DJ-MC: A Reinforcement-Learning Agent for Music Playlist Recommendation. Elad Liebman, Maytal Saar-Tsechansky, Peter Stone. AAMAS 2015. paper

  5. Online Context-Aware Recommendation with Time Varying Multi-Armed Bandit. KDD 2016

  6. ★★★
    Deep Reinforcement Learning for List-wise Recommendations. Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Dawei Yin, Yihong Zhao, and Jiliang Tang. 2017. DRL4KDD'19. paper code(author) code
    Employs an Actor-Critic framework to learn the optimal strategy with an online simulator.
    JD

2018

  1. Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning. Jun Feng, Heng Li, Minlie Huang, Shichen Liu, Wenwu Ou, Zhirong Wang, Xiaoyan Zhu. WWW 2018. paper
    Uses multi-agent reinforcement learning to optimize multi-scenario ranking.

  2. Reinforcement Mechanism Design for e-commerce. Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang. WWW 2018. paper

  3. ★★★
    DRN: A Deep Reinforcement Learning Framework for News Recommendation. Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie, Zhenhui Li. WWW 2018. paper
    Pennsylvania State University, Microsoft
    DDQN
    no code


  4. (Workshop) RecoGym: A reinforcement learning environment for the problem of product recommendation in online advertising. David Rohde, Stephen Bonner, Travis Dunlop, Flavian Vasile, Alexandros Karatzoglou. RecSys 2018. paper code (Application of the code) (A competition)

  5. ★★★
    Deep Reinforcement Learning for Page-wise Recommendations. Xiangyu Zhao, Long Xia, Liang Zhang, Zhuoye Ding, Dawei Yin, Jiliang Tang. RecSys 2018. paper
    JD
    Adopts RL to recommend items on a 2-D page instead of showing a single item at a time.
    simulator used

  6. Why I like it: Multi-task Learning for Recommendation and Explanation. Yichao Lu, Ruihai Dong, Barry Smyth. RecSys 2018. paper
    Explanation
    No code

  7. ★★★
    Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning. Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Long Xia, Jiliang Tang, Dawei Yin. KDD 2018. paper
    JD
    DEERS
    Considers both positive and negative feedback from users' recent behaviors to help find the optimal strategy.
    simulator used
    no code

  8. Stabilizing Reinforcement Learning in Dynamic Environment with Application to Online Recommendation. Shi-Yong Chen, Yang Yu, Qing Da, Jun Tan, Hai-Kuan Huang, Hai-Hong Tang. KDD 2018. paper
    To mitigate the performance degradation caused by high-variance and biased reward estimates, the paper introduces stratified random sampling and an approximate regretted reward to make the model more robust.

  9. Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. Yujing Hu, Qing Da, Anxiang Zeng, Yang Yu, Yinghui Xu. KDD 2018. paper
    Introduces the DPG-FBE algorithm, which maintains an approximate model of the environment to perform reliable updates of value functions.

  10. A Reinforcement Learning Framework for Explainable Recommendation. Xiting Wang, Yiru Chen, Jie Yang, Le Wu, Zhengtao Wu, Xing Xie. ICDM 2018. paper

2019

  1. Top-K Off-Policy Correction for a REINFORCE Recommender System. Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, Ed H. Chi. WSDM 2019. paper reproduce
    YouTube

  2. ★★
    Generative Adversarial User Model for Reinforcement Learning Based Recommendation System. Xinshi Chen, Shuang Li, Hui Li, Shaohua Jiang, Yuan Qi, Le Song. ICML 2019. paper code(author) code code(tfv2) cited by

  3. Aggregating E-commerce Search Results from Heterogeneous Sources via Hierarchical Reinforcement Learning. Ryuichi Takanobu, Tao Zhuang, Minlie Huang, Jun Feng, Haihong Tang, Bo Zheng. WWW 2019. paper

  4. Policy Gradients for Contextual Recommendations. Feiyang Pan, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, Qing He. WWW 2019. paper

  5. ★★★
    Value-aware Recommendation based on Reinforcement Profit Maximization. Changhua Pei, Xinru Yang, Qing Cui, Xiao Lin, Fei Sun, Peng Jiang, Wenwu Ou, Yongfeng Zhang. WWW 2019.
    Value-aware

  6. Reinforcement Knowledge Graph Reasoning for Explainable Recommendation. Yikun Xian, Zuohui Fu, S. Muthukrishnan, Gerard de Melo, Yongfeng Zhang. SIGIR 2019. paper

  7. Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems. Lixin Zou, Long Xia, Zhuoye Ding, Jiaxing Song, Weidong Liu, Dawei Yin. KDD 2019. paper

  8. Environment reconstruction with hidden confounders for reinforcement learning based recommendation. Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, Jieping Ye. KDD 2019. paper

  9. Exact-K Recommendation via Maximal Clique Optimization. Yu Gong, Yu Zhu, Lu Duan, Qingwen Liu, Ziyu Guan, Fei Sun, Wenwu Ou, Kenny Q. Zhu. KDD 2019. paper

  10. Hierarchical Reinforcement Learning for Course Recommendation in MOOCs. Jing Zhang, Bowen Hao, Bo Chen, Cuiping Li, Hong Chen, Jimeng Sun. AAAI 2019. paper code


  11. Large-scale Interactive Recommendation with Tree-structured Policy Gradient. Haokun Chen, Xinyi Dai, Han Cai, Weinan Zhang, Xuejian Wang, Ruiming Tang, Yuzhou Zhang, Yong Yu. AAAI 2019. paper
    Tree-structured Policy Gradient

  12. Virtual-Taobao: Virtualizing real-world online retail environment for reinforcement learning. Jing-Cheng Shi, Yang Yu, Qing Da, Shi-Yong Chen, An-Xiang Zeng. AAAI 2019. paper code (third party code) A simulator
    Note: a custom environment cannot be built from your own dataset.

  13. ★★★
    A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation. Xueying Bai, Jian Guan, Hongning Wang. NeurIPS 2019. paper code(author) code
    Model-Based

  14. Text-Based Interactive Recommendation via Constraint-Augmented Reinforcement Learning. Ruiyi Zhang, Tong Yu, Yilin Shen, Hongxia Jin, Changyou Chen, Lawrence Carin. NeurIPS 2019. paper

  15. DRCGR: Deep reinforcement learning framework incorporating CNN and GAN-based for interactive recommendation. Rong Gao, Haifeng Xia, Jing Li, Donghua Liu, Shuai Chen, and Gang Chun. ICDM 2019. paper

  16. Reinforcement Learning to Diversify Top-N Recommendation. Lixin Zou, Long Xia, Zhuoye Ding, Dawei Yin, Jiaxing Song, Weidong Liu. DASFAA 2019. link

  17. PyRecGym: a reinforcement learning gym for recommender systems. Bichen Shi, Makbule Gulcin Ozsoy, Neil Hurley, Barry Smyth, Elias Z. Tragos, James Geraci, Aonghus Lawlor. RecSys 2019. link
    A simulator

2020

  1. Pseudo Dyna-Q: A Reinforcement Learning Framework for Interactive Recommendation. Lixin Zou, Long Xia, Pan Du, Zhuo Zhang, Ting Bai, Weidong Liu, Jian-Yun Nie, Dawei Yin. WSDM 2020. paper code

  2. End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding. Feng Liu, Huifeng Guo, Xutao Li, Ruiming Tang, Yunming Ye, Xiuqiang He. WSDM 2020. paper

  3. Reinforced Negative Sampling over Knowledge Graph for Recommendation. Xiang Wang, Yaokun Xu, Xiangnan He, Yixin Cao, Meng Wang, Tat-Seng Chua. WWW 2020. paper

  4. A Reinforcement Learning Framework for Relevance Feedback. Ali Montazeralghaem, Hamed Zamani, James Allan. SIGIR 2020. paper

  5. ★★★
    KERL: A Knowledge-Guided Reinforcement Learning Model for Sequential Recommendation. Pengfei Wang, Yu Fan, Long Xia, Wayne Xin Zhao, Shaozhang Niu, Jimmy Huang. SIGIR 2020. paper code
    Graph

  6. Self-Supervised Reinforcement Learning for Recommender Systems. Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Joemon Jose. SIGIR 2020. paper code
    Self-Supervised
    SQN Q-learning

  7. Reinforcement Learning to Rank with Pairwise Policy Gradient. Jun Xu, Zeng Wei, Long Xia, Yanyan Lan, Dawei Yin, Xueqi Cheng, Ji-Rong Wen. SIGIR 2020. paper

  8. ★★★
    MaHRL: Multi-goals Abstraction based Deep Hierarchical Reinforcement Learning for Recommendations. Dongyang Zhao, Liang Zhang, Bo Zhang, Lizhou Zheng, Yongjun Bao, Weipeng Yan. SIGIR 2020. paper
    (Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction. In Proceedings of SIGKDD 2019. paper)
    PKU / JD
    high-level agent and low-level agent
    Hierarchical
    Simulator used
    no code

  9. Leveraging Demonstrations for Reinforcement Recommendation Reasoning over Knowledge Graphs. Kangzhi Zhao, Xiting Wang, Yuren Zhang, Li Zhao, Zheng Liu, Chunxiao Xing, Xing Xie. SIGIR 2020. paper

  10. Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning. Sijin Zhou, Xinyi Dai, Haokun Chen, Weinan Zhang, Kan Ren, Ruiming Tang, Xiuqiang He, Yong Yu. SIGIR 2020. paper

  11. Adversarial Attack and Detection on Reinforcement Learning based Recommendation System. Yuanjiang Cao, Xiaocong Chen, Lina Yao, Xianzhi Wang, Wei Emma Zhang. SIGIR 2020. paper

  12. Reinforcement Learning based Recommendation with Graph Convolutional Q-network. Yu Lei, Hongbin Pei, Hanqi Yan, Wenjie Li. SIGIR 2020. paper

  13. Nonintrusive-Sensing and Reinforcement-Learning Based Adaptive Personalized Music Recommendation. D Hong, L Miao, Y Li. SIGIR 2020. paper

  14. Joint Policy-Value Learning for Recommendation. Olivier Jeunen, David Rohde, Flavian Vasile, Martin Bompaire. KDD 2020. link(with video) paper video code
    Combines value learning and policy learning.
    Uses the RecoGym simulation environment in experiments.

  15. Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems. Jin Huang, Harrie Oosterhuis, Maarten de Rijke, Herke van Hoof. RecSys 2020. paper

  16. Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication. Xu He, Bo An, Yanghua Li, Haikai Chen, Rundong Wang, Xinrun Wang, Runsheng Yu, Xin Li, Zhirong Wang. RecSys 2020. paper

  17. (Demo) Demonstrating Principled Uncertainty Modeling for Recommender Ecosystems with RecSim NG. Martin Mladenov, Chih-Wei Hsu, Vihan Jain, Eugene Ie, Christopher Colby, Nicolas Mayoraz, Hubert Pham, Dustin Tran, Ivan Vendrov, Craig Boutilier. RecSys 2020. paper
    (Preprint) RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems. arxiv 2021. paper code
    A Simulator
    A follow-up to RecSim (Google)

2021

  1. Hierarchical Reinforcement Learning for Integrated Recommendation. Ruobing Xie, Shaoliang Zhang, Rui Wang, Feng Xia, Leyu Lin. AAAI 2021. paper
    WeChat
    Based on MaHRL: high-level agent and low-level agent
    Hierarchical

  2. DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems. Xiangyu Zhao, Changsheng Gu, Haoshenglun Zhang, Xiwang Yang, Xiaobing Liu, Jiliang Tang, Hui Liu. AAAI 2021. paper

  3. Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation. Kai Wang, Zhene Zou, Qilin Deng, Jianrong Tao, Runze Wu, Changjie Fan, Liang Chen, Peng Cui. AAAI 2021. paper

  4. A General Offline Reinforcement Learning Framework for Interactive Recommendation. Teng Xiao, Donglin Wang. AAAI 2021. paper

  5. Cost-Effective and Interpretable Job Skill Recommendation with Deep Reinforcement Learning. Ying Sun, Fuzhen Zhuang, Hengshu Zhu, Qing He, Hui Xiong. WWW 2021. paper video
    Hui Xiong
    Multi-task RL for Rec

  6. User Response Models to Improve a REINFORCE Recommender System. Minmin Chen, Bo Chang, Can Xu, Ed Chi. WSDM 2021. paper

  7. Unified Conversational Recommendation Policy Learning via Graph-based Reinforcement Learning. Yang Deng, Yaliang Li, Fei Sun, Bolin Ding and Wai Lam. SIGIR 2021. paper

  8. Policy-Gradient Training of Fair and Unbiased Ranking Functions. Himank Yadav, Zhengxiao Du and Thorsten Joachims. SIGIR 2021. paper

  9. Counterfactual Reward Modification for Streaming Recommendation with Delayed Feedback. Xiao Zhang, Haonan Jia, Hanjing Su, Wenhan Wang, Jun Xu and Ji-Rong Wen. SIGIR 2021. paper

  10. Underestimation Refinement: A General Enhancement Strategy for Exploration in Recommendation Systems. Yuhai Song, Lu Wang, Haoming Dang, Weiwei Zhou, Jing Guan, Xiwei Zhao, Changping Peng, Yongjun Bao, Jingping Shao. SIGIR 2021. paper

  11. (Short Paper) RLNF: Reinforcement Learning based Noise Filtering for Click-Through Rate Prediction. Pu Zhao, Chuan Luo, Cheng Zhou, Bo Qiao, Jiale He, Liangjie Zhang and Qingwei Lin. SIGIR 2021.

  12. (Short Paper) De-Biased Modeling of Search Click Behavior with Reinforcement Learning. Jianghong Zhou, Sayyed Zahiri, Simon Hughes, Surya Kallumadi, Khalifeh Al Jadda and Eugene Agichtein. SIGIR 2021.

Preprint Papers

  1. Deep Reinforcement Learning in Large Discrete Action Spaces. Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin. arxiv 2015. paper code

  2. Reinforcement Learning based Recommender System using Biclustering Technique. Sungwoon Choi, Heonseok Ha, Uiwon Hwang, Chanju Kim, Jung-Woo Ha, Sungroh Yoon. arxiv 2018. paper

  3. Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling. Feng Liu, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, Yuzhou Zhang. arxiv 2018. paper

  4. Model-Based Reinforcement Learning for Whole-Chain Recommendations. Xiangyu Zhao, Long Xia, Yihong Zhao, Dawei Yin, Jiliang Tang. arxiv 2019. paper

  5. RecSim: A Configurable Simulation Platform for Recommender Systems. Eugene Ie, Chih-wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, Craig Boutilier. arxiv 2019. paper code
    A simulator
    Google

  6. Toward simulating environments in reinforcement learning based recommendations (Simulating User Feedback for Reinforcement Learning Based Recommendations). Xiangyu Zhao, Long Xia, Lixin Zou, Dawei Yin, Jiliang Tang. arxiv 2019. paper
    A simulator
    Rejected by AAAI 2020

  7. Measuring Recommender System Effects with Simulated Users. Sirui Yao, Yoni Halpern, Nithum Thain, Xuezhi Wang, Kang Lee, Flavien Prost, Ed H. Chi, Jilin Chen, Alex Beutel. arxiv 2021. paper
    Google

RL Papers

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning https://arxiv.org/pdf/2105.08140.pdf

Bayesian Q-learning https://www.aaai.org/Papers/AAAI/1998/AAAI98-108.pdf

Accepted Paper Lists of Top Conferences

NeurIPS Proceedings
https://papers.nips.cc/

KDD 2019
https://dblp.org/db/conf/kdd/kdd2019.html

KDD 2020
https://www.kdd.org/kdd2020/accepted-papers

KDD 2021
https://kdd.org/kdd2021/accepted-papers/index

SIGIR 2020
https://sigir.org/sigir2020/accepted-papers/

SIGIR 2021
https://sigir.org/sigir2021/accepted-papers/

AAAI 2021 https://aaai.org/Conferences/AAAI-21/wp-content/uploads/2020/12/AAAI-21_Accepted-Paper-List.Main_.Technical.Track_.pdf

WWW 2021
https://www2021.thewebconf.org/program/papers/

ICML 2020 https://icml.cc/Conferences/2020/Schedule?type=Poster

ICML 2021 https://icml.cc/Conferences/2021/Schedule?type=Poster

RecSys 2020
https://recsys.acm.org/recsys20/accepted-contributions/
