aamas aamas2020 double-q-learning neuroscience-inspired-ai psychiatry pytorch q-learning reinforcement-learning

mentalRL

(image credit to HBR)

Code for our AAMAS 2020 paper:

"A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry"

by Baihan Lin (Columbia)*, Guillermo Cecchi (IBM Research), Djallel Bouneffouf (IBM Research), Jenna Reinen (IBM Research) and Irina Rish (Mila, UdeM).

*Corresponding author

For the latest full paper: https://arxiv.org/abs/1906.11286

For my oral talk at AAMAS 2020: https://youtu.be/CQBdQz1bmls

All the experimental results can be reproduced using the code in this repository. Feel free to contact me at doerlbh@gmail.com if you have any questions about our work.

Abstract

Drawing inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework for reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and can incorporate a wide range of reward-processing biases -- an important component of human decision making which can help us better understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems, as well as various neuropsychiatric conditions associated with disruptions in normal reward processing. From the computational perspective, we observe that the proposed Split-QL model and its clinically inspired variants consistently outperform standard Q-Learning and SARSA methods, as well as recently proposed Double Q-Learning approaches, on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the Pac-Man game in a lifelong learning setting across different reward stationarities.

Info

Language: Python3, Python2, bash

Platform: macOS, Linux, Windows

by Baihan Lin, Sep 2018

Citation

If you find this work helpful, please try the models out and cite our work. Thanks!

Reinforcement Learning case (main paper):

@inproceedings{lin2020astory,
  title={A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry},
  author={Lin, Baihan and Cecchi, Guillermo and Bouneffouf, Djallel and Reinen, Jenna and Rish, Irina},
  booktitle = {Proceedings of the Nineteenth International Conference on Autonomous Agents and Multi-Agent Systems, {AAMAS-20}},
  publisher = {International Foundation for Autonomous Agents and Multiagent Systems},             
  pages     = {744--752},
  year      = {2020},
  month     = {5},
  doi       = {},
  url       = {},
}


@inproceedings{lin2019split,
  title     = {Split Q Learning: Reinforcement Learning with Two-Stream Rewards},
  author    = {Lin, Baihan and Bouneffouf, Djallel and Cecchi, Guillermo},
  booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on
               Artificial Intelligence, {IJCAI-19}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},             
  pages     = {6448--6449},
  year      = {2019},
  month     = {7},
}

Contextual Bandit case:

@article{lin2020unified,
  title={Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits, and RL},
  author={Lin, Baihan and Cecchi, Guillermo and Bouneffouf, Djallel and Reinen, Jenna and Rish, Irina},
  journal={arXiv preprint arXiv:2005.04544},
  year={2020}
}

Tasks

Markov Decision Process (MDP) example with multi-modal reward distributions
Multi-Armed Bandits (MAB) example with multi-modal reward distributions
Iowa Gambling Task (IGT) example scheme 1 and 2
PacMan RL game with different stationarities
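
To give a feel for what a "multi-modal reward distribution" in the MDP and MAB tasks might look like, here is a minimal sketch that samples rewards from a two-component Gaussian mixture and splits them into positive and negative parts. The function name `sample_bimodal_reward` and all parameter values are illustrative assumptions, not the settings used in the paper or in this repository's notebooks.

```python
import numpy as np


def sample_bimodal_reward(rng, means=(-5.0, 5.0), stds=(1.0, 1.0), p_first=0.5):
    """Draw one reward from a two-component Gaussian mixture.

    All defaults are hypothetical values chosen for illustration only.
    """
    idx = 0 if rng.random() < p_first else 1
    return rng.normal(means[idx], stds[idx])


rng = np.random.default_rng(0)
rewards = np.array([sample_bimodal_reward(rng) for _ in range(10_000)])

# Split each reward into a non-negative and a non-positive part,
# which is the kind of input a two-stream learner would consume.
pos_rewards = np.maximum(rewards, 0.0)
neg_rewards = np.minimum(rewards, 0.0)

print("mean reward:", rewards.mean())
print("fraction of positive rewards:", (rewards > 0).mean())
```

The mean of such a distribution can be close to zero even though individual rewards are strongly positive or negative, which is one reason to track the two streams separately.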

Requirements

Python 3 for the MDP and IGT tasks, and Python 2.7 for the PacMan task.
PyTorch
numpy and scikit-learn

Videos of mental agents playing PacMan

AD ("Alzheimer's Disease")

ADD ("addiction")

ADHD ("Attention-Deficit/Hyperactivity Disorder")

bvFTD ("behavioral variant of Frontotemporal Dementia")

CP ("Chronic Pain")

PD ("Parkinson's Disease")

M ("moderate")

SQL ("Split Q-Learning")

PQL ("Positive Q-Learning")

NQL ("Negative Q-Learning")

QL ("Q-Learning")

DQL ("Double Q-Learning")
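
As a rough illustration of the two-stream idea behind the SQL, PQL, and NQL agents listed above, the following is a minimal tabular sketch that keeps separate Q-tables for positive and negative rewards and acts on their sum. The class name `SplitQAgent`, the weights `w_pos` and `w_neg`, and the exact update rule are assumptions made for this example; see the paper and the repository notebooks for the authors' actual formulation and parameter settings.

```python
import numpy as np


class SplitQAgent:
    """Tabular two-stream Q-learning sketch (hypothetical names and defaults)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 w_pos=1.0, w_neg=1.0, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.w_pos = w_pos      # assumed weight on the positive reward stream
        self.w_neg = w_neg      # assumed weight on the negative reward stream
        self.q_pos = np.zeros((n_states, n_actions))
        self.q_neg = np.zeros((n_states, n_actions))
        self.rng = np.random.default_rng(seed)

    def act(self, state, epsilon=0.1):
        """Epsilon-greedy action on the sum of the two streams."""
        if self.rng.random() < epsilon:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax(self.q_pos[state] + self.q_neg[state]))

    def update(self, s, a, r, s_next, done=False):
        """Split the reward by sign and update each stream separately."""
        r_pos, r_neg = max(r, 0.0), min(r, 0.0)
        streams = ((self.q_pos, r_pos, self.w_pos),
                   (self.q_neg, r_neg, self.w_neg))
        for q, r_part, w in streams:
            bootstrap = 0.0 if done else self.gamma * q[s_next].max()
            q[s, a] += self.alpha * (w * r_part + bootstrap - q[s, a])


# Toy one-state task: action 0 yields +1, action 1 yields -1.
agent = SplitQAgent(n_states=1, n_actions=2)
for _ in range(200):
    for action in (0, 1):
        reward = 1.0 if action == 0 else -1.0
        agent.update(0, action, reward, 0, done=True)

print("Q+ :", agent.q_pos)
print("Q- :", agent.q_neg)
print("greedy action:", agent.act(0, epsilon=0.0))
```

In this sketch, setting `w_neg=0.0` yields an agent that learns only from positive rewards and `w_pos=0.0` one that learns only from negative rewards, which is loosely in the spirit of the PQL and NQL names; that mapping is an assumption here, not a statement of how the repository defines them.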

Acknowledgements

The PacMan game is built upon the Berkeley AI Pac-Man projects (http://ai.berkeley.edu/project_overview.html). We modified many of the original files and included our comparison.

Languages

Jupyter Notebook 98.3%, Python 1.7%