LOLA_DiCE

Pytorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)

Quick results:

Results on IPD using DiCE

[lr_in=0.3, lr_out=0.2, lr_v=0.1, batch_size=128, len_rollout=150, use_baseline=True]

Results on IPD using DiCE and opponent modelling

[lr_in=0.3, lr_out=0.2, lr_v=0.1, batch_size=128, len_rollout=150, use_baseline=True] (It seems that 2 lookaheads is the most stable model with this set of hyperparameters)

Results on IPD using exact gradients

[lr_in=0.3, lr_out=0.2, batch_size=128, len_rollout=150]

Results on IPD using exact gradients and opponent modelling

[lr_in=0.3, lr_out=0.2, batch_size=128, len_rollout=150]

Authors version:

The authors of the paper have their own version (Tensorflow) available here: https://github.com/alshedivat/lola

About

Pytorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)

MIT License

Languages

Language:Python 100.0%