self-play

There are 36 repositories under the self-play topic.

suragnair / alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
alpha-zero alphago alphago-zero alphazero deep-learning gobang gomoku keras mcts monte-carlo-tree-search neural-network othello pytorch reinforcement-learning self-play tensorflow tf
Language:Jupyter Notebook 3831
opendilab / DI-engine
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
reinforcement-learning multiagent-reinforcement-learning self-play imitation-learning inverse-reinforcement-learning exploration-exploitation distributed-system python impala smac atari mujoco minigrid r2d2 reinforcement-learning-algorithms pytorch-rl offline-rl drl distributed-reinforcement-learning model-based-reinforcement-learning
Language:Python 2973
opendilab / DI-star
An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.
reinforcment-learning starcraft2 self-play artificial-intelligence deep-learning league deep-reinforcement-learning
Language:Python 1215
opendilab / LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
alpha-beta-pruning alphazero atari board-game board-games continuous-control efficientzero gomoku gumbel-muzero gym mcts mcts-algorithm monte-carlo-tree-search muzero pytorch reinforcement-learning sampled-muzero self-play stochastic-muzero tictactoe
Language:Python 1050
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
deep-learning fine-tuning large-language-models self-play
Language:Python 966
uclaml / SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
deep-learning fine-tuning large-language-models rlhf self-play
Language:Python 462
inspirai / TimeChamber
A Massively Parallel Large Scale Self-Play Framework
deep-reinforcement-learning isaac-gym reinforcement-learning self-play multi-agent
Language:Python 200
ChuaCheowHuan / gym-continuousDoubleAuction
A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.
multi-agent-reinforcement-learning gym-environment limit-order-book high-frequency-trading ray rllib market-microstructure financial-engineering self-play double-auction ppo zero-sum-games zero-sum quantitative-finance quantitative-trading marl n-player lstm
Language:Jupyter Notebook 138
blanyal / alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
alphazero alpha-zero alphago-zero tensorflow reinforcement-learning mcts tictactoe self-play game deep-learning machine-learning resnet connect-four connect4 othello reversi tic-tac-toe deepmind
Language:Python 87
Naton1 / osrs-pvp-reinforcement-learning
Train a neural network to PvP in Old School RuneScape using reinforcement learning.
artificial-intelligence deep-learning gym java machine-learning oldschool-runescape osrs ppo python pytorch reinforcement-learning rsps runescape self-play
Language:Java 62
seungeunrho / football-paris
The exact code used by the team "liveinparis" in the Kaggle football competition, ranked 6th of 1141
gfootball self-play reinforcement-learning pytorch ppo liveinparis kaggle
Language:Python 58
dellalibera / td-gammon
TD-Gammon implementation
backgammon artificial-intelligence temporal-differencing-learning reinforcement-learning neural-network pytorch convolutional-neural-networks self-play value-function game
Language:Python 42
dellalibera / gym-backgammon
Backgammon OpenAI Gym
gym gym-env reinforcement-learning self-play game td-gammon openai-gym artificial-intelligence openai-gym-environment backgammon-game gym-backgammon backgammon td-learning temporal-differencing-learning
Language:Python 39
ShibiHe / Model-Free-Episodic-Control
An implementation of the paper "Model-Free Episodic Control"
openai-gym dqn-ep deep knn numpy self-play game-theory fictitious
Language:Python 37
cestpasphoto / alpha-zero-general
A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available
alphago alphago-zero alphazero machikoro minivilles python pytorch reinforcement-learning santorini santorini-game splendor the-little-prince numba self-play
Language:Python 35
tobiasemrich / SchafkopfRL
AI agents for the Bavarian card game Schafkopf trained with reinforcement learning
schafkopf card-game ppo reinforcement-learning self-play pytorch imperfect-information-game
Language:Python 35
Sebastian-Schuchmann / Self-Play-TicTacToe-AI-ML-Agents-
A self-play reinforcement learning agent that learns to play TicTacToe using the ML-Agents framework in Unity.
artificial-intelligence machine-learning reinforcement-learning ml-agents unity unity3d self-play neural-network tensorflow
Language:C# 34
sirmammingtonham / alphastone
Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player
alpha-zero self-play monte-carlo-tree-search ismcts deep-learning deep-reinforcement-learning pytorch hearthstone ai
Language:Python 29
cmubig / sorts
Code base for Social Robot Tree Search (SoRTS).
intent-prediction mcts self-play social-navigation
Language:Python 23
af1tang / convogym
A gym environment to train chatbots.
dialog-systems chatbot-platform reinforcement-learning machine-learning nlp natural-language-processing pytorch natural-language-generation active-learning self-play convogym
Language:Python 21
mbaske / ml-selfplay-fighter
Self-Play Boxing Match made with Unity Machine Learning Agents
unity ml-agents self-play
Language:C# 21
backpropper / s2p
Code repository for "On the interaction between supervision and self-play in emergent communication" (ICLR 2020)
emergent-communication self-play
Language:Python 16
peldszus / alpha-zero-general-lib
An implementation of the AlphaZero algorithm for adversarial games to be used with the machine learning framework of your choice
alphazero alpha-zero reinforcement-learning self-play mcts monte-carlo-tree-search othello ray
Language:Python 12
AutumnCrocus / shadow_sim
Emulator and AI of Shadowverse
shadowverse machine-learning simulator emulator ai cardgame dcg self-play deep deep-learning imitation-learning
Language:Python 11
Jackory / RPBT
Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)
competition multi-agent-reinforcement-learning population-based-training ppo reinforcment-learning risk-sensitive-preferences self-play
Language:Python 10
ChuaCheowHuan / PBT_MARL_watered_down
My attempt to reproduce a watered-down version of PBT (population-based training) for MARL (multi-agent reinforcement learning) using DDPPO (decentralized & distributed proximal policy optimization) from ray[rllib].
pbt marl self-play population-based-training multi-agent-reinforcement-learning ray rllib pbt-marl ddppo
Language:Jupyter Notebook 8
OneUpWallStreet / TD-Gammon
Python implementation of the TD-Gammon algorithm developed by Gerald Tesauro at IBM's Thomas J. Watson Research Center.
machine-learning artificial-intelligence deep-learning nueral-networks python pytorch numpy reinforcement-learning backgammon self-play td-gammon
Language:Python 7
jianzhnie / RLZero
A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.
alpha-zero mcts muzero reinforcement-learning self-play multi-agent
Language:Python 6
neoyung / connect-4
A reinforcement learning agent trained without prior human knowledge
reinforcement-learning deep-q-network alphago-zero experience-replay self-play
Language:Jupyter Notebook 6
e-dong / space-war-rl
Recreating Bill Seiler's 1985 version of Space War and training RL agents with Self-Play
ai machine-learning pygame reinforcement-learning self-play pygame-wasm
Language:Python 5
navreeetkaur / AlphaGoZero
Implementation of AlphaGo Zero - Reinforcement Learning Project, COL870 @iit-delhi
reinforcement-learning alphago-zero alphago gogame mcts monte-carlo-tree-search resnet deep-reinforcement-learning deep-learning artificial-intelligence game-playing-agent self-play
Language:Python 5
novoselov-ab / ai-zero
Implementation of the AlphaGo Zero paper in one C++ header file without any dependencies
deep-learning deep-neural-networks alphago alphago-zero reinforcement-learning mcts mcts-implementations cpp mnist-nn convolutional-neural-networks self-play machine-learning
Language:C++ 5
cedrickchee / baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
reinforcement-learning openai-gym ppo deep-learning dota2-bot policy self-play openai-five machine-learning-engineering proximal-policy-optimization policy-gradient trpo algorithms
Language:Python 4
Galtvam / OthelloZero
A Smart Agent using reinforcement learning with CNN + MCTS to learn to play Othello/Reversi
reinforcement-learning mcts reversi-game othello-game self-play cnn keras
Language:Python 4
Kajune / KODOKU
Multi-agent Self-Play Reinforcement Learning Library
reinforcement-learning rllib self-play pytorch tensorflow multi-agent-reinforcement-learning
Language:Python 3
TARTRL / TARTRL
A distributed reinforcement learning framework based on PyTorch
pytorch distributed-training reinforcement-learning game-ai multi-agent-reinforcement-learning ppo robotics self-play
Language:Python 3