Deep Learning Türkiye - Reinforcement Learning Project
This repository consists projects from Deep Learning Türkiye - Reinforcement Learning Group. Enter folders to see each project's details.
1. Introduction To RL
Simple tic tac toe example. Learns via Value Function at the moment. Policy Search TODO. Benefited from tansey.
2. Multi-Armed Bandits
Provides the underlying testbed for bandit problem.
3. Finite Markov Decision Processes
Uses the OpenAI Gym. Learns via Q-Learning.
4. Temporal Difference
Multiple approaches to CartPole problem. Benefited from dennybritz.
Library usage
You can find example usage below.
import gym
from lib import q_learning_agent, double_q_learning_agent, sarsa_learning_agent
env = gym.make("FrozenLake-v0")
env.reset()
def train(agent):
for i_episode in range(1000):
state = env.reset()
while True:
action = agent.select_action(state)
next_state, reward, done, _ = env.step(action)
agent.learn(action, reward, state, next_state)
if done:
break
state = next_state
qla = q_learning_agent(epsilon=0.3, discount_factor=0.9, alpha=0.5, action_space=env.action_space.n)
sla = sarsa_learning_agent(epsilon=0.3, discount_factor=0.9, alpha=0.5, action_space=env.action_space.n)
dqla = double_q_learning_agent(epsilon=0.3, discount_factor=0.9, alpha=0.5, action_space=env.action_space.n)
train(qla)
train(sla)
train(dqla)