boltzmann-exploration

There are 5 repositories under the boltzmann-exploration topic.

lkwbr / grid-qlearn
Watch a program learn the best actions in a grid world to reach the target cell, and even run through the grid in real time! This is a Q-learning implementation for a 2-D grid world using both epsilon-greedy and Boltzmann exploration policies.
boltzmann-exploration epsilon-greedy grid-world machine-learning python reinforcement-learning
Language:Python 6
lucadivit / Reinforcement_Learning_Maze_Solver
This repository contains a simple OpenAI Gym maze environment and (for now) an RL algorithm to solve it.
boltzmann-exploration epsilon-decay epsilon-greedy machine-learning maze maze-enviroment maze-generator maze-solver openai-gym openai-gym-environment policy q-learning reinforcement-learning rl-algorithm sarsa sarsa-algorithm sarsa-learning tabular-q-learning
Language:Python 3
lucadivit / Adversarial_RL_TicTacToe
adversarial-machine-learning adversarial-reinforcement-learning boltzmann-exploration epsilon-greedy q-learning q-learning-vs-sarsa qlearning reinforcement-learning reinforcement-learning-algorithms reward rewards sarsa sarsa-learning tic-tac-toe tictactoe tictactoe-game
Language:Python 1
mokeddembillel / Lunar-Lander-Deep-Expected-Sarsa
Uses Deep Expected SARSA with TensorFlow to solve the Lunar Lander problem, with hyperparameter tuning and results analysis.
reinforcement-learning expected-sarsa tensorflow2 epsilon-greedy boltzmann-exploration hyperparameter-tuning lunar-lander softmax-exploration
Language:Jupyter Notebook 1
OrestisMk / RF-Q_learning-taxi_driver--Lunanlander-Policy-gradient-
This is a reinforcement learning project covering two environments. The first is the taxi driver problem in a 4x4 space, solved with the simple Q-learning update rule; in this task, the performance of the ε-greedy policy is compared with that of the Boltzmann policy. The second environment is LunarLander from OpenAI Gym, for which a policy-gradient method was selected.
reinforcement-learning q-learning e-greedy boltzmann-exploration policy-gradient taxi-driver lunarlander-v2
0
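
Several of the repositories above compare ε-greedy and Boltzmann (softmax) action selection under a tabular Q-learning update. The sketch below illustrates that general idea only; it is not taken from any listed repository, and the function names, the `temperature` parameter, and the default values for `alpha` and `gamma` are assumptions chosen for illustration.

```python
import math
import random


def boltzmann_probs(q_values, temperature=1.0):
    """Softmax over Q-values; a lower temperature gives a greedier policy."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def epsilon_greedy_action(q_values, epsilon=0.1, rng=random):
    """With probability epsilon pick uniformly at random, else pick the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped target."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
```

The practical difference is in how exploration is spread: ε-greedy treats all non-greedy actions alike, while Boltzmann exploration weights them by their estimated value, so clearly bad actions are tried less often as the temperature drops.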
