A collection of my experiments with reinforcement learning and related topics. See more on my blog.
- Q Learning
- Deep Q Learning
- REINFORCE
- REINFORCE with Baseline
- Actor-Critic TD(0)
- Actor-Critic Forward-view TD(λ)
- Actor-Critic Backward-view TD(λ)
- Off-Policy Actor Critic
- Compatible Off-Policy Deterministic Actor-Critic with Q Critic
- Deep Deterministic Policy Gradient
- Proximal Policy Optimization
- Deep Reinforcement Learning that Matters
- Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO
- What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
- 8Queens_GA: Genetic Algorithm: 8 Queens Problem
- QLearning_DQN: A First Look at Reinforcement Learning
- Atari_DQN: Reinforcement Learning: Deep Q-Learning with Atari Games
- REINFORCE: Reinforcement Learning: An Introduction to Policy Gradients
- REINFORCE-Continuous: Policy Parameterization for a Continuous Action Space
- REINFORCE-Baseline: Policy Gradients: REINFORCE with Baseline
- Off-Policy_Policy_Gradient: Actor-Critic: Off-Policy Actor-Critic Algorithm
- Actor-Critic: Value Function Approximations
- Actor-Critic_TD_0: Actor-Critic: Implementing Actor-Critic Methods
- Actor-Critic_TD_Lambda_Forward: Actor-Critic: Implementing Actor-Critic Methods
- Actor-Critic_TD_Lambda_Backward: Actor-Critic: Implementing Actor-Critic Methods
- Off-Policy_Actor-Critic: Actor-Critic: Off-Policy Actor-Critic Algorithm
- PPO_Discrete: Policy Optimizations: TRPO/PPO
- ROMs of the Atari games I've used in my code. Note that with the latest version of OpenAI's gym, you need to import the ROMs manually to run Atari environments.
- An implementation of CartPole with continuous action space by iandanforth