krajit / gym_solutions

Collection of Python code that solves the Gymnasium Reinforcement Learning environments, along with YouTube tutorials.

Home Page: https://www.youtube.com/playlist?list=PL58zEckBH8fCt_lYkmayZoR9XfDCW9hte

Gymnasium (Deep) Reinforcement Learning Tutorials

Collection of Python code that solves/trains Reinforcement Learning environments from the Gymnasium Library, formerly OpenAI’s Gym library. Each solution has a companion video explanation and code walkthrough from my YouTube channel @johnnycode. If the code and video helped you, please consider:
Buy Me A Coffee

Installation

The Gymnasium Library is supported on Linux and Mac OS, but not officially on Windows. On Windows, the Box2D package (Bipedal Walker, Car Racing, Lunar Lander) is problematic during installation; you may see errors such as:

  • ERROR: Failed building wheels for box2d-py
  • ERROR: Command swig.exe failed
  • ERROR: Microsoft Visual C++ 14.0 or greater is required.

My Gymnasium on Windows installation video shows you how to resolve these errors and successfully install the complete set of Gymnasium Reinforcement Learning environments.

YouTube Tutorial:

Install Gymnasium on Windows

Beginner Reinforcement Learning Tutorials

Q-Learning - Frozen Lake 8x8

This is the recommended starting point for beginners. This Q-Learning tutorial walks through the code for solving the FrozenLake-v1 8x8 map. The Frozen Lake environment is very simple and straightforward, allowing us to focus on how Q-Learning works. The Epsilon-Greedy algorithm is also used in conjunction with Q-Learning. Note that this tutorial does not explain the theory or math behind Q-Learning.
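
For orientation, here is a minimal sketch of the Q-Learning loop on FrozenLake-v1 8x8. The hyperparameter values (learning rate, discount factor, epsilon schedule, episode count) are illustrative assumptions and may differ from the values used in the tutorial's script.

```python
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
q = np.zeros((env.observation_space.n, env.action_space.n))  # Q-table: 64 states x 4 actions

alpha, gamma = 0.9, 0.9               # learning rate and discount factor (assumed values)
epsilon, epsilon_decay = 1.0, 0.0001  # epsilon-greedy schedule (assumed values)
rng = np.random.default_rng()

for episode in range(15000):
    state, _ = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q[state]))

        new_state, reward, terminated, truncated, _ = env.step(action)

        # Q-Learning update toward the bootstrapped target.
        q[state, action] += alpha * (reward + gamma * np.max(q[new_state]) - q[state, action])
        state = new_state

    epsilon = max(epsilon - epsilon_decay, 0)

env.close()
```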

Code Reference:
YouTube Tutorial:

Solve FrozenLake-v1 8x8 with Q-Learning

Q-Learning - Frozen Lake 8x8 Enhanced

This is the FrozenLake-v1 environment "enhanced" to help you better understand Q-Learning. Features:

  • The Q values are overlaid on top of each cell of the map, so you can visually watch the Q values update in real time while training!
  • The map is enlarged to fill the whole screen so that it is easier to read the overlaid Q values.
  • Shortcut keys to speed up or slow down the animation.
Code Reference:
  • frozen_lake_qe.py
    This file is almost identical to the frozen_lake_q.py file above, except that it uses the frozen_lake_enhanced.py environment.
  • frozen_lake_enhanced.py
    This is the FrozenLake-v1 environment overlaid with Q values. You do not need to understand this code, but feel free to check how I modified the environment.
YouTube Tutorial:

See Q-Learning in Realtime on FrozenLake-v1

Q-Learning - Mountain Car

This Q-Learning tutorial solves the MountainCar-v0 environment. It builds upon the code from the Frozen Lake environment. What is interesting about this environment is that the observation space is continuous, whereas the Frozen Lake environment's observation space is discrete. "Discrete" means that the agent, the elf in Frozen Lake, steps from one cell on the grid to the next, so there is a clear distinction that the agent is going from one state to another. "Continuous" means that the agent, the car in Mountain Car, traverses the mountain on a continuous road, with no clear distinction of states.
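
Because the observation (position and velocity) is continuous, a tabular Q-Learning agent first has to bucket it into discrete bins before it can index a Q-table. The sketch below shows one common way to do that with np.digitize; the bin counts are assumptions, not necessarily the values used in the tutorial's script.

```python
import gymnasium as gym
import numpy as np

env = gym.make("MountainCar-v0")

# Split the continuous position and velocity ranges into 20 bins each (assumed bin count).
pos_bins = np.linspace(env.observation_space.low[0], env.observation_space.high[0], 20)
vel_bins = np.linspace(env.observation_space.low[1], env.observation_space.high[1], 20)

def discretize(obs):
    """Map a continuous (position, velocity) observation to a pair of bin indices."""
    return np.digitize(obs[0], pos_bins), np.digitize(obs[1], vel_bins)

# Q-table indexed by (position bin, velocity bin, action).
q = np.zeros((len(pos_bins) + 1, len(vel_bins) + 1, env.action_space.n))

state, _ = env.reset()
p, v = discretize(state)
action = int(np.argmax(q[p, v]))   # greedy action for the discretized state
```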

Code Reference:
YouTube Tutorial:

Solve MountainCar-v0 with Q-Learning

Q-Learning - Cart Pole

This Q-Learning tutorial solves the CartPole-v1 environment. It builds upon the code from the Frozen Lake environment. Like Mountain Car, the Cart Pole environment's observation space is also continuous. However, it has a more complicated observation space, including the cart's position and velocity, as well as the pole's angle and angular velocity.
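
As with Mountain Car, the four continuous observation values have to be discretized before a Q-table can be used. A rough sketch, with assumed bin counts and ranges (the velocity terms are technically unbounded, so practical limits are used):

```python
import gymnasium as gym
import numpy as np

env = gym.make("CartPole-v1")

# 10 bins per dimension (assumed).
pos_bins  = np.linspace(-2.4, 2.4, 10)        # cart position
vel_bins  = np.linspace(-4.0, 4.0, 10)        # cart velocity (clipped to a practical range)
ang_bins  = np.linspace(-0.2095, 0.2095, 10)  # pole angle (radians)
angv_bins = np.linspace(-4.0, 4.0, 10)        # pole angular velocity (clipped)

def discretize(obs):
    """Map the 4-dimensional continuous observation to a tuple of bin indices."""
    return (np.digitize(obs[0], pos_bins),
            np.digitize(obs[1], vel_bins),
            np.digitize(obs[2], ang_bins),
            np.digitize(obs[3], angv_bins))

# Q-table: one axis per discretized observation dimension, plus the action axis.
q = np.zeros((11, 11, 11, 11, env.action_space.n))

state, _ = env.reset()
action = int(np.argmax(q[discretize(state)]))  # greedy action for the discretized state
```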

Code Reference:
YouTube Tutorial:

Solve CartPole-v1 with Q-Learning

Stable Baselines3

This Stable Baselines3 tutorial solves the Humanoid-v4 MuJoCo environment with the Soft Actor-Critic (SAC) algorithm. The focus is on the usage of the Stable Baselines3 library rather than the SAC algorithm. Other algorithms used in the demo include Twin Delayed Deep Deterministic Policy Gradient (TD3) and Advantage Actor Critic (A2C).
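
The core Stable Baselines3 workflow is only a few lines; a minimal sketch is shown below. The timestep count and save path are illustrative, and the tutorial's script adds extras such as periodic checkpointing and logging.

```python
import gymnasium as gym
from stable_baselines3 import SAC   # TD3 and A2C are imported and used the same way

env = gym.make("Humanoid-v4")

model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000_000)   # illustrative training budget
model.save("sac_humanoid")               # hypothetical save path

# Run the trained policy.
obs, _ = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
env.close()
```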

Code Reference:
Dependency:
YouTube Tutorial:

Solve Humanoid-v4 with Stable Baselines3

Deep Reinforcement Learning Tutorials

Deep Q-Learning (DQL) Explained - Part 1

This Deep Reinforcement Learning tutorial explains how the Deep Q-Learning (DQL) algorithm uses two neural networks, a Policy Deep Q-Network (DQN) and a Target DQN, to train an agent on the FrozenLake-v1 4x4 environment. The Frozen Lake environment is very simple and straightforward, allowing us to focus on how DQL works. The Epsilon-Greedy algorithm and the Experience Replay technique are also used as part of DQL to help train the learning agent. The code referenced here is also walked through in the YouTube tutorial. PyTorch is used to build the DQNs.
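
To make the moving parts concrete, here is a simplified PyTorch sketch of the pieces the tutorial builds: a Policy DQN, a Target DQN that starts as a copy of it, an experience replay buffer, and the optimization step that uses the Target DQN for the bootstrapped targets. Layer sizes, hyperparameters, and names are illustrative assumptions and may differ from the tutorial's script.

```python
import random
from collections import deque

import torch
import torch.nn as nn

class DQN(nn.Module):
    """One hidden layer plus an output layer, as described in Part 1."""
    def __init__(self, in_states, hidden, out_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_states, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_actions),
        )

    def forward(self, x):
        return self.net(x)

num_states, num_actions = 16, 4                 # FrozenLake-v1 4x4: 16 states, 4 actions
policy_dqn = DQN(num_states, 16, num_actions)
target_dqn = DQN(num_states, 16, num_actions)
target_dqn.load_state_dict(policy_dqn.state_dict())   # target starts as a copy of the policy

optimizer = torch.optim.Adam(policy_dqn.parameters(), lr=0.001)
loss_fn = nn.MSELoss()
replay_memory = deque(maxlen=1000)  # stores (state, action, new_state, reward, terminated)

def one_hot(state):
    """FrozenLake states are integers 0..15; encode them as one-hot vectors for the network."""
    x = torch.zeros(num_states)
    x[state] = 1.0
    return x

def optimize(batch, gamma=0.9):
    """One gradient step on a mini-batch sampled from replay memory."""
    current_q, target_q = [], []
    for state, action, new_state, reward, terminated in batch:
        with torch.no_grad():
            if terminated:
                target = torch.tensor(float(reward))
            else:
                # The Target DQN provides the bootstrapped value of the next state.
                target = reward + gamma * target_dqn(one_hot(new_state)).max()
        current_q.append(policy_dqn(one_hot(state))[action])
        target_q.append(target)
    loss = loss_fn(torch.stack(current_q), torch.stack(target_q))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# During training, transitions are appended to replay_memory and, every so often, a
# random mini-batch is optimized and the Target DQN is re-synced with the Policy DQN:
#     optimize(random.sample(replay_memory, 32))
#     target_dqn.load_state_dict(policy_dqn.state_dict())
```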

Code Reference:
Dependencies:
YouTube Tutorial:

Deep Q-Learning DQL/DQN Explained + Code Walkthru + Demo

DQL Explained - Part 2: Convolutional Neural Networks

In Part 1 (above), the Deep Q-Networks (DQN) used were straightforward neural networks with a hidden layer and an output layer. This network architecture works for simple environments. However, for complex environments such as Atari Pong, where the agent learns from the environment visually, we need to modify our DQNs with convolutional layers. We'll continue the explanation on the very simple FrozenLake-v1 4x4 environment; however, we'll modify the inputs so that they are treated as images.
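
A sketch of what the convolutional variant might look like, with the FrozenLake-v1 4x4 state encoded as a tiny one-channel "image" (a 4x4 grid with the agent's cell set to 1). The layer shapes and sizes here are assumptions for illustration, not necessarily those used in the tutorial's script.

```python
import torch
import torch.nn as nn

class ConvDQN(nn.Module):
    """DQN with a convolutional layer in front of the fully connected layers."""
    def __init__(self, num_actions=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=2, stride=1),  # 1x4x4 input -> 16x3x3 feature maps
            nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 3 * 3, 64),
            nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, x):
        return self.fc(self.conv(x))

def state_to_image(state, grid=4):
    """Encode integer state 0..15 as a (batch, channel, height, width) image tensor."""
    img = torch.zeros(1, 1, grid, grid)
    img[0, 0, state // grid, state % grid] = 1.0
    return img

dqn = ConvDQN()
q_values = dqn(state_to_image(5))   # Q values for the 4 FrozenLake actions
```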

Code Reference:
Dependencies:
YouTube Tutorial:

Deep Q-Learning with Convolutional Neural Networks

