AntonioAlgaida / Playground

In this repository I will try different algorithms and play with them.

Playground

So far I have been experimenting with Stable-Baselines3 on the LunarLander-v2 environment.

Training with the PPO algorithm for 2e6 timesteps gave an average reward of 270.
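
For reference, a run like the one above could be set up roughly as follows. This is a minimal sketch, not the notebook's actual code: it assumes the default `MlpPolicy` with Stable-Baselines3's default PPO hyperparameters, and the names `ENV_ID`, `TOTAL_TIMESTEPS`, `N_EVAL_EPISODES`, and `train_and_evaluate` are made up for illustration. The environment needs the Box2D extra installed, and newer Gymnasium releases may register the environment under a different version id.

```python
# Sketch of a PPO training run on LunarLander with Stable-Baselines3.
# Assumptions (not taken from the notebook): default PPO hyperparameters,
# "MlpPolicy", and evaluation over N_EVAL_EPISODES episodes.

ENV_ID = "LunarLander-v2"      # environment named in this README
TOTAL_TIMESTEPS = 2_000_000    # 2e6 timesteps, as in the README
N_EVAL_EPISODES = 20           # hypothetical choice for the evaluation


def train_and_evaluate(env_id=ENV_ID,
                       total_timesteps=TOTAL_TIMESTEPS,
                       n_eval_episodes=N_EVAL_EPISODES):
    """Train PPO on env_id and return (mean_reward, std_reward)."""
    # Imported here so the file can be loaded without the RL dependencies.
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    # Passing the id as a string lets Stable-Baselines3 create the environment.
    model = PPO("MlpPolicy", env_id, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # Evaluate the trained policy on the same (vectorized) environment.
    mean_reward, std_reward = evaluate_policy(
        model, model.get_env(), n_eval_episodes=n_eval_episodes
    )
    return mean_reward, std_reward
```

Calling `train_and_evaluate()` starts the full training run, which can take a while on a CPU. The resulting mean reward will vary between runs and seeds.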

MIT License
