tjevgerres / rl-book-1

Source code for the book "Reinforcement Learning: Theory and Python Implementation"

Home Page: https://zhiqingxiao.github.io/rl-book

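Reinforcement Learning: Theory and Python Implementation (强化学习:原理与Python实现)

The world's first reinforcement learning tutorial book with accompanying TensorFlow 2 code

The first printed algorithms book with accompanying TensorFlow 2 code

Side-by-side TensorFlow 2 and PyTorch 1 code is now available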

Supporting materials in English can be found here.

Supporting content for the Chinese edition

Book

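See here for code, errata updates, etc.

Features of this book

This book introduces reinforcement learning theory and its Python implementation.

  • Complete theory: The whole book uses one consistent mathematical framework to present the theoretical foundations of reinforcement learning rigorously, with proofs given for the main theorems. The chapters build on one another step by step and cover all mainstream reinforcement learning algorithms, including non-deep algorithms such as eligibility traces and deep algorithms such as soft actor-critic.
  • Rich examples: The algorithms are implemented on your favorite operating system (Windows, macOS, or Linux) with Python 3.10, Gym 0.24, and TensorFlow 2 / PyTorch 1. The implementations follow a single consistent style and are small and lightweight. Chapters 1~9 provide accompanying implementations of the algorithms; the environments depend only on the minimal installation of Gym and run on computers without a GPU. Chapters 10~12 present several popular comprehensive case studies that cover the full installation of Gym and custom extensions, and run on a computer with an ordinary GPU.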

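Side-by-side TensorFlow 2 and PyTorch 1 code

The deep reinforcement learning part of this book now includes side-by-side implementations based on TensorFlow 2 and PyTorch 1. Both versions correspond strictly to the pseudocode in the main text. They differ only in the agent implementation; the program structure and agent parameters are identical. The .ipynb files are in the notebooks folder and the HTML pages are in the html folder; the two formats have the same content.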
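The "same program structure, different agent" arrangement described above can be pictured with a small sketch. Everything named here (`ToyEnv`, `RandomAgent`, `AlwaysRightAgent`, `play_episode`) is hypothetical and is not taken from the book's notebooks. The toy environment only imitates the Gym 0.24 convention in which `reset()` returns an observation and `step()` returns a 4-tuple, so the example runs without Gym, TensorFlow, or PyTorch.

```python
import random


class ToyEnv:
    """Hypothetical 1-D corridor: start at position 0, finish at `goal`.

    Imitates the Gym 0.24 convention: reset() returns an observation and
    step() returns (observation, reward, done, info).
    """

    def __init__(self, goal=3, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps

    def reset(self):
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left (never below 0).
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        self.steps += 1
        done = self.position >= self.goal or self.steps >= self.max_steps
        return self.position, -1.0, done, {}


class RandomAgent:
    """Stand-in for an agent written with one framework."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, observation, reward, done):
        return self.rng.choice([0, 1])


class AlwaysRightAgent:
    """Stand-in for an agent written with another framework (same interface)."""

    def step(self, observation, reward, done):
        return 1


def play_episode(env, agent):
    """Shared driver: unchanged no matter which agent is plugged in."""
    observation, reward, done = env.reset(), 0.0, False
    episode_reward, elapsed_steps = 0.0, 0
    while not done:
        action = agent.step(observation, reward, done)
        observation, reward, done, _ = env.step(action)
        episode_reward += reward
        elapsed_steps += 1
    return episode_reward, elapsed_steps
```

Because the driver only calls `agent.step(...)`, swapping one agent class for another leaves the rest of the program untouched, which is the property the side-by-side versions rely on.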

The code has been verified with Python 3.10, Gym 0.24, TensorFlow 2, and PyTorch 1. Please report any errors.

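QQ groups

  • QQ group: 722846914 (errata and bug reports may be posted here; for other questions, please search Google first; the group owner and administrators do not provide free consulting)
  • Multi-task group: 696984257 (not a beginners' group; covers multi-task reinforcement learning, meta reinforcement learning, lifelong reinforcement learning, and transfer reinforcement learning; do not post errata or bug reports here; please search Google before asking)
  • About the join-verification question: because of a bug in QQ, verification may fail even when the answer is entered correctly. Retrying on a different device, with a different input method, or on another day may solve the problem. If the answer contains English letters, please pay attention to case.
  • The QQ groups given in the preface of the Chinese edition (935702193, 243613392, and 948110103) are full and no longer accept new members. Thank you for your understanding.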

Reinforcement Learning: Theory and Python Implementation

The First Reinforcement Learning Tutorial Book with One-to-One Mapped TensorFlow 2 and PyTorch 1 Implementations

Supporting content for readers of the English version

Check here for code, exercise answers, etc.

Features

This is a tutorial book on reinforcement learning, with explanation of theory and Python implementation.

  • Theory: Starting from a unified mathematical framework, this book derives the theory and algorithms of reinforcement learning, covering all major algorithms, from eligibility traces to soft actor-critic.
  • Practice: Every chapter is accompanied by a high-quality implementation based on Python 3.10, Gym 0.24, and TensorFlow 2 / PyTorch 1. All code is compatible with Windows, Linux, and macOS, and can be run on a laptop.

Please email me if you are interested in publishing this book in other languages. The English version will be published by Springer Nature.

Table of Codes

All code is saved as .ipynb files in the directory "notebooks" and as .html files in the directory "html".

| Chapter | Environment & Closed-Form Policy | Agent |
|---|---|---|
| 2 | CliffWalking-v0 | Bellman |
| 3 | FrozenLake-v1 | DP |
| 4 | Blackjack-v1 | MC |
| 5 | Taxi-v3 | SARSA, ExpectedSARSA, QL, DoubleQL, SARSA(λ) |
| 6 | MountainCar-v0 | SARSA, SARSA(λ), DQN (tf, torch), DoubleDQN (tf, torch), DuelDQN (tf, torch) |
| 7 | CartPole-v0 | VPG (tf, torch), VPGwBaseline (tf, torch), OffPolicyVPG (tf, torch), OffPolicyVPGwBaseline (tf, torch) |
| 8 | Acrobot-v1 | QAC (tf, torch), AdvantageAC (tf, torch), EligibilityTraceAC (tf, torch), PPO (tf, torch), NPG (tf, torch), TRPO (tf, torch), OffPAC (tf, torch) |
| 9 | Pendulum-v1 | DDPG (tf, torch), TD3 (tf, torch) |
| 10 | LunarLander-v2 | SQL (tf, torch), SAC (tf, torch), SACwA (tf, torch) |
| 10 | LunarLanderContinuous-v2 | SACwA (tf, torch) |
| 11 | BipedalWalker-v3 | ES, ARS |
| 12 | PongNoFrameskip-v4 | CategoricalDQN (tf, torch), QR-DQN (tf, torch), IQN (tf, torch) |
| 13 | BernoulliMAB-v0 | UCB |
| 13 | GaussianMAB-v0 | UCB |
| 14 | TicTacToe-v0 | AlphaZero (tf, torch) |
| 15 | HumanoidBulletEnv-v0 | BehaviorClone (tf, torch), GAIL (tf, torch) |
| 16 | Tiger-v0 | VI |
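To give a feel for the kind of agent listed as QL (Q-learning) in the table, here is a minimal tabular sketch. It is not the book's notebook code: the chain environment, the function name `q_learning`, and all hyperparameter values are assumptions chosen so that the example runs with only the Python standard library instead of Gym's Taxi-v3.

```python
import random


def q_learning(n_states=5, n_actions=2, episodes=500,
               gamma=0.9, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain of states.

    Action 1 moves right and action 0 moves left. Reaching the last
    state ends the episode with reward +1; every other step gives 0.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy behavior policy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                greedy = [a for a in range(n_actions) if q[state][a] == best]
                action = rng.choice(greedy)
            if action == 1:
                next_state = min(state + 1, n_states - 1)
            else:
                next_state = max(state - 1, 0)
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            # The target bootstraps from the greedy value of the next state.
            bootstrap = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + bootstrap - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
```

After training, `greedy_policy` holds the greedy action for each non-terminal state. The update uses the maximum over next-state action values regardless of which action the behavior policy takes next, which is what distinguishes Q-learning from SARSA.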
