A framework for Reinforcement Learning research.
- 💡 Perfect for understanding and prototyping algorithms:
- One algorithm = One directory -> No backtracking through parent classes
- Algorithms can be easily copied out of RL-X
- ⚒️ Well-known DL libraries: Implementations in PyTorch, TorchScript, and JAX (Flax)
- ⚡ Maximum speed: JAX versions use JIT compilation -> Much faster than PyTorch (see the sketch after this list)
- 🧪 Mix and match and extend: Generic interfaces between algorithms and environments
- 📈 Experiment tracking: Console logging, Saving models, Tensorboard, Weights and Biases
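The speed claim above rests on JAX's just-in-time compilation. As a generic illustration (not RL-X code), here is a minimal sketch of how a jitted update step works: the function is traced once, compiled to fused XLA code, and subsequent calls skip the Python overhead.

```python
import time

import jax
import jax.numpy as jnp

# jax.jit traces the function on the first call and compiles it to XLA;
# later calls reuse the compiled code.
@jax.jit
def sgd_update(params, grads, lr=1e-3):
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params = {"w": jnp.ones((512, 512)), "b": jnp.zeros(512)}
grads = {"w": jnp.full((512, 512), 0.1), "b": jnp.full((512,), 0.1)}

params = sgd_update(params, grads)  # first call pays the compilation cost
start = time.perf_counter()
for _ in range(1000):
    params = sgd_update(params, grads)
params["w"].block_until_ready()  # JAX dispatches asynchronously; sync before timing
print(f"1000 jitted updates took {time.perf_counter() - start:.4f} s")
```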
- Proximal Policy Optimization (PPO) in PyTorch, TorchScript, Flax
- Early Stopping Policy Optimization (ESPO) in PyTorch, TorchScript, Flax
- Soft Actor Critic (SAC) in PyTorch, TorchScript, Flax
- Randomized Ensembled Double Q-Learning (REDQ) in Flax
- Dropout Q-Functions (DroQ) in Flax
- Truncated Quantile Critics (TQC) in Flax
- Aggressive Q-Learning with Ensembles (AQE) in Flax
- Maximum a Posteriori Policy Optimization (MPO) in Flax
Most algorithms have only one reference environment implemented. To try out other environments, change the environment name in the create_env.py files or add a new directory for them; a sketch of such a file follows below. For further information on how to add more environments and algorithms, read the respective README files.
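To illustrate the kind of change meant here, this is a hypothetical create_env.py sketch, assuming a Gymnasium-based environment; the actual files in RL-X may differ in signature and wrappers, but the environment ID is the part to swap:

```python
# Hypothetical sketch of a create_env.py; the real RL-X files may differ.
import gymnasium as gym

def create_env(config):
    # config: experiment configuration (unused in this sketch).
    # Change this ID (e.g. to "HalfCheetah-v4") to try another environment.
    env = gym.make("Humanoid-v4")
    env = gym.wrappers.RecordEpisodeStatistics(env)  # log episode returns/lengths
    return env
```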
```bash
git clone git@github.com:nico-bohlinger/RL-X.git
cd RL-X
pip install -e .
```

```bash
cd experiments
python experiment.py
```
Detailed instructions can be found in the README file in the experiments directory.
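For orientation, here is a hypothetical sketch of what experiment.py might boil down to; the `Runner` name and its methods are assumptions, not the confirmed RL-X API, so consult the experiments README for the real usage:

```python
# Hypothetical sketch; the actual RL-X entry point may expose a different API.
from rl_x.runner import Runner  # assumption: a Runner class drives experiments

if __name__ == "__main__":
    runner = Runner()  # assumption: algorithm/environment chosen via config or CLI
    runner.run()
```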