araffin / sbx

SBX: Stable Baselines Jax (SB3 + Jax)

[Bug] TQC Entropy Coefficient

edmund735 opened this issue

🐛 Bug

Running hyperparameter optimization with TQC (python train.py --algo tqc --optimize) fails on every trial with TypeError: TQC.__init__() got an unexpected keyword argument 'target_entropy'.

To Reproduce

import rl_zoo3
import rl_zoo3.train
from rl_zoo3.train import train
from sbx import DDPG, DQN, PPO, SAC, TD3, TQC, CrossQ

rl_zoo3.ALGOS["ddpg"] = DDPG
rl_zoo3.ALGOS["dqn"] = DQN
# See note below to use DroQ configuration
# rl_zoo3.ALGOS["droq"] = DroQ
rl_zoo3.ALGOS["sac"] = SAC
rl_zoo3.ALGOS["ppo"] = PPO
rl_zoo3.ALGOS["td3"] = TD3
rl_zoo3.ALGOS["tqc"] = TQC
rl_zoo3.ALGOS["crossq"] = CrossQ
rl_zoo3.train.ALGOS = rl_zoo3.ALGOS
rl_zoo3.exp_manager.ALGOS = rl_zoo3.ALGOS

if __name__ == "__main__":
    train()

[W 2024-04-04 15:33:50,250] Trial 2 failed with parameters: {'gamma': 0.99, 'learning_rate': 0.0003250530792956964, 'batch_size': 256, 'buffer_size': 100000, 'learning_starts': 0, 'train_freq': 32, 'tau': 0.005, 'log_std_init': -1.275286479641816, 'net_arch': 'medium', 'n_quantiles': 32, 'top_quantiles_to_drop_per_net': 24} because of the following error: TypeError("TQC.__init__() got an unexpected keyword argument 'target_entropy'").
Traceback (most recent call last):
  File ".../lib/python3.10/site-packages/optuna/study/_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
  File ".../rl-baselines3-zoo/rl_zoo3/exp_manager.py", line 753, in objective
    model = ALGOS[self.algo](
TypeError: TQC.__init__() got an unexpected keyword argument 'target_entropy'

Expected behavior

Hyperparameter optimization should run without raising this error.

System Info

Describe the characteristic of your environment:

  • installed from source

You can use sb3.get_system_info() to print relevant packages info:

import stable_baselines3 as sb3
sb3.get_system_info()

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)

sbx currently does not support the target_entropy argument. I opened PR #43, which adds it; that should fix this problem and restore compatibility with rl-baselines3-zoo.
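
For anyone stuck on a version without that change, one possible stopgap is to drop keyword arguments the constructor does not accept before building the model. The sketch below only illustrates the idea: `filter_supported_kwargs` is a hypothetical helper and `FakeTQC` is a stand-in class with an invented signature. Neither is part of sbx or rl-baselines3-zoo, and the `sampled` dict is made-up illustration data, not actual zoo output.

```python
import inspect


def filter_supported_kwargs(cls, kwargs):
    """Split kwargs into (accepted, dropped) using cls.__init__'s signature.

    Hypothetical helper for illustration only.
    """
    params = inspect.signature(cls.__init__).parameters
    # A constructor that takes **kwargs accepts every keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    accepted = {k: v for k, v in kwargs.items() if k in params}
    dropped = {k: v for k, v in kwargs.items() if k not in params}
    return accepted, dropped


class FakeTQC:
    """Stand-in for an algorithm class whose __init__ lacks target_entropy."""

    def __init__(self, policy, env, gamma=0.99, tau=0.005):
        self.policy = policy
        self.env = env
        self.gamma = gamma
        self.tau = tau


# Illustrative hyperparameters, including one the class does not accept.
sampled = {"gamma": 0.99, "tau": 0.005, "target_entropy": "auto"}

accepted, dropped = filter_supported_kwargs(FakeTQC, sampled)
model = FakeTQC("MlpPolicy", "Pendulum-v1", **accepted)

# Report what was removed so the omission is visible in the logs.
print("dropped:", dropped)
```

Silently discarding a hyperparameter changes what is being optimized, so this is only reasonable when the dropped value matches the algorithm's default behaviour. Upgrading to a version that accepts target_entropy is the proper fix.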

I see, it works now, thanks for your quick reply!