Kaixhin / PlaNet

Deep Planning Network: Control from pixels by latent planning with learned dynamics

Hyperparameter ballpark for symbolic envs

jendelel opened this issue

Hi,

Thanks for porting PlaNet to PyTorch. Great work! I would like to use PlaNet for planning in maze-like environments. However, I can't get it to work reasonably even on classic Gym MuJoCo tasks such as Ant-v2. Have you tried running it on symbolic environments? There are many hyperparameters and I don't know where to start tuning.

This is what I tried:

    Options
                          id: Ant
                          seed: 1
                          disable_cuda: False
                          env: Ant-v2
                          symbolic_env: True
                          max_episode_length: 1000
                          experience_size: 1000000
                          activation_function: relu
                          embedding_size: 64
                          hidden_size: 64
                          belief_size: 4
                          state_size: 30
                          action_repeat: 1
                          action_noise: 0.3
                          episodes: 1000
                          seed_episodes: 5
                          collect_interval: 20
                          batch_size: 50
                          chunk_size: 50
                          overshooting_distance: 50
                          overshooting_kl_beta: 0
                          overshooting_reward_scale: 0
                          global_kl_beta: 0
                          free_nats: 3
                          bit_depth: 5
                          learning_rate: 0.001
                          learning_rate_schedule: 0
                          adam_epsilon: 0.0001
                          grad_clip_norm: 1000
                          planning_horizon: 12
                          optimisation_iters: 10
                          candidates: 1000
                          top_candidates: 100
                          test: False
                          test_interval: 25
                          test_episodes: 10
                          checkpoint_interval: 50
                          checkpoint_experience: False
                          models: 
                          experience_replay: 
                          render: False
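
For anyone comparing runs, a dump like the one above can be parsed into a dictionary and diffed against another configuration. The following is a minimal sketch, not part of the repository: `parse_options` and `diff_options` are hypothetical helper names, and the value `200` in the usage lines is an arbitrary illustration, not a recommended setting.

```python
def parse_options(dump: str) -> dict:
    """Parse 'key: value' lines from an options dump into typed values."""
    options = {}
    for line in dump.splitlines():
        key, sep, raw = line.strip().partition(":")
        if not sep:
            continue  # skip lines without a colon, e.g. the "Options" header
        raw = raw.strip()
        if raw in ("True", "False"):
            value = raw == "True"
        else:
            try:
                value = int(raw)
            except ValueError:
                try:
                    value = float(raw)
                except ValueError:
                    value = raw  # plain string, possibly empty
        options[key.strip()] = value
    return options


def diff_options(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# A few lines copied from the dump above.
dump = """Options
    env: Ant-v2
    symbolic_env: True
    belief_size: 4
    learning_rate: 0.001
    models:
"""
opts = parse_options(dump)

# Hypothetical variant that changes a single value, purely for illustration.
variant = {**opts, "belief_size": 200}
changed = diff_options(opts, variant)
print(changed)
```

Diffing two dumps this way makes it easier to report exactly which settings changed between a run that trains and one that does not.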

Thanks in advance for your help,

Lukas

Sorry, I haven't really tried running PlaNet on other domains, so I don't have any suggestions. Since this is a port, you should be able to open an issue on the original repository and see whether the original author can provide some guidance.

@jendelel, I am also trying to run PlaNet/Dreamer on symbolic envs like Ant-v2 and HalfCheetah-v2. Have you found suitable hyperparameters? Thanks a lot!

Hi, not really! I moved away from PlaNet back then. If you're working with it now, I'd highly recommend using DreamerVX over PlaNet. It's a series of follow-up papers.