danijar / dreamerv2

Mastering Atari with Discrete World Models

Home Page: https://danijar.com/dreamerv2

How does dreamerv2 perform on feature-based tasks?

xlnwel opened this issue · comments

Hello. Thanks for your interesting work!

I'm planning to use DreamerV2 on some feature-based tasks. After some searching, I found that no one has tried this before. I'm wondering whether there is any difficulty in doing so. What problems would you anticipate?

Hi, it works well :)

Hi, glad to hear that!

In that case, may I further ask several questions?

  1. Why would it work well? Several recent analyses, such as MBPO, show that learning from a model can cause compounding errors: errors in the learned model accumulate as the rollout length increases. This makes me wonder why DreamerV2 works well in practice while learning entirely from imagined trajectories. For image tasks, I can understand that DreamerV2 compresses the large observation space (which contains massive redundancy) into a small latent space, enabling efficient reinforcement learning on a smaller input space. But feature-based tasks have much less redundancy, so what is the benefit of learning in the latent space?
  2. What should I take extra care with when applying DreamerV2 to feature-based tasks? For example, I want to make a few modifications to DreamerV2 to solve a multi-agent task (Overcooked). Which hyperparameters should be tuned most carefully?

Predicting in latent space tends to accumulate much less error, probably because there is no complex observation manifold that the predictions have to follow closely.

For training on proprioceptive inputs, just make sure the inputs are roughly in the range -1 to +1; velocities in particular can get very large. The default hyperparameters should work.
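One common way to keep proprioceptive inputs in that range is a running-statistics normalizer with clipping. A minimal sketch (not part of DreamerV2; the class name, clipping bound, and Welford-style update are illustrative choices):

```python
import math

class RunningNormalizer:
    """Tracks a running per-feature mean/std and rescales observations
    so they stay roughly within [-clip, +clip]."""

    def __init__(self, size, clip=5.0, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # sum of squared deviations (Welford's algorithm)
        self.clip = clip
        self.eps = eps

    def update(self, obs):
        # Online update of mean and variance statistics.
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs):
        # Standardize each feature, then clip outliers (e.g. large velocities).
        out = []
        for i, x in enumerate(obs):
            var = self.m2[i] / max(self.count - 1, 1)
            z = (x - self.mean[i]) / math.sqrt(var + self.eps)
            out.append(max(-self.clip, min(self.clip, z)))
        return out

# Example: features on very different scales become comparable.
norm = RunningNormalizer(size=2)
for obs in [[0.1, 500.0], [0.2, 800.0], [0.15, 650.0], [0.3, 900.0]]:
    norm.update(obs)
print(norm.normalize([0.2, 700.0]))
```

Applying this before feeding observations to the world model keeps both small position features and large velocity features at a similar magnitude.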

Thanks a ton for the explanation and advice!