haarnoja / sac

Soft Actor-Critic

Sparse Reward Environments

bhairavmehta95 opened this issue

Did you happen to see SAC's performance on sparse-reward environments?

I know the DIAYN paper trained on sparse rewards, but I was wondering whether vanilla SAC (in your experiments) had any luck solving things like Continuous MountainCar.

We haven't tried sparse-reward environments with vanilla SAC. My intuition is that it will not work any better than other RL algorithms with Gaussian/Boltzmann exploration, because the exploration noise lacks temporal correlation.

Gotcha; that's what we seem to be seeing, but just wanted to make sure!

Could you clarify what you mean by temporal correlation in the exploration noise? Thanks.
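
One way to picture the distinction, as a rough sketch rather than anything taken from the SAC code: noise that is sampled independently at every step tends to cancel itself out, while temporally correlated noise keeps pushing in the same direction for a while. The snippet below compares i.i.d. Gaussian noise with an Ornstein-Uhlenbeck (OU) process, which is a swapped-in example of correlated noise and is not part of SAC. The function names and the values of `theta`, `sigma`, and `n_steps` are assumptions chosen only for illustration.

```python
import numpy as np


def iid_gaussian_noise(n_steps, sigma, rng):
    """Noise drawn independently at every step (no temporal correlation)."""
    return rng.normal(0.0, sigma, size=n_steps)


def ou_noise(n_steps, sigma, theta, rng, dt=1.0):
    """Ornstein-Uhlenbeck noise: each sample is pulled toward zero but
    otherwise stays close to the previous sample, so it is correlated in time."""
    x = np.zeros(n_steps)
    for t in range(1, n_steps):
        drift = -theta * x[t - 1] * dt
        diffusion = sigma * np.sqrt(dt) * rng.normal()
        x[t] = x[t - 1] + drift + diffusion
    return x


def lag1_autocorr(x):
    """Sample correlation between the noise at step t and at step t + 1."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


def max_displacement(noise):
    """Farthest distance reached if the unit-variance noise is applied
    as a velocity at every step."""
    unit = (noise - noise.mean()) / noise.std()
    return float(np.max(np.abs(np.cumsum(unit))))


rng = np.random.default_rng(0)  # fixed seed, illustrative only
n_steps = 5000                  # assumed horizon

white = iid_gaussian_noise(n_steps, sigma=1.0, rng=rng)
correlated = ou_noise(n_steps, sigma=1.0, theta=0.05, rng=rng)

white_ac = lag1_autocorr(white)
correlated_ac = lag1_autocorr(correlated)

print("lag-1 autocorrelation, i.i.d. Gaussian:", white_ac)
print("lag-1 autocorrelation, OU process:     ", correlated_ac)
print("max displacement, i.i.d. Gaussian:", max_displacement(white))
print("max displacement, OU process:     ", max_displacement(correlated))
```

With these assumed settings the i.i.d. noise should show a lag-1 autocorrelation near zero and the OU noise one close to `1 - theta`. In a task like Continuous MountainCar, where the agent has to push consistently in one direction before it ever sees a reward, uncorrelated noise is less likely to produce that kind of sustained motion by chance.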