haarnoja / sac

Soft Actor-Critic

for discrete env

ccplxx opened this issue · comments

I just read the DIAYN paper and can't understand how to train DIAYN in an environment with discrete actions, because SAC is for continuous environments. But some experiments in the paper are based on mountain car and inverted pendulum. Thank you.

I'm not too familiar with the DIAYN implementation, maybe @ben-eysenbach can help.

Thank you, haarnoja. Can SAC be used for environments with discrete actions? If so, how?

Yeah, you can use SAC with discrete actions too, but this implementation does not support them. You would need to replace the policy with a softmax distribution $\pi(\cdot \mid s) \propto \exp Q(s, \cdot)$, which you can compute exactly for a finite action space.
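
For anyone landing here later, a rough sketch of what that softmax policy could look like. This is not code from this repo: `softmax_policy`, `sample_action`, and the `temperature` argument are hypothetical names for illustration, and `q_values` is assumed to be a plain list holding Q(s, a) for every action in the current state. With `temperature=1.0` it reduces to the formula above.

```python
import math
import random


def softmax_policy(q_values, temperature=1.0):
    """Return probabilities proportional to exp(Q(s, a) / temperature).

    `temperature` is an assumption added for illustration; 1.0 gives
    exactly pi(a | s) proportional to exp Q(s, a).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in q_values]
    # Subtract the max before exponentiating to avoid overflow.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature=1.0, rng=random):
    """Sample an action index from the softmax policy."""
    probs = softmax_policy(q_values, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]  # made-up Q-values for a 3-action state
    print(softmax_policy(q))
    print(sample_action(q))
```

Because the action set is finite, the normalizing sum is exact, so no sampling-based approximation of the policy is needed. In a real agent the Q-values would come from the learned critic rather than a hard-coded list.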