Confusion: Why are you adding 1 here?
AdityaGudimella opened this issue
When calculating the scaled log_std in the SAC policy, you scale log_std + 1 to the range [LOG_STD_MIN, LOG_STD_MAX]. Is this because the range of the tanh function is [-1, 1]?
Is it really necessary? Wouldn't the scaling limit the output range to [LOG_STD_MIN, LOG_STD_MAX] even without that?
Yes. In the end I want a linear (affine) transformation that maps (-1, 1) to (LOG_STD_MIN, LOG_STD_MAX), and since the order-preserving transformation with that property is unique, any other way of writing it will reduce to the same formula I have. Without the +1 the output would not land on that range.
The way I, and the original author I suppose, thought of it was to scale (-1, 1) -(+1)-> (0, 2) -(*0.5)-> (0, 1) -(stretch it)-> (LOG_STD_MIN, LOG_STD_MAX). Hope that helps!
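For anyone who wants to check this numerically, here is a minimal sketch of the shift, halve, and stretch steps in plain Python. It is not the repository's code: the function names are hypothetical and the bounds -5 and 2 are illustrative assumptions, since the thread does not state the actual values.

```python
import math

# Illustrative bounds only (assumption); the thread does not give the real values.
LOG_STD_MIN = -5.0
LOG_STD_MAX = 2.0


def scale_log_std(raw: float) -> float:
    """Squash raw into (-1, 1) with tanh, then map it onto
    (LOG_STD_MIN, LOG_STD_MAX): +1 -> (0, 2), *0.5 -> (0, 1), stretch."""
    t = math.tanh(raw)
    return LOG_STD_MIN + 0.5 * (LOG_STD_MAX - LOG_STD_MIN) * (t + 1.0)


def scale_without_shift(raw: float) -> float:
    """Hypothetical variant with the +1 dropped, for comparison only."""
    t = math.tanh(raw)
    return LOG_STD_MIN + 0.5 * (LOG_STD_MAX - LOG_STD_MIN) * t


# Compare both variants near the extremes of tanh and at its midpoint.
for raw in (-10.0, 0.0, 10.0):
    print(raw, scale_log_std(raw), scale_without_shift(raw))
```

With these illustrative bounds, the version without the +1 covers (-8.5, -1.5) instead of (-5, 2). It is still bounded, but it undershoots LOG_STD_MIN and never gets close to LOG_STD_MAX.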