Ex 10.6
tomasruizt opened this issue
tomasruizt commented
Hi :)
I'm wondering about going from the first line to the second.
If r(π) = 0.5 and E[R_{t+1} | S_0 = A] is either 1 or 0 (correct me if I'm wrong), how can E[R_{t+1} | S_0 = A] - r(π) be (-0.5)^t?
For example, for t = 0: E[R_{t+1} | S_0 = A] - r(π) = 1 - 0.5 = 0.5, but (-0.5)^t = 1.
Am I missing something?
YIFAN WANG commented
Thanks for your response. You have successfully found a typo.
I should have written (-1)^t / 2 instead of (-1/2)^t.
I will fix that in a min. :)
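A quick numeric check of the correction, as a minimal sketch. It assumes the expected reward starting from A alternates 1, 0, 1, 0, ... (consistent with the t = 0 example above) and that r(π) = 0.5; the function names are hypothetical and only for illustration.

```python
R_BAR = 0.5  # average reward r(pi), as stated in the question


def expected_reward_from_A(t):
    # Assumption: starting in A, E[R_{t+1} | S_0 = A] alternates 1, 0, 1, 0, ...
    return 1.0 if t % 2 == 0 else 0.0


def centered_reward(t):
    # The term under discussion: E[R_{t+1} | S_0 = A] - r(pi)
    return expected_reward_from_A(t) - R_BAR


def corrected_form(t):
    # Corrected expression: (-1)^t / 2
    return (-1) ** t / 2


def typo_form(t):
    # Expression with the typo: (-1/2)^t
    return (-0.5) ** t


for t in range(6):
    print(t, centered_reward(t), corrected_form(t), typo_form(t))
```

Under these assumptions the centered reward matches (-1)^t / 2 at every step, while (-1/2)^t already disagrees at t = 0 and shrinks toward zero afterwards.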