Why are the rewards different between train and predict? (ES)

Question

tiger55cn opened this issue a year ago

For example, after training on the data 100 times, the reward reported at the end of training is 150%.
With exactly the same trained agent, the reward returned by buy() is 130%.
The data and the weights are the same, but the rewards differ.
Is this a bug?