Why are the rewards different between train and predict? (ES)
tiger55cn opened this issue
For example, if you train on the data 100 times, the reward at the end of training is 150%.
With exactly the same trained agent, the reward returned by buy() is 130%.
The data is the same and the weights are the same, but the rewards are different.
Is this a bug?