Reason why agents can open multiple positions in the same direction

Question

lorrp1 opened this issue 4 years ago · comments

Is there a reason why the agents were implemented this way?
Is there any paper showing that this gives an advantage over the "normal" method (if already bought/sold, then only close the position or do nothing)?

Super Luminal · Answer 1 · Tue Jul 28 2020 04:04:41 GMT+0800 (China Standard Time)

You can change the model so that it only trades around a threshold. Which agent or agents are you referring to? Some do have an inventory feature.
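
For reference, the "only close or do nothing" rule from the question can be sketched as a small wrapper around whatever action the agent outputs. This is a minimal illustration and not code from the repository: the action encoding (`HOLD`, `BUY`, `SELL`) and the helper names `apply_action` and `run` are hypothetical.

```python
# Hypothetical action encoding; the agents in the repository may use a different one.
HOLD, BUY, SELL = 0, 1, 2


def apply_action(position, action):
    """Apply one action under a one-position-at-a-time rule.

    position: -1 (short), 0 (flat) or +1 (long).
    Returns (new_position, executed_action).
    """
    if action == BUY:
        if position == 0:
            return 1, BUY    # open a long position
        if position == -1:
            return 0, BUY    # close the existing short
    elif action == SELL:
        if position == 0:
            return -1, SELL  # open a short position
        if position == 1:
            return 0, SELL   # close the existing long
    # A repeated signal in the same direction is ignored.
    return position, HOLD


def run(actions):
    """Replay a list of raw actions and return the final position and executed actions."""
    position = 0
    executed = []
    for action in actions:
        position, done = apply_action(position, action)
        executed.append(done)
    return position, executed
```

With this wrapper, a second `BUY` while already long is turned into `HOLD`, so the agent can never stack positions in one direction.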

Lor · Answer 2 · Tue Jul 28 2020 04:46:31 GMT+0800 (China Standard Time)

Neuroevolution with novelty-search.
https://imgur.com/a/gQSyhiO
From the code and this result, I really don't understand how the fitness is calculated.
Are the predicted selling/buying points independent of each other, or are they supposed to imply a buy-sell sequence? (I mean, is the fitness calculated based on already closed positions or not?)
Or is there any other idea why the results seem to make no sense?

The evolution-strategy-bayesian-agent results, for example, make more sense: https://imgur.com/a/lvlB2fN

But they are still not as good as the simple turtle: https://imgur.com/a/uFQ7PSh

Super Luminal · Answer 3 · Tue Jul 28 2020 05:03:15 GMT+0800 (China Standard Time)

Your second question does not make sense to me, but as for the first one: the fitness is calculated from each population member's individual performance on a given (t+1)-th data point (the predict function). How it learns to open and close positions is determined by your training data and your hyperparameters. Looking at your data, it seems to be highly convergent, so the population score might keep bouncing back and forth between maxima, and you might have stopped it prematurely at a bad maximum. Your dataset contains many identical maxima. Give it just the first oscillation and see if that changes how it learns.
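
To make the closed-versus-open distinction from the question concrete, here is a rough sketch of two ways a fitness score could be computed from a sequence of actions. It is an assumption for illustration only and is not the fitness function used by the agents in the repository; the function names, the string action labels, and the long-only, one-unit setup are all hypothetical.

```python
def realized_profit(prices, actions):
    """Profit from closed trades only (long-only, one unit at a time).

    Returns (profit, entry), where entry is the price of a still-open
    position or None if the agent ends flat.
    """
    entry = None
    profit = 0.0
    for price, action in zip(prices, actions):
        if action == "buy" and entry is None:
            entry = price                # open a position
        elif action == "sell" and entry is not None:
            profit += price - entry      # close it and book the profit
            entry = None
    return profit, entry


def mark_to_market_profit(prices, actions):
    """Realized profit plus the unrealized profit of any open position,
    valued at the last price in the series."""
    profit, entry = realized_profit(prices, actions)
    if entry is not None:
        profit += prices[-1] - entry
    return profit
```

The two scores diverge whenever a position is still open at the end of the series, which is one reason buy and sell markers on a chart can look inconsistent with the reported return.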

Super Luminal · Answer 4 · Tue Jul 28 2020 05:07:30 GMT+0800 (China Standard Time)

If you are interested in how the network learns time-series data, I would read through this research paper, which explains how the learning happens: Novelty Search for Deep Reinforcement Learning Policy