AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)

Do the trading environments only allow for 1 share to be held at a time?

ShinyOrbThing opened this issue

Hi, I am reading through the code since I plan to use this environment for a project, and I've realised that an action is only considered a trade if we change from a buy-action sequence to a sell-action or vice versa.

If at a time point t, the agent buys, and then at the next time point t+1 it also suggests a buy action, will the agent now hold 2 shares? I haven't been able to find evidence in the code that the second buy action is considered, and that a second share is bought. It seems that once there is a buy action, all subsequent buy actions pertain to "holding"?

Could you outline how the subsequent buy orders are handled please?

Hi @ShinyOrbThing, that's right. The second buy action is effectively a hold. The share amount is always 1; you can only buy or sell one share.
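For anyone reading along, the behaviour described above can be sketched roughly as follows. This is a simplified illustration and not the library's actual code: the `Actions` enum and the helpers `is_trade` and `run` are assumptions introduced here, and only `Positions.Long` appears in the snippet quoted in this thread.

```python
from enum import Enum


class Actions(Enum):  # assumed names, for illustration only
    Sell = 0
    Buy = 1


class Positions(Enum):
    Short = 0
    Long = 1


def is_trade(action, position):
    """A trade happens only when the action contradicts the current position."""
    return (
        (action == Actions.Buy and position == Positions.Short)
        or (action == Actions.Sell and position == Positions.Long)
    )


def run(actions, start=Positions.Short):
    """Replay a sequence of actions; return per-step trade flags and the final position."""
    position = start
    flags = []
    for action in actions:
        trade = is_trade(action, position)
        if trade:
            # Flip the position; the held amount never exceeds one share.
            position = Positions.Long if position == Positions.Short else Positions.Short
        flags.append(trade)
    return flags, position


flags, final_position = run([Actions.Buy, Actions.Buy, Actions.Sell])
print(flags, final_position)
```

Under these assumptions, the second consecutive `Buy` produces no trade and leaves the position unchanged, which matches the "holding" interpretation.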

Hi Amin, as a follow-up: It appears that the update_profit calculation is assuming an "all-in" position every time we buy.

In stocks_env.py, the profit is calculated as:

    if self._position == Positions.Long:
        shares = (self._total_profit * (1 - self.trade_fee_ask_percent)) / last_trade_price
        self._total_profit = (shares * (1 - self.trade_fee_bid_percent)) * current_price

Ignoring the ask and bid fees, the profit update reduces to
$$\text{new profit} = \text{current profit} \times \frac{\text{current price}}{\text{last trade price}}$$

So if the current price is 0, the profit becomes 0 regardless of its previous value. In other words, a long position assumes we go all-in on every buy rather than holding a single share, or am I missing something? This differs from the reward calculation, which uses absolute price differences instead, so the two metrics often disagree. Please see the example below from a signal I simulated.
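To make the disagreement concrete, here is a minimal sketch comparing the two calculations on made-up prices. The function names `all_in_profit` and `one_share_reward`, the price list, and the zero default fees are all assumptions for illustration; only the multiplicative update itself mirrors the snippet quoted from stocks_env.py.

```python
def all_in_profit(prices, trades, start=1.0, ask_fee=0.0, bid_fee=0.0):
    """Multiplicative update: each long trade reinvests the whole running profit."""
    profit = start
    for entry, exit_ in trades:
        shares = (profit * (1 - ask_fee)) / prices[entry]
        profit = (shares * (1 - bid_fee)) * prices[exit_]
    return profit


def one_share_reward(prices, trades):
    """Additive reward: absolute price difference for a single share per trade."""
    return sum(prices[exit_] - prices[entry] for entry, exit_ in trades)


# Hypothetical prices; each trade is (buy index, sell index).
prices = [10.0, 20.0, 100.0, 80.0]
trades = [(0, 1), (2, 3)]

print(all_in_profit(prices, trades))     # ends above the starting value of 1.0
print(one_share_reward(prices, trades))  # ends below zero
```

With these numbers the profit metric reports a gain (about 1.6 times the starting value) while the summed reward is negative (-10), because the first trade doubles a cheap share and the second loses 20 on an expensive one. A sell at a price of 0 would likewise wipe the profit to 0 in the multiplicative version.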

[Image: a2c_sine8_plot]