allentran / rl-l2t

Reinforcement learning paper + code: L2T

Policy forward prop

allentran opened this issue

states:

  • scaled prices at lags (0, ..., -k), standardized as (x - m) / sigma (k = 30?)
  • information, volume + time until next trading day
  • holdings
  • cash
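
A minimal sketch of the price scaling in the first state item. It assumes m and sigma are the mean and standard deviation of the lookback window itself, which the notes above do not specify. The function name scale_prices and the eps guard are hypothetical.

```python
from statistics import mean, pstdev


def scale_prices(prices, k=30, eps=1e-8):
    """Standardize the last k + 1 prices (lags 0, ..., -k) as (x - m) / sigma.

    Assumption: m and sigma are computed over the window itself.
    eps avoids division by zero when all prices in the window are equal.
    """
    window = prices[-(k + 1):]
    m = mean(window)
    sigma = pstdev(window)
    return [(x - m) / (sigma + eps) for x in window]


# Toy example with k = 4, i.e. a window of 5 prices.
scaled = scale_prices([100.0, 101.0, 99.0, 102.0, 98.0], k=4)
```

If the scaling statistics should instead come from a longer history or a running estimate, only the two lines computing m and sigma would change.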

actions:

  • sell fraction
  • buy fraction

Pipe the states through some dense layers, with a GRU or LSTM for the scaled price sequence, enforce the action constraints, and output the actions: 2 x number of assets (one sell fraction and one buy fraction per asset).
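
A rough sketch of that forward pass in plain Python, to make the shapes concrete. Everything beyond the outline above is an assumption: the GRU weights are shared across assets, biases are omitted, the non-price features are passed as one list per asset, and the constraint handling (sigmoid to keep fractions in [0, 1], then rescaling buy fractions so they sum to at most 1 of cash) is only one possible choice. All names (init_params, gru_step, policy_forward) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(n_extra, hidden=8, dense=16, seed=0):
    rng = random.Random(seed)
    return {
        "Wz": rand_matrix(hidden, 1 + hidden, rng),  # update gate
        "Wr": rand_matrix(hidden, 1 + hidden, rng),  # reset gate
        "Wc": rand_matrix(hidden, 1 + hidden, rng),  # candidate state
        "W1": rand_matrix(dense, hidden + n_extra, rng),
        "W2": rand_matrix(2, dense, rng),  # 2 outputs per asset
        "hidden": hidden,
    }


def gru_step(p, h, x):
    """One GRU step on a scalar input x (bias terms omitted for brevity)."""
    xh = [x] + h
    z = [sigmoid(a) for a in matvec(p["Wz"], xh)]
    r = [sigmoid(a) for a in matvec(p["Wr"], xh)]
    xrh = [x] + [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a) for a in matvec(p["Wc"], xrh)]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]


def policy_forward(p, scaled_prices, extras):
    """scaled_prices: one scaled price sequence per asset.
    extras: one feature list per asset (e.g. volume, time until next
    trading day, holdings, cash). Returns (sell_fractions, buy_fractions).
    """
    sells, buys = [], []
    for seq, extra in zip(scaled_prices, extras):
        h = [0.0] * p["hidden"]
        for x in seq:  # run the GRU over the price sequence
            h = gru_step(p, h, x)
        d = [math.tanh(a) for a in matvec(p["W1"], h + extra)]  # dense layer
        sell_logit, buy_logit = matvec(p["W2"], d)
        sells.append(sigmoid(sell_logit))  # fraction of holdings to sell
        buys.append(sigmoid(buy_logit))    # fraction of cash to spend
    total = sum(buys)
    if total > 1.0:  # assumed constraint: cannot spend more than all cash
        buys = [b / total for b in buys]
    return sells, buys


params = init_params(n_extra=4)
prices = [[0.1, -0.3, 0.5], [1.2, 0.0, -0.7], [-0.4, 0.4, 0.2]]
extras = [[0.5, 1.0, 0.2, 0.8]] * 3
sells, buys = policy_forward(params, prices, extras)
```

For 3 assets this returns two lists of length 3, i.e. 2 x number of assets. Swapping the GRU for an LSTM would only change gru_step and the recurrent weights.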