amor71 / LiuAlgoTrader

Framework for algorithmic trading

code improvement

amor71 opened this issue · comments

In data_loader.py, merging _fetch_data_range() and fetch_data_range() would improve code quality (we need to make sure all flows are covered by unit tests).

data_loader is a central component of the framework. It provides a Pandas DataFrame-like interface for in-memory management of OHLC and additional data, per symbol. Data is loaded from a data provider; the current data providers are Alpaca and Polygon. There are unit tests in the tests folder (see .../tests/test_alpaca_data_loader.py) that check various flows. When making the above changes, make sure they are covered by unit tests and, if not, add additional unit tests in

commented

After close inspection of the two methods, _fetch_data_range() and fetch_data_range(), it seems that they are exactly the same. Both fetch market data for a specified time range and store it in the symbol_data attribute of the SymbolData class. The only place where _fetch_data_range() is used is the fetch_data_timestamp() method, which converts between different timestamp representations and specifies the time range. In my understanding, these two methods could be merged without any consequences, and there won't be any new logical code flows.
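
To make the proposal concrete, here is a rough sketch of what a merged layout might look like. It is not the framework's code: `SymbolDataSketch`, `to_utc()`, the `window` parameter, and the `requested` list are all hypothetical names made up for illustration, and the real method signatures may differ. It also assumes that naive timestamps are meant as UTC.

```python
from datetime import date, datetime, timedelta, timezone
from typing import List, Tuple, Union

TimeLike = Union[str, date, datetime]


def to_utc(value: TimeLike) -> datetime:
    """Normalize an ISO string, date, or datetime to a tz-aware UTC datetime."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    elif not isinstance(value, datetime):
        # A plain date: interpret it as midnight of that day.
        value = datetime(value.year, value.month, value.day)
    if value.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


class SymbolDataSketch:
    """Hypothetical stand-in for SymbolData, only to show the call structure."""

    def __init__(self) -> None:
        # Records the ranges that were requested (illustration only).
        self.requested: List[Tuple[datetime, datetime]] = []

    def fetch_data_range(self, start: TimeLike, end: TimeLike) -> None:
        # The single range method: every caller goes through here.
        self.requested.append((to_utc(start), to_utc(end)))

    def fetch_data_timestamp(
        self, when: TimeLike, window: timedelta = timedelta(days=1)
    ) -> None:
        # Convert the timestamp, derive a range around it, and delegate.
        ts = to_utc(when)
        self.fetch_data_range(ts - window, ts + window)
```

With this shape, fetch_data_timestamp() only handles timestamp conversion, and there is a single range-fetching path left to cover with unit tests.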

@amor71 I'd like your input on this matter, before I start to mess around with code :)

@ksilo There are subtle differences between them, and also additional things to look at:

  1. fetch_data_range() breaks the range into "smaller pieces", since some data providers limit the amount of data they can return. I am not sure that the way it splits the range (by time vs. by amount) is the most efficient one. It makes sense for fetch_data_range() to re-use _fetch_data_range(), if that's what you mean.
  2. I am not sure that the way time-scale is handled is really the most pythonic and most efficient way.
  3. We need to revisit how they both "stitch" together the loaded data, to make sure there are no duplicates and that the order is kept in the most efficient DataFrame way.
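
As a starting point for points 1 and 3, here is a minimal sketch of splitting a range by time and stitching the pieces back together with pandas. It is not the current implementation: `split_range()`, `fetch_in_chunks()`, and `fake_provider()` are hypothetical names, and the sketch assumes each chunk comes back as a DataFrame indexed by timestamp.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator, Tuple

import pandas as pd


def split_range(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (chunk_start, chunk_end) windows covering start..end."""
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + step, end)
        yield cursor, chunk_end
        cursor = chunk_end


def fetch_in_chunks(
    fetch_chunk: Callable[[datetime, datetime], pd.DataFrame],
    start: datetime,
    end: datetime,
    step: timedelta,
) -> pd.DataFrame:
    """Fetch each window, then stitch: concat once, de-duplicate, sort."""
    frames = [fetch_chunk(s, e) for s, e in split_range(start, end, step)]
    if not frames:
        return pd.DataFrame()
    combined = pd.concat(frames)
    # Rows repeated at chunk boundaries share an index value; keep the last copy.
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()


def fake_provider(start: datetime, end: datetime) -> pd.DataFrame:
    # Hypothetical provider returning daily bars, inclusive on both ends,
    # so neighbouring chunks overlap by one row.
    idx = pd.date_range(start, end, freq="D")
    return pd.DataFrame({"close": range(len(idx))}, index=idx)


df = fetch_in_chunks(
    fake_provider, datetime(2021, 1, 1), datetime(2021, 1, 10), timedelta(days=3)
)
```

Concatenating once at the end, rather than appending chunk by chunk, avoids repeatedly copying the growing DataFrame. Whether splitting by time or by row count is better depends on the provider's limits, which this sketch does not model.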

commented

How big is a typical data range for fetch_data_range()? Months? Years?
