csingley / ibflex

Python parser for Interactive Brokers Flex XML statements

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Handling of unsupported types

dvicini opened this issue · comments

Hi,

I've just started to play around with this package. One thing I encountered (as many others it seems) is that the parser fails violently if the flex query results contains unknown fields. I was wondering if there would be a way to change this behavior, maybe optionally, to simply ignore unknown fields? The IBKR web interface allows to very easily add all available fields to a query, and manually figuring out which to enable for the parser and which to disable appears a bit tricky. Many of the unsupported fields seem to be details that are maybe not interesting in most use cases, so it probably would make sense for the code to fail only once the user tries to access such an attribute.

Any thoughts on this? I guess this is more of a general feature request than a specific issue.

I mean, you could just check in the new types. I've gone to some trouble to make adding types as easy as possible.... surely easier than logging into IBKR web app and twiddling the query?

Parser strictness is a feature not a bug in my view. It just doesn't come free, when IBKR SWEs are constantly at work extending the schema willy-nilly.

Yes I agree, adding new types is indeed straightforward and easier than twiddling the query in the slow web interface. I also agree that it's a bit frustrating that IBKR constantly extends their schema it seems. Hence my suggestion of simply ignoring unknown fields. Again, you are right that a parser generally should be strict to prevent bugs, but I think this can be relaxed a bit when it comes to a query having fields that are new and unknown. I don't think that ignoring new fields is much less robust than what happens now, i.e. new types having to be added to the classes gradually (without much testing etc.). It's not like there is much logic in these type declarations to begin with. One could potentially even automatically generate these type declarations from a downloaded example query result.

In the end up to you, this is your project :)

Autogenerating the types would absolutely be the move - unfortunately there's just not enough there for a machine to reasonably inter the type of data primitives (str vs Decimal, or god forbid Enum).

The initial design choice to bomb out the parser wasn't unique to type misses... it's just what you do when writing parsers from scratch against underspecified input, in order to feel out the problem space & find categories of error. Upon reflection, I guess I am pleased that the only bugs I ever see any more are due to IBKR sneaking in new data types.

But if we were to change the error handling, let the parser lose information and just continue... before very long the schema drift would be significant. Eventually we'd probably start seeing bug reports from people who'd downloaded Flex queries, stored the parsed data within (e.g.) sqlite, and hadn't seen the need to retain the XML source, which was now unavailable b/c out of IBKR retention policy. Then, sometime later, theyd want to run some new analysis on prior periods, only to discover that the information hadn't been squirrelled away against a rainy day.

Permanent data loss makes people upset, and it never seems to calm them to point at disclaimers you've posted in large font in the README, or the flamboyantly punctuated warnings you've flashed at them as you hand them a loaded footgun.

I understand it is a great service to users to encapsulate these complexities & allow them not to think about it. However there is a kind of promise implicit in that model - a promise I'd be sure to break.

I suppose I'd tell you to take it up with IBKR. The onus for providing some sort of machine-readable schema to validate against rightly falls on them, not me. If enough paying customers feel strongly that their SWEs should follow proper engineering procedures here, I'm sure the situation would improve. I'd be happy to join my voice, but these kinds of issues aren't really on the front burner for me ATM

The right way to do this is to get devs from a few funds & RIAs, who in aggregate can speak for O(US$10^8) net assets, on the same page to drive this change with the broker.

I drafted PR to address some of this drift, at least for Paxos integration right now. Thanks for the good work!

And thank you for putting in the work here - sorry for the delay in reviewing.