Parsing the first matching value
artemk93 opened this issue · comments
I am using Arelle app to open xml files and double check the output from the code below
from xbrl import XBRLParser, GAAP, GAAPSerializer
xbrl_parser = XBRLParser()
xbrl = xbrl_parser.parse('aapl-20150627.xml')
gaap_obj = xbrl_parser.parseGAAP(xbrl, doc_date='20150627', context='current', ignore_errors = 0)
serializer = GAAPSerializer()
result = serializer.dump(gaap_obj)
print result
Output:
MarshalResult(data={u'liabilities': 65285.0, u'net_cash_flows_financing_continuing': 0.0, u'revenue': 0.0, u'income_tax_expense_benefit': 3796.0, u'common_shares_authorized': 0.0, u'income_from_equity_investments': 0.0, u'preferred_stock_dividends': 0.0, u'redeemable_noncontrolling_interest': 0.0, u'extraordary_items_gain_loss': 0.0, u'temporary_equity': 0.0, u'costs_and_expenses': 0.0, u'non_current_assets': 4081.0, u'net_cash_flows_discontinued': 0.0, u'net_cash_flows_investing_discontinued': 0.0, u'liabilities_and_equity': 273151.0, u'other_operating_income': 0.0, u'operating_income_loss': 0.0, u'income_before_equity_investments': 0.0, u'net_income_parent': 0.0, u'equity': 0.0, u'income_loss': 14083.0, u'cost_of_revenue': 0.0, u'operating_expenses': 5598.0, u'noncurrent_liabilities': 0.0, u'current_liabilities': 0.0, u'net_cash_flows_investing': 0.0, u'stockholders_equity': 125677.0, u'net_income_loss': 10677.0, u'net_cash_flows_investing_continuing': 0.0, u'nonoperating_income_loss': 0.0, u'common_shares_outstanding': 0.0, u'net_cash_flows_financing': 0.0, u'net_income_shareholders': 0.0, u'comprehensive_income': 9065.0, u'equity_attributable_interest': 0.0, u'commitments_and_contingencies': 0.0, u'comprehensive_income_parent': 9065.0, u'net_cash_flows_operating_discontinued': 0.0, u'comprehensive_income_interest': 0.0, u'other_comprehensive_income': 0.0, u'equity_attributable_parent': 0.0, u'assets': 3991.0, u'common_shares_issued': 0.0, u'gross_profit': 19681.0, u'net_cash_flows_operating_continuing': 0.0, u'current_assets': 0.0, u'interest_and_debt_expense': 0.0, u'net_income_loss_noncontrolling': 0.0, u'net_cash_flows_operating': 0.0}, errors={})
The problem is that every value is the first matching value in the xml file. So liabilities = 65285.0, is actually us-gaap:LiabilitiesCurrent, which comes before us-gaap:Liabilities.
Same thing with assets = 3991.0 is actually
us-gap:FiniteLivedIntangibleAssetsAccumulatedAmortization, which comes before us-gaap:Assets = 273 151 000 000.
I believe it can be solved by slightly changing part of def parseGAAP()
in xbrl.py where xbrl.find_all
is used for every value (assets, current_assets, etc)
In xbrl.py
inside function def parseGAAP()
liabilities = xbrl.find_all(name=re.compile("(us-gaap:liabilities$)", re.IGNORECASE | re.MULTILINE))
seems to solve the problem.
Same thing for assets or any other tag name
I've been working with the code, changing it slightly to look for values that I want to (I hope that is ok). If it helps I can post it here, I can also post the output that I get.
@artemk93 if you think the changes would be valuable to other people, go ahead a submit a PR with what you have and we can work on it. Thanks!
No progress on it yet @artemk93 and would gladly welcome your contributions.