AI-SDC / ACRO

Tools for the Automatic Checking of Research Outputs. These tools are for researchers to use as drop-in replacements for commands that produce outputs in Stata, Python, and R.

crosstab function does not behave correctly when passed a list of aggfuncs

jim-smith opened this issue

e.g. passing aggfunc=['mean','std'] works in pd.crosstab but not in acro.crosstab.
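For reference, a minimal sketch of the pandas side of this, using a made-up DataFrame (the column names `region`, `sex` and `income` are purely illustrative and are not taken from acro or its test data). It only shows how plain pd.crosstab lays out its result when aggfunc is a list; it does not call acro at all.

```python
import pandas as pd

# Illustrative data only: two observations in every (region, sex) cell,
# so that both the mean and the std are defined everywhere.
df = pd.DataFrame(
    {
        "region": ["north"] * 4 + ["south"] * 4,
        "sex": ["f", "f", "m", "m", "f", "f", "m", "m"],
        "income": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    }
)

# One aggregation function: one column per category of `sex`.
single = pd.crosstab(df["region"], df["sex"], values=df["income"], aggfunc="mean")

# A list of aggregation functions: pandas adds an extra column level and
# repeats the `sex` columns once per function.
aggfuncs = ["mean", "std"]
multi = pd.crosstab(df["region"], df["sex"], values=df["income"], aggfunc=aggfuncs)

print(single.shape)
print(multi.shape)
print(multi.columns.tolist())
```

The second table has len(aggfuncs) times as many columns as the first, which is the shape that anything built for the single-aggfunc case then has to cope with.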

However, the functionality to fix it is already there; it just needs:

  1. the line of code changed from get_aggfunc to get_aggfuncs (acro.py line 164)
  2. Because pandas doesn't write all the aggregates into one cell but produces separate columns for each of them, it produces a table with len(aggfuncs) times the normal number of columns.
    So, if aggfuncs is a list, then after you make the change above it throws an error when trying to apply a mask, because the masks assume that the table has only one agg func. I think the answer is to allow 'freq' as an aggregation function and make the masks bigger that way.
    I.e. if a user asks for mean and std, then when the table values are created for masking, use ['freq', X] (for, I think, any valid statistic X) and only look at the first table.shape[1]/2 columns.
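A rough sketch of the idea in point 2, written against plain pandas rather than acro's internals. Everything here is an assumption made for illustration: pandas' built-in 'count' stands in for the proposed 'freq', THRESHOLD is a hypothetical minimum cell count, and the data and column names are made up. It is not how acro builds or applies its masks.

```python
import numpy as np
import pandas as pd

THRESHOLD = 3  # hypothetical minimum number of observations per cell

# Illustrative data: cell counts are north/f=2, north/m=3, south/f=3, south/m=4.
df = pd.DataFrame(
    {
        "region": ["north"] * 5 + ["south"] * 7,
        "sex": ["f", "f", "m", "m", "m", "f", "f", "f", "m", "m", "m", "m"],
        "income": [float(v) for v in range(10, 130, 10)],
    }
)

aggfuncs = ["mean", "std"]
table = pd.crosstab(df["region"], df["sex"], values=df["income"], aggfunc=aggfuncs)

# Build the masking table from ['count', X] and keep only the first half of
# its columns, i.e. the per-cell counts ('count' stands in for 'freq').
freq_and_stat = pd.crosstab(
    df["region"], df["sex"], values=df["income"], aggfunc=["count", aggfuncs[0]]
)
n_cells = freq_and_stat.shape[1] // 2
counts = freq_and_stat.iloc[:, :n_cells]

# One boolean per cell, then repeated side by side once per aggregation
# function so that the mask has the same shape as the wide table.
cell_mask = (counts < THRESHOLD).to_numpy()
wide_mask = np.tile(cell_mask, (1, len(aggfuncs)))

# Suppress every statistic that belongs to a low-count cell.
masked = table.mask(wide_mask)
print(masked)
```

With this data only the north/f cell falls below the threshold, so both its mean and its std are blanked out while every other entry is left alone.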