AI-SDC / ACRO

Tools for the Automatic Checking of Research Outputs. These are tools for researchers to use as drop-in replacements for commands that produce outputs in Stata, Python, and R.

acro.crosstab reveals cell content when margins=True

zizaola opened this issue

When the margins=True option is passed to crosstab, the content of an unsafe cell is revealed.
Code version: 0.4.5 (updated 2024-03-04, 2273a3a)

Tested with notebooks/acro_demo_v2.ipynb

acro.suppress = True

safe_table = acro.crosstab(df.recommend, df.parents)
print(safe_table)

INFO:acro:get_summary(): fail; threshold: 4 cells suppressed; 
INFO:acro:outcome_df:
----------------------------------------------------|
parents    |great_pret   |pretentious  |usual       |
recommend  |             |             |            |
----------------------------------------------------|
not_recom  |          ok |          ok |          ok|
priority   |          ok |          ok |          ok|
recommend  | threshold;  | threshold;  | threshold; |
spec_prior |          ok |          ok |          ok|
very_recom | threshold;  |          ok |          ok|
----------------------------------------------------|

INFO:acro:records:add(): output_1
parents     great_pret  pretentious   usual
recommend                                  
not_recom       1440.0       1440.0  1440.0
priority         858.0       1484.0  1924.0
recommend          NaN          NaN     NaN
spec_prior      2022.0       1264.0   758.0
very_recom         NaN        132.0   196.0

The recommend:recommend row should be completely unavailable to the researcher.
The cell recommend:very_recom / parents:great_pret should also be unavailable to researchers.

When margins=True is set, the value of the cell recommend:very_recom / parents:great_pret (0) is shown. This is the true cell value, which is suppressed when margins=False.
Note also that the fully suppressed recommend:recommend row is not shown at all.
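
The expected behaviour can be illustrated with plain pandas. This is only a minimal sketch and not ACRO's implementation: the helper name `suppressed_crosstab`, the threshold of 10, and the toy data are all assumptions made for illustration. It masks cells below the threshold first and computes the margins from the masked table afterwards, so a suppressed cell stays NaN whether or not margins are requested.

```python
import pandas as pd

THRESHOLD = 10  # assumed minimum safe cell count, for illustration only


def suppressed_crosstab(index, columns, threshold=THRESHOLD, margins=False):
    """Hypothetical helper: cross-tabulate, then hide cells below the threshold."""
    counts = pd.crosstab(index, columns)
    # Replace every unsafe cell (count < threshold) with NaN.
    safe = counts.astype(float).mask(counts < threshold)
    if margins:
        # Margins are built from the masked table; min_count=1 keeps the
        # margin of a fully suppressed row or column as NaN instead of 0.
        safe["All"] = safe.sum(axis=1, min_count=1)
        safe.loc["All"] = safe.sum(axis=0, min_count=1)
    return safe


# Toy data (not the nursery dataset): the a/y cell has only 3 records.
rows = ["a"] * 12 + ["a"] * 3 + ["b"] * 15 + ["b"] * 11
cols = ["x"] * 12 + ["y"] * 3 + ["x"] * 15 + ["y"] * 11
df = pd.DataFrame({"row": rows, "col": cols})

safe = suppressed_crosstab(df.row, df.col, margins=True)
print(safe)
```

In this sketch the a/y cell is NaN in the returned table, and the "All" entries are sums over the visible cells only.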

acro.suppress = True
safe_table = acro.crosstab(df.recommend, df.parents, margins=True)
print(safe_table)

INFO:acro:get_summary(): fail; threshold: 5 cells suppressed; 
INFO:acro:outcome_df:
------------------------------------------------------------------|
parents    |great_pret   |pretentious  |usual        |All         |
recommend  |             |             |             |            |
------------------------------------------------------------------|
not_recom  |          ok |          ok |          ok |          ok|
priority   |          ok |          ok |          ok |          ok|
recommend  | threshold;  | threshold;  | threshold;  | threshold; |
spec_prior |          ok |          ok |          ok |          ok|
very_recom | threshold;  |          ok |          ok |          ok|
All        |          ok |          ok |          ok |          ok|
------------------------------------------------------------------|

INFO:acro:records:add(): output_2
parents     great_pret  pretentious  usual    All
recommend                                        
not_recom         1440         1440   1440   4320
priority           858         1484   1924   4266
spec_prior        2022         1264    758   4044
very_recom           0          132    196    328
All               4320         4320   4318  12958

Note: The margins seem to be calculated correctly, i.e. from the visible cells only, without counting the suppressed cells.
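
The arithmetic behind this note can be checked directly. The sketch below copies the counts from the margins=True output above into a plain dictionary (the variable names are illustrative) and recomputes the row, column, and grand totals from the displayed cells alone.

```python
# Counts copied from the margins=True output (output_2) above.
visible = {
    "not_recom": {"great_pret": 1440, "pretentious": 1440, "usual": 1440},
    "priority": {"great_pret": 858, "pretentious": 1484, "usual": 1924},
    "spec_prior": {"great_pret": 2022, "pretentious": 1264, "usual": 758},
    "very_recom": {"great_pret": 0, "pretentious": 132, "usual": 196},
}
# "All" values as reported in the same output.
reported_rows = {"not_recom": 4320, "priority": 4266, "spec_prior": 4044, "very_recom": 328}
reported_cols = {"great_pret": 4320, "pretentious": 4320, "usual": 4318}
reported_grand_total = 12958

# Recompute every margin from the displayed cells only.
row_totals = {row: sum(cells.values()) for row, cells in visible.items()}
col_totals = {col: sum(visible[row][col] for row in visible) for col in reported_cols}
grand_total = sum(row_totals.values())

print("row margins match:   ", row_totals == reported_rows)
print("column margins match:", col_totals == reported_cols)
print("grand total matches: ", grand_total == reported_grand_total)
```

All three comparisons hold, so the reported margins are exactly the sums of the displayed rows; the dropped recommend row contributes nothing to them.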

@zizaola Thank you for bringing this issue to our attention. I'm currently working on resolving it.

Fixed in 0.4.6

acro.suppress = True

safe_table = acro.crosstab(df.recommend, df.parents, margins=True)
print(safe_table)

produces:
image