datadesk / census-data-aggregator

Combine U.S. census data responsibly

Provide a check that spatial aggregation doesn't induce spurious patterns

sastoudt opened this issue · comments

From this paper:

"one can induce geographic patterns in the aggregate data that do not exist in the input data"

Create a diagnostic to check for this (equations 2 and 3 in the paper):

"The statistic S_j measures whether the region-level estimates for a given variable are within the margins of error of their constituent tracts. If a region-level estimate is within the margin of error of all its constituent tracts, then there is no information lost through aggregation; information loss increases as the 90 percent confidence intervals of more and more tract-level estimates do not overlap with the region’s estimate."
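A rough sketch of what such a check could look like, going only by the description quoted above. This is not a transcription of equations 2 and 3 from the paper: it simply reports the share of constituent tracts whose 90 percent interval (estimate plus or minus the published margin of error) does not contain the region-level estimate. The function name `share_of_tracts_outside_moe` and the `(estimate, moe)` input layout are hypothetical and not part of this library. It also assumes the variable is comparable across scales (a rate, average or median, not a raw count).

```python
def share_of_tracts_outside_moe(region_estimate, tracts):
    """Return the share of tracts whose interval misses the region estimate.

    Hypothetical helper, not the paper's exact S_j statistic.

    region_estimate: the aggregated estimate for the region.
    tracts: list of (estimate, moe) pairs, one per constituent tract,
        where moe is the 90 percent margin of error.

    0.0 means the region estimate sits inside every tract's interval;
    values closer to 1.0 suggest more information lost in aggregation.
    """
    if not tracts:
        raise ValueError("A region needs at least one constituent tract")

    # A tract "misses" when the region estimate falls outside estimate +/- moe.
    misses = sum(
        1 for estimate, moe in tracts
        if abs(region_estimate - estimate) > moe
    )
    return misses / len(tracts)


# Example: three tracts described as (estimate, margin of error).
example_tracts = [(100, 10), (120, 15), (90, 5)]
example_share = share_of_tracts_outside_moe(105, example_tracts)
```

In the example, only the third tract's interval (85 to 95) excludes the region estimate of 105, so one of the three tracts counts as a miss. A region where many tracts miss would be worth flagging before publishing the aggregate.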

helper function here