cenpy-devs / cenpy

Explore and download data from Census APIs

Is it possible to get all places within an area (county or state) with ACS?

MaxGhenis opened this issue · comments

Given ACS only has county and tract keys, it seems like not, but checking here.

import cenpy
acs = cenpy.products.ACS(2018)
acs._layer_lookup.keys()

dict_keys(['county', 'tract'])

As expected, this produces an error (though I don't think it's the error message that's intended, given the error code):

acs.from_county('Ventura County, CA', level='place')

IndexError Traceback (most recent call last)
in
----> 1 acs.from_county('Ventura County, CA', level='place')

~/miniconda3/lib/python3.7/site-packages/cenpy/products.py in from_county(self, county, variables, level, **kwargs)
626 from_csa.__doc__ = _Product.from_place.__doc__.replace('place', 'CSA')
627 def from_county(self, county, variables=None, level='tract', **kwargs):
--> 628 return self._from_name(county, variables, level, 'Counties', **kwargs)
629 from_county.__doc__ = _Product\
630 .from_place.__doc__\

~/miniconda3/lib/python3.7/site-packages/cenpy/products.py in _from_name(self, place, variables, level, layername, return_geometry, cache_name, strict_within, return_bounds, geometry_precision)
586 'Try picking the state containing that level,'
587 ' and then selecting from that data after it is'
--> 588 ' fetched'.format(level))
589 if level == 'block':
590 raise ValueError('The American Community Survey is only administered'

IndexError: tuple index out of range

You're mixing geographies here: asking for data from a county, but then passing the level='place' argument. In Census parlance, "place" basically means city. If you remove that argument, i.e. call acs.from_county('Ventura County, CA') with the default level='tract', you get data for all tracts in Ventura County. Is that what you're after?

(Places don't nest within other FIPS geographies, i.e. it's possible for a place to cross county borders, which is why you're getting the error saying that query isn't supported for ACS.)

Reading closer, I guess you're trying to grab all the census designated places (CDPs) in Ventura? That's not supported yet, but the universe of CDPs isn't that big. You might just grab all the data for CA and filter it down.

Is it possible to get data at the place level, or is cenpy only by tract/block today? Even just a single row when calling from_place would be fine.

the simplest way, currently, would be to do a place query like

cenpy.products.ACS(2018).from_place(PLACE)

which will give you back the tract-level data for the place. Then you could do a dissolve, which would effectively give you place-level data. You'll have to recalculate percentages and whatnot, but I think that would get you what you're after?
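To make the "recalculate percentages" step concrete, here is a minimal sketch of the tabular half of a dissolve: sum the raw counts across tracts, then recompute the share from the summed counts. The records, field names, and numbers are made up for illustration and are not real ACS variables or estimates; with a GeoDataFrame, something like dissolve with a sum aggregation should do the same summing while also merging the geometries.

```python
# Hypothetical tract-level records; field names and numbers are invented
# for illustration and are not real ACS variables or estimates.
tracts = [
    {"tract": "A", "families": 1000, "families_under_10k": 50},
    {"tract": "B", "families": 3000, "families_under_10k": 60},
    {"tract": "C", "families": 500, "families_under_10k": 40},
]


def aggregate_counts(rows, count_fields):
    """Sum raw counts over all rows (what a dissolve does to count columns)."""
    return {field: sum(row[field] for row in rows) for field in count_fields}


totals = aggregate_counts(tracts, ["families", "families_under_10k"])

# Recompute the share from the summed counts...
pct_from_counts = 100 * totals["families_under_10k"] / totals["families"]

# ...rather than averaging the per-tract percentages, which ignores tract size.
naive_mean_pct = sum(
    100 * row["families_under_10k"] / row["families"] for row in tracts
) / len(tracts)

print(f"from summed counts: {pct_from_counts:.2f}%")
print(f"mean of tract percentages: {naive_mean_pct:.2f}%")
```

The two numbers differ because the small tract with a high share pulls the unweighted mean up, which is why percentage columns should be rebuilt from counts after aggregating.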

Don't tract sums often not add up to higher levels, due to Census-side data cleaning and privacy mechanisms?

I'm pretty sure that's not the case. The way the FIPS system encodes nesting, you should be able to aggregate up the hierarchy without issue.

Now, as I said before, places are a somewhat different beast because their boundaries don't always align with other census geographies. If you have tracts that cross the place boundaries, you'll obviously have a bit of error (and if you're using ACS data, those margins can be pretty wide). So in the case of places, it's possible that your tract-level aggregation wouldn't match the stats tabulated for the official place boundary, but you'd still get a pretty reasonable estimate.

Aggregating is possible, but you won't always get the same result as the census.

For example, I exported income data from data.census.gov for each Ventura County tract, as well as Ventura County overall. Here's the query and spreadsheet.

The counts all match, e.g. summing households and families by tract equals the total. But when recalculating percentages, you get issues. Summing up the number of families with <$10k income (by row, multiplying families by the percentage) produces an error of 0.1%, which looks like it could result from two tracts that don't have income distribution data. I imagine the difference could be bigger for other geos and/or fields. Medians also can't be reconstructed.
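A small sketch of why medians can't be rebuilt from tract-level medians: the median of the tract medians generally differs from the median of the pooled values, and the pooled values aren't published. The incomes below are made up for two hypothetical tracts and are not real ACS data.

```python
from statistics import median

# Made-up household incomes for two hypothetical tracts (not real ACS data).
tract_incomes = {
    "tract_1": [20_000, 30_000, 40_000],
    "tract_2": [50_000, 60_000, 70_000, 80_000, 90_000],
}

# What a tract-level table gives you: one median per tract.
tract_medians = {name: median(vals) for name, vals in tract_incomes.items()}

# Combining the published medians...
median_of_medians = median(tract_medians.values())

# ...versus the median over every household in the combined area.
pooled = [v for vals in tract_incomes.values() for v in vals]
pooled_median = median(pooled)

print(median_of_medians, pooled_median)
```

Weighting the tract medians by household count doesn't fix this in general, since a median depends on the full distribution inside each tract.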

Is this an API issue or just hasn't been implemented here yet? Looks like tidycensus can pull out other levels.

Should be possible, but I'm getting error: unknown/unsupported geography heirarchy if I try to use a place query like https://api.census.gov/data/2017/acs/acs5?get=NAME,group(B01001)&for=county:073&in=state:01+place:07000 (as shown in the API docs).

Everything works fine if I remove the place query. Any idea if I'm missing something obvious @walkerke?

Yeah, this would involve adding place to the _layer_lookup and checking what, if anything, needs to change in how the results from that are parsed. So, no, not currently supported in cenpy, but definitely should be do-able!

@knaaptime Census changed queries that cross boundaries like that by including the (or part) designation, like this:

https://api.census.gov/data/2017/acs/acs5?get=NAME,group(B01001)&for=county%20(or%20part):073&in=state:01+place:07000

That query gets you data for the part of Birmingham that is located within Jefferson County (but not in Shelby County).
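For anyone assembling these queries by hand, here is a minimal sketch using only the standard library. build_acs5_url is a hypothetical helper, not part of cenpy or any Census client, and it only builds the URL string without making a request. It writes spaces as +, which should be equivalent to %20 in a query string (the in= clause of the URL above already uses + this way).

```python
from urllib.parse import quote_plus, urlencode

BASE = "https://api.census.gov/data"


def build_acs5_url(year, get, for_clause, in_clause=None):
    """Build an ACS 5-year query URL (hypothetical helper, not part of cenpy)."""
    params = {"get": get, "for": for_clause}
    if in_clause:
        params["in"] = in_clause
    # Leave , ( ) : * unescaped so the query stays readable;
    # quote_plus turns spaces into '+'.
    query = urlencode(params, safe=",():*", quote_via=quote_plus)
    return f"{BASE}/{year}/acs/acs5?{query}"


# The "(or part)" query from this thread: the part of place 07000
# that lies within county 073 of state 01.
part_url = build_acs5_url(
    2017, "NAME,group(B01001)", "county (or part):073", "state:01 place:07000"
)

# Assumption: place-within-state is an available geography for this endpoint,
# so this would list every place in California (state FIPS 06).
ca_places_url = build_acs5_url(2018, "NAME,B01001_001E", "place:*", "state:06")

print(part_url)
print(ca_places_url)
```

The second URL is one way to follow the earlier suggestion of grabbing everything for CA and filtering it down, assuming the endpoint accepts the place:* wildcard within a state.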

That specific configuration isn't supported in tidycensus but I believe is do-able with censusapi.

ahh, sweet. thanks Kyle!