NewYorkCityCouncil / councilcount

Census data estimated for Council Districts

Home Page: https://datateam.council.nyc.gov/councilcount/

Overview

The councilcount package provides easy access to population estimates for around 70 demographic groups across several NYC geographic boundaries. The data comes from the 2017-2021 5-Year American Community Survey (ACS). For geographic boundaries that the ACS does not cover, such as council districts, estimates were generated as described under Methodology.

Installation

You can install the released version of councilcount from GitHub:

remotes::install_github("newyorkcitycouncil/councilcount")

Load Package

library(tidyverse)
# load last
library(councilcount)

Vignette

For demos of the functions included in councilcount, please visit vignettes/councilverse.Rmd.

Quick Start

First load the councilcount package as above.

Functions

R

councilcount includes 3 functions:

  • get_bbl_estimates() – Generates a dataframe that provides population estimates at the point level (there are also columns for various other geographies, like council district)
  • get_census_variables() – Provides information on all of the ACS demographic variables that can be accessed using get_geo_estimates() via variable codes
  • get_geo_estimates() – Creates a dataframe that provides population estimates for selected demographic variables along chosen geographic boundaries (e.g. council district, borough, etc.)

Simply run get_bbl_estimates() and get_census_variables() to access the desired dataframes. They do not require any inputs.

get_geo_estimates() has 3 parameters:

  • geo – The desired geographic region. Please select from the following list:
    • Council District: “councildist”
    • Community District: “communitydist”
    • School District: “schooldist”
    • Police Precinct: “policeprct”
    • Neighborhood Tabulation Area: “nta”
    • Borough: “borough”
  • var_codes – The desired demographic group(s), as represented by the ACS variable code. To access the list of available demographic variables and their codes, please run get_census_variables()
  • boundary_year – If “councildist” is selected, the boundary year must be specified as 2013 or 2023. The default is 2013.

Here is an example, in which codes for “Female” and “Adults with Bachelor’s degree or higher” are used. The data is requested along 2023 Council District boundaries.

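# ACS variable codes: "Female" and "Adults with Bachelor's degree or higher"
vars <- c('DP05_0003PE', 'DP02_0068E')
# Request estimates along the 2023 Council District boundaries
get_geo_estimates(geo = "councildist", var_codes = vars, boundary_year = "2023")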

Python

The equivalent functions are also available in Python. To access them, use the following code (Note: you must have the councilcount package downloaded on your computer):

import sys
# Set my_path to the absolute path of the folder that contains the downloaded
# councilcount directory (example: '/Users/jsmith/Desktop')
my_path = 'INSERT PATH'
sys.path.insert(0, my_path + "/councilcount/inst/python/")
from retrieve_estimates import get_bbl_estimates, get_census_variables, get_geo_estimates

get_bbl_estimates() and get_census_variables() work the same way in both R and Python. get_geo_estimates(), however, differs in Python. Instead of separate parameters for geo and boundary year, Council Districts have two geo input options: “councildist13” (2013 boundaries) and “councildist23” (2023 boundaries). Data for New York City as a whole is also available by passing “nyc” for geo. Otherwise, the geo input options are the same. There are also two additional parameters, polygons and download, which both default to False. If polygons is set to True, the dataframe includes a column with the geometry of each geographic region. If download is set to True, the dataframe is automatically downloaded as a CSV when the function runs.
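The difference in how Council District boundaries are selected can be sketched with a small helper. Note that python_geo is a hypothetical name and is not part of councilcount; it only builds the geo string described above from an R-style geo and boundary year, and it assumes the geo options are exactly those listed in this README.

```python
# Hypothetical helper (NOT part of councilcount): translate the R-style
# (geo, boundary_year) pair into the single geo string that the Python
# version of get_geo_estimates() is described as accepting.
PLAIN_GEOS = {"communitydist", "schooldist", "policeprct", "nta", "borough", "nyc"}


def python_geo(geo, boundary_year=2013):
    """Return the Python-side geo string for an R-style geo and boundary year."""
    if geo == "councildist":
        year = int(boundary_year)  # accept 2023 or "2023"
        if year not in (2013, 2023):
            raise ValueError("boundary_year must be 2013 or 2023")
        # Keep only the last two digits of the year: 2013 -> 13, 2023 -> 23
        return f"councildist{year % 100}"
    if geo in PLAIN_GEOS:
        # All other geographies use the same string in R and Python
        return geo
    raise ValueError(f"unknown geo: {geo!r}")
```

The returned string would then be passed as the geo input of get_geo_estimates(), for example python_geo("councildist", 2023) in place of the R call with geo = "councildist" and boundary_year = "2023".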

Data Sources

Methodology

Estimates for around 70 ACS demographic variables were generated for the dashboard. Estimates are available at Council District, Community District, School District, Police Precinct, Neighborhood Tabulation Area, Borough, and New York City levels. CouncilCount utilizes the 5-Year ACS, meaning the data points presented on the dashboard represent 5-year averages for the listed demographic variables. Using the multiyear estimates increases the statistical reliability of the data, especially for small population subgroups and regions with low populations.

These estimates were generated using the 2007-2011, 2012-2016, and 2017-2021 ACS 5-Year Estimates Data Profiles datasets, which provide demographic estimates by census tract. Estimates for some geographies, like neighborhood tabulation areas, which are built from census tracts, may be generated by directly aggregating census-tract-level data. However, this method does not work for geographies that have no relation to census tracts, like council districts and police precincts. In order to generate estimates for such geographies, ACS demographic data was synthesized with building data from the 2011, 2016, and 2021 PLUTO datasets to approximate the distribution of subpopulations around the city for each time period. For more information on the method used to generate the demographic estimates presented on CouncilCount, please contact datainfo@council.nyc.gov.
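The two situations described above can be illustrated with a toy example. This README does not spell out how ACS and PLUTO data were combined, so the second function below is only an assumed approach: it spreads each tract's count across its buildings in proportion to residential units, then sums by council district. All names and numbers are made up for illustration and do not come from councilcount.

```python
from collections import defaultdict

# Made-up tract-level counts for a single demographic variable.
tract_counts = {"tract_A": 1000, "tract_B": 600}

# Case 1: direct aggregation. Tracts nest inside NTAs, so counts simply add up.
tract_to_nta = {"tract_A": "nta_1", "tract_B": "nta_1"}


def aggregate_by_nta(counts, lookup):
    """Sum tract counts into the NTA that each tract belongs to."""
    totals = defaultdict(float)
    for tract, count in counts.items():
        totals[lookup[tract]] += count
    return dict(totals)


# Case 2: building-weighted apportionment (an ASSUMED method, see lead-in).
# Each made-up building has a tract, a council district, and a unit count.
buildings = [
    {"tract": "tract_A", "district": "cd_1", "units": 30},
    {"tract": "tract_A", "district": "cd_2", "units": 10},
    {"tract": "tract_B", "district": "cd_2", "units": 20},
]


def apportion_by_district(counts, building_rows):
    """Split each tract's count across buildings by unit share, then sum by district."""
    units_per_tract = defaultdict(int)
    for b in building_rows:
        units_per_tract[b["tract"]] += b["units"]

    totals = defaultdict(float)
    for b in building_rows:
        tract_units = units_per_tract[b["tract"]]
        if tract_units == 0:
            continue  # no residential units in this tract, nothing to distribute
        share = b["units"] / tract_units
        totals[b["district"]] += counts[b["tract"]] * share
    return dict(totals)


nta_totals = aggregate_by_nta(tract_counts, tract_to_nta)
district_totals = apportion_by_district(tract_counts, buildings)
```

In this toy data, tract_A straddles two districts, so its count is split 30:10 between cd_1 and cd_2, while tract_B falls entirely within cd_2. The district totals still sum to the same citywide total as the tract counts.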

