ritviksahajpal / food-system-digital-twin

Work Related to a proof of concept Food System Digital Twin

Food Twin

Work related to a proof-of-concept Food System Digital Twin. Connected to the Plotline.

Production and Consumption Data

The input data used in the Production Consumption Normalization notebook is detailed below.

| Data Name | Path | Description | Source |
| --- | --- | --- | --- |
| cdl_codes | input-data/cdl-codes.csv | Lists every code used in the CropScape data, with the class name and color alongside each code. | https://www.nass.usda.gov/Research_and_Science/Cropland/docs/CDL_codes_names_colors.xls |
| county_crops | input-data/county-crops-conus-all.csv | A table of all US counties joined with zonal histograms of CropScape data. The result is a dataframe of all counties with the number of pixels of each cdl_code in each county. | Zonal histogram |
| state_fp_codes | input-data/state-fp-codes.csv | List of FIPS codes by state. Used to help with geographic data joins. | https://www.bls.gov/respondents/mwr/electronic-data-interchange/appendix-d-usps-state-abbreviations-and-fips-codes.htm |
| scd_calories | input-data/stability_crop_diversity.csv | Data generated in the study Divergent impacts of crop diversity on caloric and economic yield stability (DOI: 10.1088/1748-9326/aca2be). We use its Clean_Data.csv, which lists the calories of crops produced by state. This is what is eventually used to convert pixels of production to calories of production. | https://doi.org/10.5281/zenodo.7332106 |
| cdl_scd_crosswalk | input-data/crosswalk-cdl<>stability-crop-diversity.csv | Crosswalks cdl_codes to the crop names in the stability crop diversity study data. | Done by hand |
| income_consumption | input-data/income-consumption.csv | Data compiled by the USDA in its report U.S. Food Commodity Consumption Broken Down by Demographics, 1994-2008, converted from PDF charts to a CSV file. This is the subset covering consumption by income bracket. 185 percent of the Federal Poverty Line is the cutoff between low and high income. | https://www.ers.usda.gov/publications/pub-details/?pubid=45530 |
| lafa_calories | input-data/food-availability-2007-2017.csv | Derived from the USDA Loss-Adjusted Food Availability data set, in calories. Each food type tracked in the income_consumption data was extracted from this data and grouped into its food type. For instance, Apple juice, Apples, and Apples dried were all combined by calories consumed. This data set provides a ratio that adjusts consumption data from 2007 to 2017. | https://www.ers.usda.gov/data-products/food-availability-per-capita-data-system/food-availability-per-capita-data-system/#Loss-Adjusted%20Food%20Availability |
| food_exports | input-data/food-exports-2017.csv | Food export data retrieved from the International Trade Data API run by the US Census Bureau. More on this API can be found in Guide_to_International_Trade_Datasets.pdf. | https://api.census.gov/data/timeseries/intltrade/exports/porths?get=PORT,CTY_CODE,E_COMMODITY,E_COMMODITY_SDESC,AIR_VAL_MO,AIR_WGT_MO,CNT_WGT_MO,CNT_VAL_MO&YEAR=2017&SUMMARY_LVL=DET&COMM_LVL=HS6&E_COMMODITY= |
| food_imports | input-data/food-imports-2017.csv | Food import data retrieved from the International Trade Data API run by the US Census Bureau. More on this API can be found in Guide_to_International_Trade_Datasets.pdf. | https://api.census.gov/data/timeseries/intltrade/imports/porths?get=PORT,CTY_CODE,I_COMMODITY,I_COMMODITY_SDESC,GEN_VAL_MO,AIR_VAL_MO,AIR_WGT_MO,CNT_WGT_MO,CNT_VAL_MO&YEAR=2017&SUMMARY_LVL=DET&COMM_LVL=HS6&I_COMMODITY= |
| imports_crosswalk | input-data/import-commodities.csv | Crosswalks all import commodity names with the crop names used in the tool. The Calorie values for the import food types were derived from the USDA tool FoodData Central. | https://fdc.nal.usda.gov/index.html |
| population_income | input-data/input-data/ACSST5Y2021.S1701-Data.csv | Population data retrieved from the Census Bureau. The report used was POVERTY STATUS IN THE PAST 12 MONTHS, ACS 5-Year Estimates Subject Tables for 2021. In our consumption data, below 185% of the poverty line is low income and above 185% is high income. | https://data.census.gov/table?q=POVERTY+STATUS+IN+THE+PAST+12+MONTHS&g=010XX00US$0500000&tid=ACSST5Y2021.S1701 |
| county_boundaries | input-data/geo-boundaries-2022/cb_2022_us_county_500k.shp | Geographic boundaries of counties in 2022. Population data was joined to these boundaries. | https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html |
| port_codes | input-data/port-codes.csv | A table mapping 4-digit port codes to the cities where the ports are located. | https://www.census.gov/foreign-trade/schedules/d/dist.txt |
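
The inputs above describe county pixel counts (county_crops), a hand-made crosswalk (cdl_scd_crosswalk), and state-level crop calories (scd_calories) that are used to convert pixels of production to calories of production. The notebook's exact method is not spelled out here, so the following is only a minimal sketch of one plausible approach: splitting each state's calories across its counties in proportion to pixel share. The function name, the tuple layout, and all numbers are hypothetical, and the real notebook most likely works on dataframes rather than plain dictionaries.

```python
from collections import defaultdict

# Hypothetical miniature inputs. Field layout and numbers are illustrative
# assumptions, not the actual contents of the repository's CSV files.
county_pixels = [
    # (state, county, cdl_code, pixel_count)
    ("IA", "Adair", 1, 300),
    ("IA", "Boone", 1, 100),
    ("NE", "Cass", 1, 200),
]
cdl_to_crop = {1: "corn"}  # stands in for cdl_scd_crosswalk
state_calories = {  # stands in for scd_calories
    ("IA", "corn"): 8000.0,
    ("NE", "corn"): 5000.0,
}


def pixels_to_calories(county_pixels, cdl_to_crop, state_calories):
    """Split each state's crop calories across counties by pixel share.

    This proportional allocation is an assumed method, not necessarily
    the one used in the notebook.
    """
    # Sum pixels per (state, county, crop); several CDL codes may map to one crop.
    per_county = defaultdict(int)
    for state, county, code, pixels in county_pixels:
        crop = cdl_to_crop.get(code)
        if crop is not None:  # skip CDL classes with no crosswalk entry
            per_county[(state, county, crop)] += pixels

    # Total pixels of each crop within each state.
    per_state = defaultdict(int)
    for (state, _county, crop), pixels in per_county.items():
        per_state[(state, crop)] += pixels

    # County calories = state calories * county's share of the state's pixels.
    result = {}
    for (state, county, crop), pixels in per_county.items():
        total = per_state[(state, crop)]
        calories = state_calories.get((state, crop), 0.0)
        result[(state, county, crop)] = calories * pixels / total if total else 0.0
    return result


county_calories = pixels_to_calories(county_pixels, cdl_to_crop, state_calories)
```

With these made-up inputs, Adair holds three quarters of Iowa's corn pixels and so receives three quarters of Iowa's corn calories. A useful sanity check on any allocation of this kind is that the county values sum back to the state total.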

Output Data

This project produced several datasets, some of which are used in the final application and others that we've exported for use in other projects. Below is a description of each of the output datasets.

| Data Name | Description |
| --- | --- |
| county-population-consumption-production-scaled.csv | Predicted production and consumption of all tracked crops at the county level. Production is normalized to the consumption levels. County polygons are also joined to this data. |
| county-population-consumption-production-scaled.geojson | Predicted production and consumption of all tracked crops at the county level. Production is normalized to the consumption levels. County polygons are also joined to this data. |
| crop-key.csv | Lookup table for the crops in the study, with the most relevant CropScape crop identifier and the category each crop is grouped into in the final application. |
| food-imports-ports.csv | Calorie counts of food imported through each port of entry. Includes identifying information about the port and its geocoordinates. |
| food-imports-ports.geojson | Calorie counts of food imported through each port of entry. Includes identifying information about the port and its geocoordinates. |
| full-production-2017.csv | Total food produced, grown and imported, at the county level as calculated by our model. This is before the production data is scaled to consumption. |
| population-data.csv | Demographic data on income level, age, and sex at the county level. Does not include geographic polygons of counties. |
| population-demographics-county.geojson | Demographic data on income level, age, and sex at the county level, joined with the polygons of those counties. |
| production-consumption-max-min.csv | The county names and Calorie counts for maximum and minimum production across all counties. For minimum values, the county listed is the first one, alphabetically, that does not produce any calories of that crop. |
| us-port-locations.geojson | Geocoordinates of all US ports of entry, along with the county where each is located and its port code. The geocoordinates were derived from geocoding addresses. |
| unscaled-production-without-imports.csv | Total food produced at the county level as calculated by our model. This does not include imports and is before production is scaled to consumption. |
| production-scale-ratios.json | A JSON file of the ratio of produced crops to consumed crops. Production numbers are divided by this ratio to get the scaled production values. |
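
As a rough illustration of how the ratios in production-scale-ratios.json might be applied, the sketch below divides each crop's production by its produced/consumed ratio. The JSON layout (a flat mapping from crop name to ratio), the crop names, the numbers, and the `scale_production` helper are all assumptions made for this example. The actual file structure may differ.

```python
import json

# Hypothetical example values; the real file's keys and numbers may differ.
ratios_json = '{"corn": 4.0, "apples": 0.5}'
production = {"corn": 1200.0, "apples": 50.0}  # made-up calorie totals


def scale_production(production, ratios):
    """Divide each crop's production by its produced/consumed ratio."""
    scaled = {}
    for crop, value in production.items():
        ratio = ratios.get(crop)
        # Assumed policy: leave a crop unscaled if its ratio is missing or zero.
        scaled[crop] = value / ratio if ratio else value
    return scaled


ratios = json.loads(ratios_json)
scaled = scale_production(production, ratios)
```

In this toy case corn is overproduced relative to consumption (ratio 4.0), so its value is scaled down, while apples (ratio 0.5) are scaled up.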
