In this scenario, we will review crime data in Nashville to determine the best areas in which to market our expensive security system.
I will describe the following:
- Gathering the Data
- Creating the Report
- Data Insights
You can review the dashboard here.
-
I wanted to gather all the data in code so it's reusable for anyone who wants to collaborate.
-
In this script, you can see that I performed an extract, load, and transform (ELT) of the Nashville crime data.
-
This data was then loaded into BigQuery as a temporary table. I then used BigQuery SQL to make a few data structure modifications.
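To illustrate the load-then-transform pattern described above, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for BigQuery. It is not the script from this project: the column names, the sample rows, and the specific transformations are all assumptions made for the example.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from the source (all strings).
RAW_INCIDENTS = [
    ("2020-03-14 22:10:00", "BURGLARY", "36.1627", "-86.7816"),
    ("2020-07-01 09:30:00", "theft", "36.1745", "-86.7679"),
    ("2020-11-20 18:45:00", "Vandalism", "", ""),
]


def load_and_transform(rows):
    """Load raw rows untouched, then reshape them with SQL (the 'T' in ELT)."""
    conn = sqlite3.connect(":memory:")
    # Load: a temporary staging table that keeps every column as text.
    conn.execute(
        "CREATE TEMP TABLE raw_incidents "
        "(occurred_at TEXT, offense TEXT, latitude TEXT, longitude TEXT)"
    )
    conn.executemany("INSERT INTO raw_incidents VALUES (?, ?, ?, ?)", rows)
    # Transform: cast types, normalize case, turn blank coordinates into NULL.
    conn.execute(
        """
        CREATE TABLE incidents AS
        SELECT
            date(occurred_at) AS occurred_on,
            upper(offense) AS offense,
            CAST(NULLIF(latitude, '') AS REAL) AS latitude,
            CAST(NULLIF(longitude, '') AS REAL) AS longitude
        FROM raw_incidents
        """
    )
    return conn


conn = load_and_transform(RAW_INCIDENTS)
for row in conn.execute("SELECT * FROM incidents ORDER BY occurred_on"):
    print(row)
```

The point of the pattern is that the raw data lands in the warehouse first, and all reshaping happens afterwards in SQL, so the transformations stay visible and repeatable.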
-
I then joined this data to the US Census tract data to determine the geographic polygon (census tract) that each record falls in.
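The idea behind that join can be sketched in plain Python with a ray-casting point-in-polygon test. This is only an illustration of the concept, not the BigQuery join used in the project; the tract names and square polygons below are made up, and real tract boundaries are far more complex.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_tract(lon, lat, tracts):
    """Return the id of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


# Hypothetical tracts: two adjacent squares, not real census boundaries.
TRACTS = {
    "tract_a": [(-86.80, 36.15), (-86.78, 36.15), (-86.78, 36.17), (-86.80, 36.17)],
    "tract_b": [(-86.78, 36.15), (-86.76, 36.15), (-86.76, 36.17), (-86.78, 36.17)],
}

print(assign_tract(-86.79, 36.16, TRACTS))
print(assign_tract(-86.77, 36.16, TRACTS))
print(assign_tract(-86.70, 36.16, TRACTS))
```

Each crime record with coordinates is tagged with the tract that contains it, which is what lets crime counts be aggregated per tract later on.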
-
For the report, I joined the fact table I created to census data to look at both crime statistics and income levels per census tract in Nashville.
-
To gain optimal insights, I created a metric that shows the delta of weighted median income by crime. This will be the key metric for determining where crime has the largest impact on communities.
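The exact formula for this metric is not spelled out here, so the sketch below shows only one possible reading: compute the median tract income weighted by population, compute it again weighted by crime count, and take the difference. The function name, the sample figures, and the choice of weights are all assumptions for illustration.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("need at least one positive weight")
    half = sum(w for _, w in pairs) / 2
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value


# Hypothetical per-tract figures, not taken from the real dataset.
INCOMES = [40_000, 60_000, 90_000, 120_000]
POPULATION = [3_000, 4_000, 2_000, 1_000]
CRIMES = [600, 200, 150, 50]

by_population = weighted_median(INCOMES, POPULATION)
by_crime = weighted_median(INCOMES, CRIMES)
delta = by_crime - by_population

print(by_population, by_crime, delta)
```

Under this reading, a negative delta means crime is concentrated in lower-income tracts relative to where people live, and a positive delta means the opposite.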
-
I created a dashboard in Google Data Studio that a user can use to determine the best location for a marketing campaign. It includes a heatmap of Nashville that shows where the weighted median income delta is highest.
Suppose this high-end security company knows that its typical customer makes over $85k a year. We can use the dashboard to assess which region makes the most sense for our marketing dollars.
-
If we filter the dashboard down to tracts with an income over $85k, we can then rank them by the occurrence of crime in the area.
-
We are looking for areas that have a higher rate of crime, where there may be an appetite for a security system, and whose residents are also able to afford one.
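The same filter-then-rank logic can be sketched outside the dashboard. The tract names and numbers below are placeholders rather than real results, and ranking by raw crime count (instead of, say, crimes per resident) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    name: str
    median_income: int
    population: int
    crime_count: int


def rank_candidates(tracts, min_income=85_000):
    """Keep tracts above the income threshold, most crime first."""
    affluent = [t for t in tracts if t.median_income > min_income]
    return sorted(affluent, key=lambda t: t.crime_count, reverse=True)


# Placeholder data for illustration only.
SAMPLE_TRACTS = [
    Tract("Tract A", 92_000, 4_100, 310),
    Tract("Tract B", 61_000, 5_300, 540),
    Tract("Tract C", 130_000, 7_800, 190),
    Tract("Tract D", 88_000, 2_900, 260),
]

for rank, tract in enumerate(rank_candidates(SAMPLE_TRACTS), start=1):
    print(rank, tract.name, tract.median_income, tract.crime_count)
```

Keeping population and income on each row makes it easy to look past the raw ranking, for example to notice a lower-ranked tract that is larger or wealthier.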
-
After reviewing the data, it seems Census Tract 134 would be a great candidate to target, as it fits our criteria. Another interesting candidate would be Census Tract 180: although it is ranked 4th, its population is considerably large and its income is significantly higher.
- It's important to note that I only used data from 2020, and only records that include geospatial information.
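As a rough sketch of that scoping rule, the filter below keeps a record only if it falls in 2020 and has both coordinates. The field names (`occurred_on`, `latitude`, `longitude`) and the sample records are hypothetical.

```python
from datetime import date


def in_scope(record, year=2020):
    """True if the record is from the given year and has both coordinates."""
    has_location = (
        record.get("latitude") is not None
        and record.get("longitude") is not None
    )
    return record["occurred_on"].year == year and has_location


# Hypothetical records for illustration.
RECORDS = [
    {"occurred_on": date(2020, 5, 2), "latitude": 36.16, "longitude": -86.78},
    {"occurred_on": date(2019, 12, 31), "latitude": 36.17, "longitude": -86.77},
    {"occurred_on": date(2020, 8, 9), "latitude": None, "longitude": None},
]

kept = [r for r in RECORDS if in_scope(r)]
print(len(kept), "of", len(RECORDS), "records kept")
```

Any record dropped by this rule never reaches the tract join, so crime counts for tracts with poorly geocoded incidents may be understated.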