A Tarbell project that publishes to a P2P HTML Story.
This is a refresh of the "band-aid" shootings page which used leaflet and ai2html. This reconception dynamically builds charts and HTML blobs use python script and NodeJS. Simply run npm run build
to refresh the app with new data and publish it.
These are the things that happen when npm run build
is called:
- Fresh data is downloaded and cleaned from NewsroomDB
- A couple of new data files are created for specific elements that require a different, or more condensed, data structure. This is done using NodeJS.
- Several HTML subtemplates are built and inserted into the subtemplates folder.
- Sass and JS are processed, minified and bundled.
The data fetching, cleaning and processing are all automated using python and node. Several HTML partials (Jinja subtemplates) also are build dynamically with node so that they may use the most current data and/or dates. To do all this stuff, simply run npm run build
. Each of the aforementioned steps has it's own npm script (in the package.json) and they are wired together in the proper order using the build
script.
Test locally then tarbell publish production
. For now, to safeguard against a NewsroomDB failure, we should commit and push the new raw-data.csv
file. Technically it's generated content but we need that file for the rest of the machine to operate.
NOTE: The data-fetching script is mostly about processing the NewsroomDB data. As such, it can pull the CSV automatically, or can ingest a local copy of the same CSV. See the section on data fetching for details.
All shootings and crimes are entered by the break news center to an online form on News room DB that includes details about each shooting/crime incident. The fields for each category (crime vs shootings) are filled differently. The current section describes information only about shootings data. For each shooting incident the break news team reports the following information: RD Number Date Day Time UCR Last Name First Name Sex Age DOB Home Address Home Specificity Shooting Location Geocode Override Shooting Specificity District Hospital 1 Hospital 2 Link Link 2 Link 3 Notes Computed time
The main issue with this data is that there is no documentation on what some columns represent (e.g., computed time ). In addition, the data entry for Age, DOB (date of birth), and First Name/Last Name is very messy. To be more specific, Age is sometimes entered as a string of characters (e.g.,twenty five) than numbers (e.g.,25). Some instances involve a combination of both (e.g., 20s or 20S to refer to twenties). This field should only be including numbers from now on to make future analyses much easier. Although we have the victim's age, but in most instances we don't have his/her date of birth. Regarding the victim's name, the process of filling missing data (in this case victims with no names) is inconsistent. Breaking News data entery team have sometimes used 'John Doe' as a fake name to report an incidient in which a victim's name is not known. This also need to be documented and reportes should come up with a unique name to fill the missing values.
This app uses two python scripts, in the dynamic
folder to grab, clean and process data from NewsroomDB automatically.
NOTE: If, for some reason, the script can't reach the NewsroomDB csv directly, you can download it manually and feed it to the fetcher scripts, which will clean it up. Details on this are later in this document.
There are a couple of python dependencies for the data-fetching scripts. They should be installed when you run npm install
on the whole project, but if they aren't, then run pip install -r dynamic/requirements.txt
which should take care of that.
The purpose of shootings_script_Dynamic.py is to provide a dynamic way of gathering, cleaning, and analyzing shootings data available on newsroomDB. Next, I'll describe how the script works.
The python script crawls NewsroomDB and gets the most up-to-date version of the shootings data that is available on the website. Next, it cleans the Date column so that the formating becomes more python friendly. Next, it calculates the cumulative sum of the number of shootings for each day in each month of ever year. The calcuation will be output in a csv file format and it's called number_of_shootings_up_to_DATE (e.g., number_of_shootings_up_to_26/07/2017).
The scripts can be run on their own, or as part of the npm run build
command.
The script is written in a way so that anyone who is running it will have the freedom to choose the path where the CSV file mentioned above needs to be saved to.
The following is an example of the python command:
python shootings_script_Dynamic.py path/to/desired/directory/file.csv
If the CSV is not accessible programmatically, the NewsroomDB file can be input manually by adding a second argument the script call.
python shootings_script_Dynamic.py path/to/desired/output/file.csv path/to/input/newsroomdb-file.csv
BOILERPLATE README FOLLOWS:
- Python 2.7
- Tarbell 1.0.*
- Node.js
- grunt-cli (See http://gruntjs.com/getting-started#installing-the-cli)
You should define the following keys in either the values
worksheet of the Tarbell spreadsheet or the DEFAULT_CONTEXT
setting in your tarbell_config.py
:
- p2p_slug
- headline
- seotitle
- seodescription
- keywords
- byline
Note that these will clobber any values set in P2P each time the project is republished.
This blueprint creates configuration to use Grunt to build front-end assets.
When you create a new Tarbell project using this blueprint with tarbell newproject
, you will be prompted about whether you want to use Sass to generate CSS and whether you want to use Browserify to bundle JavaScript from multiple files. Based on your input, the blueprint will generate a package.json
and Gruntfile.js
with the appropriate configuration.
After creating the project, run:
npm install
to install the build dependencies for our front-end assets.
When you run:
grunt
Grunt will compile sass/styles.scss
into css/styles.css
and bundle/minify js/src/app.js
into js/app.min.js
.
If you want to recompile as you develop, run:
grunt && grunt watch
This blueprint simply sets up the the build tools to generate styles.css
and js/app.min.js
, you'll have to explicitly update your templates to point to these generated files. The reason for this is to make you think about whether you're actually going to use an external CSS or JavaScript file and avoid a request for an empty file if you don't end up putting anything in your custom stylesheet or JavaScript file.
To add app.min.js
to your template file:
<script src="js/app.min.js"></script>