py-pears was developed to consolidate Illinois Extension's reporting and data cleaning infrastructure for PEARS into a single Python package.
- schedule.py serves as the entry point for a job scheduler. Its operation and recommended usage are detailed in the Schedule section.
- The utils.py module compiles methods shared across multiple scripts to streamline report development and maintenance.
- A brief summary of each report is provided in the Reports section.
- Several modules are provided to facilitate automated testing of PEARS reports. See Testing below for more information.
The recommended way to install py-pears is with git. Once git is installed, clone the repository:
git clone https://github.com/jstadni2/py-pears
This package uses Poetry for dependency management, so follow Poetry's installation instructions. Once Poetry is installed, run the following command in the root directory of the package:
poetry install
A JSON file of organizational settings is required to use py-pears. Create a file named org_settings.json in the /py_pears directory.

Example org_settings.json:
{
"aws_profile": "your_profile_name",
"s3_organization": "org_prefix",
"admin_username": "your_username@domain.com",
"admin_password": "your_password",
"admin_send_from": "your_username@domain.com",
"staff_list": "/path/to/Staff_List.xlsx",
"pears_prev_year": "/path/to/annual_pears_exports/2022/",
"coalition_survey_exports": "/path/to/coalition_survey_exports/2022/"
}
This package's .gitignore file excludes org_settings.json from git commits. Follow the instructions below to obtain the necessary credentials.
An AWS named profile will need to be created for accessing automated PEARS exports from the organization's AWS S3 bucket.
- Contact PEARS support to set up an AWS S3 bucket to store automated PEARS exports.
- Obtain the key, secret, and organization's S3 prefix from PEARS support.
- Install AWS CLI.
- Use AWS CLI to create a named profile for the PEARS S3 credentials using the following command:
aws configure --profile your_profile_name
- Set the value of "aws_profile" in org_settings.json to the name of the profile.
- Set the value of "s3_organization" to the S3 prefix obtained from PEARS support.
Administrative credentials are required for email delivery of reports and PEARS user notifications.
- Set the "admin_username" and "admin_password" variables in org_settings.json to valid Office 365 credentials.
- The "admin_send_from" variable can optionally be set to a different address linked to "admin_username". Otherwise, assign the same value to both variables.
- The send_mail() function in utils.py is defined using Office 365 as the host. Change the host to the appropriate email service provider if necessary.
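For orientation, an Office 365 SMTP send could be sketched as below. This is not the actual send_mail() code from utils.py; the host and port are Microsoft's published SMTP settings, and build_message() is a hypothetical helper used here for illustration.

```python
import smtplib
from email.message import EmailMessage

# Hedged sketch only, not utils.send_mail() itself.
def build_message(send_from, recipients, subject, body):
    """Assemble a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = send_from
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_message(msg, username, password,
                 host="smtp.office365.com", port=587):
    """Deliver a message; Office 365 requires STARTTLS before login."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Swapping email providers amounts to changing the `host` (and possibly `port`) passed to the send function.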
The following file/directory paths are required to run some reports in py-pears.
"staff_list"
: The path to a workbook that compiles organizational staff.- See FY23_INEP_Staff_List.xlsx as an example.
- Reports dependent on
"staff_list"
may require additional alterations depending on the staff list format. - If your organization actively maintains its staff list internally in PEARS, the User_Export.xlsx workbook could be used in lieu of external staff lists.
"pears_prev_year"
: The path to a directory of the previous report year's PEARS exports for each module.- This may not be necessary if your organization does not intent to use the Partnerships Entry Report
"coalition_survey_exports"
: The path to a directory of PEARS Coalition Survey exports.- This may not be necessary if your organization does not intent to use the Coalition Survey Cleaning Report
The run dates, input and output directories, and email recipients for each report are set in schedule.py. Scheduled dates are compared to a timestamp before PEARS data is imported from AWS S3 and the report is run. To run the schedule, execute the following command within the package directory:
poetry run schedule
Trigger dates for your organization's job scheduler should mirror the run dates set in schedule.py.
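The date check described above can be sketched as follows. The report names and dates here are hypothetical, not the real schedule in schedule.py.

```python
import datetime as dt

# Illustrative sketch of the scheduling check: each report's run dates
# are compared to today's date, and only due reports are run.
def reports_due(schedule, today=None):
    """Return the names of reports scheduled for today."""
    today = today or dt.date.today()
    return [name for name, dates in schedule.items() if today in dates]

# Hypothetical schedule mapping report names to run dates.
example_schedule = {
    "monthly_data_cleaning": [dt.date(2023, 2, 9), dt.date(2023, 3, 9)],
    "sites_report": [dt.date(2023, 2, 9)],
}
```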
The Monthly Data Cleaning script flags records based on guidance provided to PEARS users by the Illinois SNAP-Ed implementing agency. Users are notified via email how to update their flagged records.
The Staff Report summarizes the PEARS activity of SNAP-Ed staff on a monthly basis. Separate reports are generated for each Illinois SNAP-Ed implementing agency, Illinois Extension and Chicago Partnership for Health Promotion (CPHP).
The Quarterly Program Evaluation Report generates metrics for Illinois Extension's quarterly SNAP-Ed evaluation report. Data from PEARS is used to calculate evaluation metrics specified by the SNAP-Ed Evaluation Framework and Illinois Department of Human Services (IDHS).
The Sites Report compiles the site records created in PEARS by Illinois Extension staff on a monthly basis. In order to prevent site duplication, select staff are authorized to manage requests for new site records. Other users are notified when they enter sites into PEARS without permission.
The Partnerships Entry Report generates Partnerships to enter for the current report year. Program Activity and Indirect Activity records are cross-referenced with existing Partnerships to create new Partnership or copy-forward records from the previous report year. Separate reports are generated for each Illinois SNAP-Ed implementing agency, Illinois Extension and CPHP.
The Coalition Survey Cleaning script flags Coalition records if a corresponding Coalition Survey is not submitted for the previous quarter. Users are notified via email how to submit a survey for the applicable Coalitions.
Since the schema of PEARS export workbooks changes periodically, py-pears includes several modules to enable automated testing of exports and report outputs.
The Test PEARS test suites determine whether the expected PEARS exports are present on AWS S3 using the pytest framework. The schema of the current export workbooks is also compared to the copies found in /tests/test_inputs.
Execute the following command from the root directory of the package to run test_pears.py:
poetry run pytest tests/test_pears.py
Alternatively, you can simply run all test suites via:
poetry run pytest
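In simplified form, the schema comparison amounts to diffing column names against the reference copies. The sketch below stubs out the Excel reading; compare_schema() and the column names are illustrative only, not the actual test_pears.py code.

```python
# Simplified sketch of the schema check: diff two lists of column
# names, reporting columns added to or missing from the current export.
def compare_schema(current_columns, reference_columns):
    """Return the columns added to or missing from the current export."""
    current, reference = set(current_columns), set(reference_columns)
    return {
        "added": sorted(current - reference),
        "missing": sorted(reference - current),
    }

def test_schema_unchanged():
    reference = ["program_id", "name", "site_id"]  # from /tests/test_inputs
    current = ["program_id", "name", "site_id"]    # from the S3 export
    diff = compare_schema(current, reference)
    assert not diff["added"] and not diff["missing"]
```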
The Generate Test Inputs script downloads PEARS exports from the current day's AWS S3 subdirectory to /tests/test_inputs. Identifying information for users, sites, and partnering organizations is replaced with data generated by the Faker Python package. A copy of the "staff_list" Excel workbook specified in org_settings.json is populated with fake users. Fields used for Illinois Extension's program evaluation are also replaced with random numeric values. Once schema changes in PEARS export workbooks are discovered and report scripts are updated accordingly, rerun generate_test_inputs.py and the subsequent modules and test suites.
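The anonymization step can be sketched as below, using only the stdlib random module in place of the Faker package the actual script relies on. The field names are hypothetical examples of identifying columns.

```python
import random

# Sketch of replacing identifying fields with fake values and
# evaluation fields with random numbers (Faker stands in for none of
# this; field names here are hypothetical).
def anonymize_records(records, seed=0):
    rng = random.Random(seed)  # seeded so test inputs are reproducible
    fakes = []
    for i, record in enumerate(records):
        fakes.append({
            **record,
            "user_name": f"user_{i}",                 # replace real name
            "email": f"user_{i}@example.com",         # replace real email
            "evaluation_score": rng.randint(0, 100),  # random numeric value
        })
    return fakes
```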
Execute the following command to run generate_test_inputs.py:
poetry run generate_test_inputs
The Generate Expected Outputs script runs the reports with data produced by generate_test_inputs.py. The resulting Excel workbooks are stored in /tests/expected_outputs for use in the Test Reports test suites.
Execute the following command to run generate_expected_outputs.py:
poetry run generate_expected_outputs
The Test Reports test suites compare report outputs with the Excel workbooks generated by generate_expected_outputs.py. Any report output alterations introduced during refactoring are detailed in diff Excel workbooks exported to /tests/actual_outputs.
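Conceptually, that comparison reduces to checking actual report rows against expected rows and collecting any mismatches. The stdlib sketch below is illustrative only; the real suites read the Excel workbooks and write mismatches to diff workbooks.

```python
# Minimal sketch of the actual-vs-expected comparison: collect each
# mismatched row, plus a marker if the row counts differ.
def diff_rows(actual, expected):
    """Return (row_index, actual_row, expected_row) for each mismatch."""
    diffs = [
        (i, a, e)
        for i, (a, e) in enumerate(zip(actual, expected))
        if a != e
    ]
    if len(actual) != len(expected):
        diffs.append(("row_count", len(actual), len(expected)))
    return diffs
```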
Execute the following command from the root directory of the package to run test_reports.py:
poetry run pytest tests/test_reports.py
This project was generated with wemake-python-package. The current template version is ffbf87a961dab34c346b27d0d8468fc90c215646.