py-pears was developed to consolidate Illinois Extension's reporting and data cleaning infrastructure for PEARS into a single Python package.
- schedule.py serves as the entry point for a job scheduler. Its operation and recommended usage are detailed in the Schedule section.
- The utils.py module compiles methods shared across multiple scripts to streamline report development and maintenance.
- A brief summary of each report is provided in the Reports section.
- Several modules are provided to facilitate automated testing of PEARS reports. See Testing below for more information.
The recommended way to install py-pears is with git. Once git is installed, clone the repository:
git clone https://github.com/jstadni2/py-pears
This package uses Poetry for dependency management, so follow Poetry's installation instructions. Once Poetry is installed, run the following command in the root directory of the package:
poetry install
A JSON file of organizational settings is required to use py-pears. Create a file named org_settings.json in the /py_pears directory.

Example org_settings.json:
{
"aws_profile": "your_profile_name",
"s3_organization": "org_prefix",
"admin_username": "your_username@domain.com",
"admin_password": "your_password",
"admin_send_from": "your_username@domain.com",
"staff_list": "/path/to/Staff_List.xlsx",
"pears_prev_year": "/path/to/annual_pears_exports/2022/",
"coalition_survey_exports": "/path/to/coalition_survey_exports/2022/"
}
This package's .gitignore file excludes org_settings.json from git commits. Follow the instructions below to obtain the necessary credentials.
An AWS named profile will need to be created for accessing automated PEARS exports from the organization's AWS S3 bucket.
- Contact PEARS support to set up an AWS S3 bucket to store automated PEARS exports.
- Obtain the key, secret, and organization's S3 prefix from PEARS support.
- Install AWS CLI.
- Use AWS CLI to create a named profile for the PEARS S3 credentials using the following command:
aws configure --profile your_profile_name
- Set the value of "aws_profile" in org_settings.json to the name of the profile.
- Set the value of "s3_organization" to the S3 prefix obtained from PEARS support.
Administrative credentials are required for email delivery of reports and PEARS user notifications.
- Set the "admin_username" and "admin_password" variables in org_settings.json to valid Office 365 credentials.
- The "admin_send_from" variable can optionally be set to a different address linked to "admin_username". Otherwise, assign the same value to both variables.
- The send_mail() function in utils.py is defined using Office 365 as the host. Change the host to the appropriate email service provider if necessary.
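For orientation, an Office 365 SMTP send could be sketched as below. This is not the actual send_mail() code from utils.py; the host and port are Microsoft's published SMTP settings, and build_message() is a hypothetical helper used here for illustration.

```python
import smtplib
from email.message import EmailMessage

# Hedged sketch only, not utils.send_mail() itself.
def build_message(send_from, recipients, subject, body):
    """Assemble a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = send_from
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_message(msg, username, password,
                 host="smtp.office365.com", port=587):
    """Deliver a message; Office 365 requires STARTTLS before login."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Swapping email providers amounts to changing the `host` (and possibly `port`) passed to the send function.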
The following file/directory paths are required to run some reports in py-pears.
"staff_list"
: The path to a workbook that compiles organizational staff.- See FY23_INEP_Staff_List.xlsx as an example.
- Reports dependent on
"staff_list"
may require additional alterations depending on the staff list format. - If your organization actively maintains its staff list internally in PEARS, the User_Export.xlsx workbook could be used in lieu of external staff lists.
"pears_prev_year"
: The path to a directory of the previous report year's PEARS exports for each module.- This may not be necessary if your organization does not intent to use the Partnerships Entry Report
"coalition_survey_exports"
: The path to a directory of PEARS Coalition Survey exports.- This may not be necessary if your organization does not intent to use the Coalition Survey Cleaning Report
The run dates, input and output directories, and email recipients for each report are set in schedule.py. Scheduled dates are compared to a timestamp before PEARS data is imported from AWS S3 and the report is run. To run the schedule, execute the following command within the package directory:
poetry run schedule
Trigger dates for your organization's job scheduler should mirror the run dates set in schedule.py.
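The date check described above can be sketched as follows. The report names and dates here are hypothetical, not the real schedule in schedule.py.

```python
import datetime as dt

# Illustrative sketch of the scheduling check: each report's run dates
# are compared to today's date, and only due reports are run.
def reports_due(schedule, today=None):
    """Return the names of reports scheduled for today."""
    today = today or dt.date.today()
    return [name for name, dates in schedule.items() if today in dates]

# Hypothetical schedule mapping report names to run dates.
example_schedule = {
    "monthly_data_cleaning": [dt.date(2023, 2, 9), dt.date(2023, 3, 9)],
    "sites_report": [dt.date(2023, 2, 9)],
}
```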
The Monthly Data Cleaning script flags records based on guidance provided to PEARS users by the Illinois SNAP-Ed implementing agency. Users are notified via email how to update their flagged records.
The Staff Report summarizes the PEARS activity of SNAP-Ed staff on a monthly basis. Separate reports are generated for each Illinois SNAP-Ed implementing agency, Illinois Extension and Chicago Partnership for Health Promotion (CPHP).
The Quarterly Program Evaluation Report generates metrics for Illinois Extension's quarterly SNAP-Ed evaluation report. Data from PEARS is used to calculate evaluation metrics specified by the SNAP-Ed Evaluation Framework and Illinois Department of Human Services (IDHS).
The Sites Report compiles the site records created in PEARS by Illinois Extension staff on a monthly basis. In order to prevent site duplication, select staff are authorized to manage requests for new site records. Other users are notified when they enter sites into PEARS without permission.
The Partnerships Entry Report generates Partnerships to enter for the current report year. Program Activity and Indirect Activity records are cross-referenced with existing Partnerships to create new Partnership or copy-forward records from the previous report year. Separate reports are generated for each Illinois SNAP-Ed implementing agency, Illinois Extension and CPHP.
The Coalition Survey Cleaning script flags Coalition records if a corresponding Coalition Survey is not submitted for the previous quarter. Users are notified via email how to submit a survey for the applicable Coalitions.
Since the schema of PEARS export workbooks changes periodically, py-pears includes several modules to enable automated testing of exports and report outputs.
The Test PEARS test suites determine whether the expected PEARS exports are present on AWS S3 using the pytest framework. The schema of the current export workbooks is also compared to the copies found in /tests/test_inputs.
Execute the following command from the root directory of the package to run test_pears.py:
poetry run pytest tests/test_pears.py
Alternatively, you can simply run all test suites via:
poetry run pytest
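In simplified form, the schema comparison amounts to diffing column names against the reference copies. The sketch below stubs out the Excel reading; compare_schema() and the column names are illustrative only, not the actual test_pears.py code.

```python
# Simplified sketch of the schema check: diff two lists of column
# names, reporting columns added to or missing from the current export.
def compare_schema(current_columns, reference_columns):
    """Return the columns added to or missing from the current export."""
    current, reference = set(current_columns), set(reference_columns)
    return {
        "added": sorted(current - reference),
        "missing": sorted(reference - current),
    }

def test_schema_unchanged():
    reference = ["program_id", "name", "site_id"]  # from /tests/test_inputs
    current = ["program_id", "name", "site_id"]    # from the S3 export
    diff = compare_schema(current, reference)
    assert not diff["added"] and not diff["missing"]
```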
The Generate Test Inputs script downloads PEARS exports from the current day's AWS S3 subdirectory to /tests/test_inputs. Identifying information for users, sites, and partnering organizations is replaced with data generated by the Faker Python package. A copy of the "staff_list" Excel workbook specified in org_settings.json is populated with fake users. Fields used for Illinois Extension's program evaluation are also replaced with random numeric values. Once schema changes in PEARS export workbooks are discovered and report scripts are updated accordingly, rerun generate_test_inputs.py and the subsequent modules and test suites.
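The anonymization step can be sketched as below, using only the stdlib random module in place of the Faker package the actual script relies on. The field names are hypothetical examples of identifying columns.

```python
import random

# Sketch of replacing identifying fields with fake values and
# evaluation fields with random numbers (Faker stands in for none of
# this; field names here are hypothetical).
def anonymize_records(records, seed=0):
    rng = random.Random(seed)  # seeded so test inputs are reproducible
    fakes = []
    for i, record in enumerate(records):
        fakes.append({
            **record,
            "user_name": f"user_{i}",                 # replace real name
            "email": f"user_{i}@example.com",         # replace real email
            "evaluation_score": rng.randint(0, 100),  # random numeric value
        })
    return fakes
```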
Execute the following command to run generate_test_inputs.py:
poetry run generate_test_inputs
The Generate Expected Outputs script runs the reports with data produced by generate_test_inputs.py. The resulting Excel workbooks are stored in /tests/expected_outputs for use in the Test Reports test suites.
Execute the following command to run generate_expected_outputs.py:
poetry run generate_expected_outputs
The Test Reports test suites compare report outputs with the Excel workbooks generated by generate_expected_outputs.py. Any report output alterations introduced during refactoring are detailed in diff Excel workbooks exported to /tests/actual_outputs.
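Conceptually, that comparison reduces to checking actual report rows against expected rows and collecting any mismatches. The stdlib sketch below is illustrative only; the real suites read the Excel workbooks and write mismatches to diff workbooks.

```python
# Minimal sketch of the actual-vs-expected comparison: collect each
# mismatched row, plus a marker if the row counts differ.
def diff_rows(actual, expected):
    """Return (row_index, actual_row, expected_row) for each mismatch."""
    diffs = [
        (i, a, e)
        for i, (a, e) in enumerate(zip(actual, expected))
        if a != e
    ]
    if len(actual) != len(expected):
        diffs.append(("row_count", len(actual), len(expected)))
    return diffs
```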
Execute the following command from the root directory of the package to run test_reports.py:
poetry run pytest tests/test_reports.py
This project was generated with wemake-python-package. The current template version is ffbf87a961dab34c346b27d0d8468fc90c215646.