saschanaz / firefox-public-data-report-etl

Firefox Public Data Report ETL

The Firefox Public Data project is a public-facing website that tracks various metrics over time and helps the general public understand what kind of data Mozilla tracks and how it is used.

This repository contains the code used to pull and process the data for the Hardware section of the report.

The website itself is generated by the Ensemble and Ensemble Transposer repositories.

Data

Hardware report

The hardware report job uses data from main pings, pulled from the main ping BigQuery table.

It produces weekly aggregates organized by various dimensions, which are stored in BigQuery and exported to S3 where they can be consumed by Transposer.
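As a rough illustration of what "weekly aggregates organized by a dimension" can look like, here is a minimal sketch in plain Python. It is not the job's actual code: the record layout, the `os` dimension name, the `weekly_shares` helper, and the choice of Monday as the start of the week are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def weekly_shares(records, dimension):
    """Group records by week and report the share of each dimension value.

    `records` is a hypothetical list of dicts with a `day` (datetime.date)
    and one key per dimension. Weeks are assumed to start on Monday.
    """
    counts = defaultdict(Counter)
    for rec in records:
        # Snap the day back to the Monday of its week.
        week_start = rec["day"] - timedelta(days=rec["day"].weekday())
        counts[week_start][rec[dimension]] += 1

    return {
        week: {value: n / sum(c.values()) for value, n in c.items()}
        for week, c in counts.items()
    }


# Hypothetical sample records, one per reporting client.
sample = [
    {"day": date(2020, 1, 6), "os": "Windows"},
    {"day": date(2020, 1, 8), "os": "Windows"},
    {"day": date(2020, 1, 9), "os": "Linux"},
    {"day": date(2020, 1, 13), "os": "Mac"},
]
shares = weekly_shares(sample, "os")
```

In this sketch each week maps to a dictionary of shares that sum to 1, which is roughly the shape a charting front end would need.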

User Activity

The user activity job exports a BigQuery table containing the following metrics:

User Activity (fxhealth.json):

  • Monthly active users (MAU) - number of clients that used Firefox in the past 28 days.
  • Average daily usage hours - average daily use of a typical client over the past 7 days. Calculated by taking each client's average daily use over the last week (counting only the days on which they used Firefox), then averaging across all clients.
  • Average intensity - average daily intensity of use of a typical client over the past 7 days. Intensity shows how many days per week users use the product.
  • New profile rate - percentage of WAU (weekly active users: clients who used Firefox in the past 7 days) that are new clients (profile created that week).
  • Latest version ratio - percentage of WAU on the newest version (or newer) of Firefox for that week. Note that Firefox updates are often released with different throttling rates (e.g. 10% of the population in week 1).
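The definitions above can be sketched in plain Python. This is only an illustration under stated assumptions, not the job's BigQuery logic: the input is a hypothetical list with one row per client per active day, the field names (`client_id`, `day`, `hours`, `profile_created`, `version`) and the `user_activity_metrics` helper are invented for the example, intensity is expressed as active days out of 7, and rates are returned as fractions rather than percentages.

```python
from collections import defaultdict
from datetime import date, timedelta


def user_activity_metrics(rows, as_of, latest_version):
    """Compute the user activity metrics for the week ending on `as_of`."""
    week_start = as_of - timedelta(days=6)    # 7-day window, inclusive
    month_start = as_of - timedelta(days=27)  # 28-day window, inclusive

    monthly_clients = set()
    weekly_hours = defaultdict(list)  # client_id -> hours per active day
    created = {}                      # client_id -> profile creation date
    version = {}                      # client_id -> highest version seen

    for row in rows:
        cid = row["client_id"]
        if month_start <= row["day"] <= as_of:
            monthly_clients.add(cid)
        if week_start <= row["day"] <= as_of:
            weekly_hours[cid].append(row["hours"])
            created[cid] = row["profile_created"]
            version[cid] = max(version.get(cid, 0), row["version"])

    wau = len(weekly_hours)
    if wau == 0:
        return {"mau": len(monthly_clients), "wau": 0}

    return {
        "mau": len(monthly_clients),
        "wau": wau,
        # Per-client mean over active days, then mean across clients.
        "avg_daily_usage_hours": sum(
            sum(h) / len(h) for h in weekly_hours.values()
        ) / wau,
        # Mean number of active days in the week (out of 7).
        "avg_intensity": sum(len(h) for h in weekly_hours.values()) / wau,
        "new_profile_rate": sum(
            1 for d in created.values() if d >= week_start
        ) / wau,
        "latest_version_ratio": sum(
            1 for v in version.values() if v >= latest_version
        ) / wau,
    }


# Hypothetical sample data: client "c" is active only earlier in the month.
old = date(2019, 6, 1)
rows = [
    {"client_id": "a", "day": date(2020, 1, 27), "hours": 2.0, "profile_created": old, "version": 72},
    {"client_id": "a", "day": date(2020, 1, 28), "hours": 4.0, "profile_created": old, "version": 72},
    {"client_id": "b", "day": date(2020, 1, 25), "hours": 1.0, "profile_created": date(2020, 1, 24), "version": 71},
    {"client_id": "c", "day": date(2020, 1, 5), "hours": 3.0, "profile_created": old, "version": 70},
]
metrics = user_activity_metrics(rows, as_of=date(2020, 1, 28), latest_version=72)
```

Averaging per client first, and only then across clients, keeps heavy users from dominating the "typical client" figure, which matches the wording of the usage-hours definition.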

Usage Behavior (webusage.json):

  • Top languages - percentage of WAU on each of the top 5 language settings (locales).
  • Has add-on - percentage of WAU with at least 1 user-installed add-on.
  • Top add-ons - the top 10 most common user-installed add-ons from the last 7 days.
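A minimal sketch of how these three metrics could be derived, assuming one record per weekly active client. The `locale` and `addons` field names and the `usage_behavior` helper are hypothetical, and shares are returned as fractions of WAU rather than percentages; the real job computes its results in BigQuery.

```python
from collections import Counter


def usage_behavior(clients, n_locales=5, n_addons=10):
    """Summarize locale and add-on usage across weekly active clients."""
    wau = len(clients)
    if wau == 0:
        return {"top_languages": {}, "has_addon": 0.0, "top_addons": {}}

    locale_counts = Counter(c["locale"] for c in clients)
    # Count each add-on at most once per client.
    addon_counts = Counter(a for c in clients for a in set(c["addons"]))

    return {
        "top_languages": {
            loc: n / wau for loc, n in locale_counts.most_common(n_locales)
        },
        "has_addon": sum(1 for c in clients if c["addons"]) / wau,
        "top_addons": {
            name: n / wau for name, n in addon_counts.most_common(n_addons)
        },
    }


# Hypothetical sample of four weekly active clients.
clients = [
    {"locale": "en-US", "addons": ["uBlock"]},
    {"locale": "en-US", "addons": []},
    {"locale": "de", "addons": ["uBlock", "Dark"]},
    {"locale": "fr", "addons": []},
]
behavior = usage_behavior(clients)
```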

You can run this job locally from a Docker container:

make build && \
docker run \
    -v PATH_TO_SERVICE_CREDENTIALS.json:/app/.credentials \
    -e GOOGLE_APPLICATION_CREDENTIALS=/app/.credentials \
    -e AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID \
    -e AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY \
    firefox-public-data-report-etl:0.1 \
    -m public_data_report.cli \
    user_activity \
    --bq_table moz-fx-data-shared-prod.analysis.public_data_report_user_activity \
    --s3_bucket telemetry-public-analysis-2 \
    --s3_path public-data-report-dev/user_activity

Development

Testing

To run the tests, ensure you have Docker installed. First build the container using:

make build

then run the tests with:

make test
