sul-dlss / sdr-metrics-api

An API for tracking and querying usage metrics for Stanford SDR objects

SDR Metrics API

An API for tracking and querying usage metrics for Stanford SDR objects.

Requirements

  • Ruby 3.2
  • A database (SQLite locally, Postgres in production)

Setup

Run the Rails setup script to install dependencies and set up the database:

bin/setup

Developing

Start a development server:

bin/rails server

Usage

Tracking metrics

Event tracking is done via Ahoy's built-in API. Clients should use the ahoy.js library, which handles submitting POST requests to the API automatically.

Event submissions should include the DRUID of the object being tracked, which will be used to associate the event with the object. Other attributes can be included as needed.

Each time an event is logged it is associated with a visit that identifies characteristics of the device being used, including its user agent and masked IP address. Visits are created automatically by ahoy.js.
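When the client page is served from a different origin than the metrics API, ahoy.js has to be told where to send visits and events. The following is a minimal sketch, not this project's actual configuration: `buildAhoyConfig` is a hypothetical helper, the base URL and 30-minute visit window are placeholder values, and it assumes the API exposes Ahoy's default `/ahoy/visits` and `/ahoy/events` routes. `urlPrefix` and `visitDuration` are ahoy.js configuration options, with `visitDuration` given in minutes.

```javascript
// Hypothetical helper: builds the options object passed to ahoy.configure().
// apiBaseUrl and visitMinutes are assumptions, not values from this project.
function buildAhoyConfig(apiBaseUrl, visitMinutes = 30) {
  return {
    // Prepended to ahoy.js's default visit and event paths;
    // trailing slashes are stripped to avoid a double slash.
    urlPrefix: apiBaseUrl.replace(/\/+$/, ""),
    // How long a visit lasts, in minutes.
    visitDuration: visitMinutes,
  };
}

// In the browser, after loading ahoy.js:
//   ahoy.configure(buildAhoyConfig("http://localhost:3000"));
```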

Views

To track views of an object, use the trackView() method, which creates an event with the built-in type $view:

ahoy.trackView({ druid: "py305sy7961" });

Downloads

Downloads are tracked by creating an event with the type download:

ahoy.track("download", { druid: "py305sy7961" });

You can also specify a particular file being downloaded:

ahoy.track("download", { druid: "py305sy7961", file: "file_1.pdf" });
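Client code that tracks many objects may want to wrap these calls so that a malformed identifier is caught before an event is sent. The sketch below is one way to do that, under stated assumptions: the helper names are hypothetical, the `ahoy` object is passed in explicitly, and the DRUID pattern is a simplified shape (two letters, three digits, two letters, four digits) rather than an official validation rule.

```javascript
// Simplified DRUID shape (assumption): 2 letters, 3 digits, 2 letters, 4 digits.
const DRUID_PATTERN = /^[a-z]{2}\d{3}[a-z]{2}\d{4}$/;

// Throws if the value does not look like a bare DRUID such as "py305sy7961".
function assertDruid(druid) {
  if (!DRUID_PATTERN.test(druid)) {
    throw new Error(`Not a bare DRUID: ${druid}`);
  }
}

// Records a view event for the given object.
function trackObjectView(ahoy, druid) {
  assertDruid(druid);
  ahoy.trackView({ druid });
}

// Records a download event, naming the file only when one is given.
function trackObjectDownload(ahoy, druid, file) {
  assertDruid(druid);
  const properties = file ? { druid, file } : { druid };
  ahoy.track("download", properties);
}
```

In a page that has loaded ahoy.js, `trackObjectDownload(ahoy, "py305sy7961", "file_1.pdf")` would send the same event as the direct call above.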

Querying metrics

Metrics can be queried by a given object's DRUID:

curl http://localhost:3000/py305sy7961/metrics
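The same lookup can be made from JavaScript. This is a minimal sketch, assuming a runtime with a global `fetch` (modern browsers, or Node 18 and later); `metricsUrl` and `fetchMetrics` are hypothetical helper names, and the default base URL is the local development server used in the curl example.

```javascript
// Builds the metrics URL for a DRUID (hypothetical helper).
function metricsUrl(druid, baseUrl = "http://localhost:3000") {
  return `${baseUrl.replace(/\/+$/, "")}/${encodeURIComponent(druid)}/metrics`;
}

// Fetches and parses the metrics for one object.
// fetchImpl is injectable so the function can be exercised without a server.
async function fetchMetrics(druid, { baseUrl, fetchImpl = fetch } = {}) {
  const response = await fetchImpl(metricsUrl(druid, baseUrl));
  if (!response.ok) {
    throw new Error(`Metrics request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage:
//   fetchMetrics("py305sy7961").then((metrics) => console.log(metrics));
```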

The response is a single JSON object with total counts for each tracked event type:

{
  "views": 2340,
  "downloads": 223,
  "unique_views": 993,
  "unique_downloads": 220
}

Counts labeled "unique" are deduplicated by visit, so that multiple events coming from the same device in a short time period are only counted once. This time period is configurable when initializing ahoy.js.
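The difference between total and unique counts can be illustrated with a small sketch. It is illustrative only: the service's actual query is not shown here, and the event shape (`name`, `visitId`) and the `countEvents` function are assumptions made for the example.

```javascript
// Illustrative only: each event is assumed to look like { name, visitId }.
// Returns the total number of events with the given name and the number
// of distinct visits that produced at least one such event.
function countEvents(events, name) {
  const matching = events.filter((event) => event.name === name);
  const visits = new Set(matching.map((event) => event.visitId));
  return { total: matching.length, unique: visits.size };
}

// Two views from visit "a" and one from visit "b":
const sampleEvents = [
  { name: "$view", visitId: "a" },
  { name: "$view", visitId: "a" },
  { name: "$view", visitId: "b" },
  { name: "download", visitId: "a" },
];
console.log(countEvents(sampleEvents, "$view")); // { total: 3, unique: 2 }
```

A shorter visit duration splits the same activity into more visits, so unique counts move closer to the totals.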

Testing

Code is linted with RuboCop and tested with RSpec on each push to GitHub. You can run everything locally with:

bin/rake

For just the linter, or just the tests:

bin/rake rubocop
bin/rake spec

Deploying

The application is deployed automatically by Jenkins (sul-ci).

Merging to main triggers a staging deploy, and creating a GitHub release with a v tag triggers a production deploy.

To deploy manually, use Capistrano:

cap stage deploy
cap prod deploy

Design

Why not use Google Analytics?

Google Analytics includes many features that we don't need, especially around marketing and advertising. It also doesn't record events in a useful way when they are triggered from an embedded iframe, as ours are in sul-embed. The "enhanced measurement" feature in GA4 captures some events, but not all, so we'd need to do some work to manually trigger the ones we want. And of course, Google can change the API at any time, which would break our integration.

Why not use another hosted analytics service?

Keeping the analytics tracking first-party means we have control over the data and can ensure it's not shared with third parties, as well as make guarantees about anonymization and retention. We also don't have to worry about a third-party service going out of business or changing its pricing model.

Why not make metrics tracking a part of PURL?

We need a database in which to store tracked metrics, but PURL was designed as a static site that serves XML from the filesystem. Rather than adding this database to PURL, we decided to create a separate service that can be used by other applications as well.
