mozilla-services / mozilla-pipeline-schemas

Schemas for Mozilla's data ingestion pipeline and data lake outputs

Home Page: https://protosaur.dev/mps-deploys/

Use everit-org/json-schema for schema validation tests

jklukas opened this issue

The GCP pipeline uses the everit-org/json-schema Java library for schema validation, so we should move our schema tests to rely on that same library.

Currently, we test using hindsight (which I believe is using rapidjson for schema validation). We could run both testing implementations while we have both pipelines running concurrently.
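A rough sketch of what running both implementations side by side could look like, written in Python for brevity. Everything here is hypothetical: `find_disagreements`, `old_validator`, and `new_validator` are illustrative names, and the two validators are assumed to be wrapped as plain callables that return `True` for a valid document. Neither hindsight nor everit-org/json-schema exposes this interface directly.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: each validator is wrapped as a function from document to bool.
Validator = Callable[[dict], bool]


def find_disagreements(
    documents: Iterable[dict],
    old_validator: Validator,
    new_validator: Validator,
) -> List[Tuple[dict, bool, bool]]:
    """Return (document, old_result, new_result) for each document
    on which the two validators give different answers."""
    disagreements = []
    for doc in documents:
        old_ok = old_validator(doc)
        new_ok = new_validator(doc)
        if old_ok != new_ok:
            disagreements.append((doc, old_ok, new_ok))
    return disagreements
```

An empty result over the existing validation examples would suggest the two implementations agree on the current schemas. Any entries would point at cases worth reviewing before dropping the older tests.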

Rapidjson's schema validator implements only draft 4 of the JSON Schema specification, and it ignores the "format" keyword. Moving completely to everit-org/json-schema would allow us to consider upgrading to the current draft 7 and to make better use of the "format" keyword for ensuring that date and time values are well-formed.
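To make the "format" point concrete, here is a minimal sketch in Python using only the standard library. It is not the rapidjson or everit-org/json-schema API: `validate_string`, `is_date_time`, and the `enforce_format` flag are hypothetical names, and the check built on `datetime.fromisoformat` only approximates the RFC 3339 rules behind "date-time".

```python
from datetime import datetime


def is_date_time(value: str) -> bool:
    """Approximate "date-time" check (assumption: stdlib ISO 8601 parsing)."""
    # Normalize a trailing "Z", which older fromisoformat versions reject.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return False
    # RFC 3339 date-times carry a UTC offset, so require one here.
    return parsed.tzinfo is not None


def validate_string(schema: dict, value, enforce_format: bool) -> bool:
    """Validate a value against a tiny subset of JSON Schema.

    With enforce_format=False the "format" keyword is ignored entirely,
    which mirrors the behavior described above for the older validator.
    """
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if enforce_format and schema.get("format") == "date-time":
        return is_date_time(value)
    return True


schema = {"type": "string", "format": "date-time"}
```

With this schema, `validate_string(schema, "not a date", enforce_format=False)` passes because only the type is checked, while the same call with `enforce_format=True` rejects the value.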

v7 also allows $comment fields, which would be useful for non-user-facing documentation of schema templates.

The edge-validator needs to be updated and re-deployed as well: https://github.com/mozilla-services/edge-validator