google / weather-tools

Tools to make weather data accessible and useful.

Home Page: https://weather-tools.readthedocs.io/

Add support for dry-runs to `weather-mv`.

alxmrs opened this issue

A helpful step towards fixing #21.

ticket updated: 2022-03-24

As a weather tools user, I would like to be able to preview the effect of each tool before incurring the cost of data movement and infrastructure. These light-weight previews will help me test pipelines before deployment, lowering the number of iterations needed to set up a data pipeline. For this issue, I want to be able to perform dry runs with the weather mover.

Acceptance Criteria

  • Provide a common interface for exercising dry runs for every Data Sink
  • When a user passes the `-d` or `--dry-run` flag to the `weather-mv` CLI, this feature will be activated.
  • When a user checks tool documentation (the README or CLI help message), they will have a good understanding of what the feature does
  • As a user, I will still have some way to monitor the execution flow of the tool during a dry run
    • Log messages from a dry run will remain the same as those from a non-dry run
    • As a user, I can inspect the kinds of messages that would have been written to BigQuery.
    • (optional) Maybe more log messages are needed to see what's happening?
  • As a user, I can execute dry runs locally or remotely on Dataflow
  • Where appropriate, data is simulated in memory. No data is written to disk or cloud storage during a dry run.
    • As a user, I would still like to validate execution on actual user-supplied URIs.
  • Where there are contradictions in requirements, the ergonomic option for weather-mv users is preferred.
  • All code should be completely covered by tests.
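
One way the criteria above could fit together is sketched below. It is only an illustration, not the project's implementation: `DataSink`, `InMemorySink`, `parse_args`, and the logger name are hypothetical, and only the `-d` / `--dry-run` flag comes from this issue. The sketch assumes that a dry run follows the same code path and emits the same log messages as a real run, skipping only the final write.

```python
import abc
import argparse
import logging
from typing import Any, Dict, List, Optional, Sequence

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("weather-mv-sketch")


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse only the dry-run flag; the real CLI's other arguments are omitted."""
    parser = argparse.ArgumentParser(prog="weather-mv")
    parser.add_argument(
        "-d", "--dry-run", action="store_true",
        help="Preview the effect of the command without writing any data.",
    )
    return parser.parse_args(argv)


class DataSink(abc.ABC):
    """Hypothetical common interface shared by every data sink."""

    def __init__(self, dry_run: bool = False) -> None:
        self.dry_run = dry_run

    @abc.abstractmethod
    def _write(self, rows: List[Dict[str, Any]]) -> None:
        """Perform the real write; only called when dry_run is False."""

    def write(self, rows: List[Dict[str, Any]]) -> int:
        """Log identically in both modes; return the number of rows written."""
        logger.info("Writing %d rows via %s", len(rows), type(self).__name__)
        for row in rows:
            # Lets a user inspect what would have been written.
            logger.debug("Row: %s", row)
        if self.dry_run:
            return 0  # Nothing leaves memory during a dry run.
        self._write(rows)
        return len(rows)


class InMemorySink(DataSink):
    """Toy sink for illustration; stands in for a real sink such as BigQuery."""

    def __init__(self, dry_run: bool = False) -> None:
        super().__init__(dry_run)
        self.rows: List[Dict[str, Any]] = []

    def _write(self, rows: List[Dict[str, Any]]) -> None:
        self.rows.extend(rows)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    args = parse_args()
    sink = InMemorySink(dry_run=args.dry_run)
    sink.write([{"temperature": 280.5}])
```

Keeping the dry-run check in the base class's `write` method means each concrete sink only implements `_write`, so every sink gets the same dry-run behavior and log output without extra code.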

Implementation Notes