Add support for dry-runs to `weather-mv`.
alxmrs opened this issue
Alex Merose commented
A helpful step towards fixing #21.
ticket updated: 2022-03-24
As a weather tools user, I would like to be able to preview the effect of each tool before incurring the cost of data movement and infrastructure. These lightweight previews will help me test pipelines before deployment, lowering the number of iterations needed to set up a data pipeline. For this issue, I want to be able to perform dry runs with the weather mover.
Acceptance Criteria
- Provide a common interface for exercising dry runs for every Data Sink
- When a user passes the `-d` or `--dry-run` flag to the `weather-mv` CLI, this feature will be activated.
- When a user checks tool documentation (the README or CLI help message), they will have a good understanding of what the feature does
- As a user, I will still have some way to monitor the execution flow of the tool during a dry run
- Log messages from non-dry runs will remain the same as those within a dry-run
- As a user, I can inspect the kinds of messages that would have been written to BigQuery.
- (optional) Maybe more log messages are needed to see what's happening?
- As a user, I can execute dry runs locally or remotely on Dataflow
- Where appropriate, data is simulated in memory. No data is written to disk or cloud storage during a dry run.
- As a user, I still would like to validate execution on actual user-supplied URIs.
- Where there are contradictions in requirements, the ergonomic option for `weather-mv` users is preferred.
- All code should be completely covered by tests.
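To make the criteria above concrete, here is a minimal sketch of one way a shared dry-run switch could work. It is not the actual `weather-mv` implementation: the `Sink` and `InMemorySink` classes, the `parse_args` helper, and the help text are hypothetical names made up for illustration. Only the `-d` / `--dry-run` flag comes from this issue. The sketch logs the same message in both modes and skips the final write when the flag is set.

```python
import abc
import argparse
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger("dry_run_sketch")


class Sink(abc.ABC):
    """Hypothetical base class: every data sink shares one dry-run switch."""

    def __init__(self, dry_run: bool = False) -> None:
        self.dry_run = dry_run

    def write(self, row: Dict[str, Any]) -> None:
        # The log message is identical in dry and non-dry runs,
        # so users can follow the execution flow either way.
        logger.info("Writing row: %s", row)
        if self.dry_run:
            return  # Nothing leaves memory during a dry run.
        self._commit(row)

    @abc.abstractmethod
    def _commit(self, row: Dict[str, Any]) -> None:
        """Perform the real write (e.g. to a database or storage)."""


class InMemorySink(Sink):
    """Hypothetical stand-in for a real sink; keeps rows in a list."""

    def __init__(self, dry_run: bool = False) -> None:
        super().__init__(dry_run)
        self.committed: List[Dict[str, Any]] = []

    def _commit(self, row: Dict[str, Any]) -> None:
        self.committed.append(row)


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Illustrative parser only; the real CLI has other arguments.
    parser = argparse.ArgumentParser(prog="dry-run-sketch")
    parser.add_argument(
        "-d", "--dry-run", action="store_true",
        help="Preview the run without writing any data.",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    args = parse_args()
    sink = InMemorySink(dry_run=args.dry_run)
    sink.write({"temperature": 280.5})
    print("rows committed:", len(sink.committed))
```

Putting the check in the base class means each concrete sink only implements `_commit`, which keeps the dry-run behavior uniform and easy to cover with tests.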
Implementation Notes
- The best place to provide a common interface for dry runs is
- Most code changes will happen in this class