Add single run option (interval:0)
pranspach opened this issue
I'd like to be able to stream files from S3 in a single batch, instead of on a persistent watch interval.
For context, I am launching an S3->ES logstash docker container from an Airflow DAG.
Adding an interval:0 case feels intuitive and satisfies my needs. Would a PR for this be appreciated?
public
def run(queue)
  @current_thread = Thread.current
  if @interval == 0
    # Proposed one-shot case: process the bucket once and return.
    process_files(queue)
  else
    Stud.interval(@interval) do
      process_files(queue)
    end
  end
end # def run
@pranspach I would definitely appreciate a PR that adds one-shot processing that closes down the input when it completes.
There will likely be a bit of extra complexity in handling interrupts and stop sequence, since the plugin currently uses Stud#stop! to interrupt the Stud::Interval. It may be simple enough to wrap the one-off execution in a Stud::Task that we then Stud::Task#wait on to get the stopping semantics without changing anything else.
Or, we can simply interrupt the Stud::Interval after the first execution by sending Stud#stop!
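To illustrate the "wrap it in a task and wait on it" idea, here is a minimal sketch that uses a plain Ruby Thread as an assumed stand-in for Stud::Task, so it runs without the stud gem. The process_files body and the run_once helper are hypothetical and are not taken from the plugin.

```ruby
# Hypothetical stand-in for the plugin's process_files: pushes one event
# onto the queue and returns a status symbol.
def process_files(queue)
  queue << "event-from-s3"
  :done
end

# Run a single pass on a background thread and block until it finishes.
# Thread here plays the role the comment above assigns to Stud::Task;
# Thread#value joins the thread and returns the block's result.
def run_once(queue)
  task = Thread.new { process_files(queue) }
  task.value
end

queue = Queue.new
result = run_once(queue)
```

Because the caller only waits on the task, whatever stop handling already exists around the run method would not need to change in this shape; whether that holds for the real plugin is an assumption.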
Using a zero-interval may be overloading the parameter a bit -- my natural assumption when seeing an interval of 0 would be that the input would simply look for more as soon as it is done processing what was present last time. A negative interval would be a slightly clearer way of indicating "this is not normal; read the docs".
Alternatively, we could use a separate parameter, like watch_for_new_files => "false"?
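A rough sketch of how a separate boolean flag could gate the loop, as opposed to overloading the interval. This is not the merged implementation: SketchInput, its stop method, and the stubbed process_files are hypothetical, and a plain sleep loop is swapped in for Stud.interval so the example runs on its own.

```ruby
# Hypothetical stand-in for the input plugin; only the control flow matters.
class SketchInput
  attr_reader :passes

  def initialize(interval:, watch_for_new_files: true)
    @interval = interval
    @watch_for_new_files = watch_for_new_files
    @stopped = false
    @passes = 0
  end

  def run(queue)
    loop do
      process_files(queue)
      # One-shot mode: return after the first pass so the input can close.
      break unless @watch_for_new_files
      break if @stopped
      # Plain sleep is an assumption of this sketch; unlike Stud.interval
      # it cannot be interrupted mid-wait.
      sleep(@interval)
      break if @stopped
    end
  end

  def stop
    @stopped = true
  end

  private

  # Stub: records that a pass happened and emits one placeholder event.
  def process_files(queue)
    @passes += 1
    queue << "batch-#{@passes}"
  end
end

queue = Queue.new
input = SketchInput.new(interval: 60, watch_for_new_files: false)
input.run(queue) # returns after a single pass despite the 60s interval
```

Keeping the flag separate leaves the interval free to mean what it says, including an interval of 0 meaning "poll again immediately".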
Thanks @yaauie. I liked the watch_for_new_files parameter suggestion. Let me know if there are any additional changes/feedback/discussion. I'd love to have this functionality in the plugin.
@pranspach merged in #162 and released in v3.4.0 😄