dashbitco / broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

Home Page: https://elixir-broadway.org

Disable automatic call to handle_batch/4

rogerweb opened this issue · comments

Hi,

Would it be possible to prevent Broadway from automatically calling handle_batch/4 and let the user do it when ready? Even better if it could take only the message_id instead of the whole message (I'm not sure whether Broadway or the producer keeps the message somewhere).

The problem I'm trying to solve is this: I'm consuming messages from SQS and delivering each one to its target user via that user's own Phoenix channel (user:123), but I would like to be sure the message was delivered before acknowledging it (i.e., removing it from SQS). Since Phoenix's broadcast/3 return value is not an actual "ack", I'm planning to have the browser push an "ack" message back to the channel with Broadway's message_id. However, how can I make Broadway wait for it (or time out) from handle_message/3?

I think in your case you don't want to use Broadway's built-in ack at all. You can disable it and then ack from the channel once you receive the client message. You can set the ack to noop here: https://hexdocs.pm/broadway_sqs/BroadwaySQS.Producer.html#module-acknowledgments
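A minimal sketch of that setup, assuming the `on_success` and `on_failure` options described in the linked BroadwaySQS.Producer docs. The module names (`MyApp.Pipeline`, `MyAppWeb.Endpoint`), the queue URL, the event name, and the `topic_for/1` helper are hypothetical, and the sketch assumes broadway_sqs exposes `message_id` and `receipt_handle` under `message.metadata`:

```elixir
defmodule MyApp.Pipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module:
          {BroadwaySQS.Producer,
           # Placeholder queue URL.
           queue_url: "https://sqs.us-east-1.amazonaws.com/000000000000/my_queue",
           # Do not delete the message from SQS when processing succeeds;
           # the application deletes it once the client has confirmed receipt.
           on_success: :noop,
           on_failure: :noop}
      ],
      processors: [default: []]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # Assumption: the SQS message id and receipt handle are in the metadata.
    payload = %{
      body: message.data,
      message_id: message.metadata.message_id,
      receipt_handle: message.metadata.receipt_handle
    }

    MyAppWeb.Endpoint.broadcast(topic_for(message), "new_message", payload)

    # Broadway requires the (possibly updated) message to be returned.
    message
  end

  # Hypothetical routing helper; a real one would derive the user from the message.
  defp topic_for(_message), do: "user:123"
end
```

With acking set to noop, a message that is never deleted becomes visible in the queue again after the SQS visibility timeout, so an undelivered message is simply redelivered later.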

Thanks @josevalim for the quick reply. I managed to do it the naive way: a single GenServer accumulates the receipts from the clients and uses ExAws to delete the messages once the batch size is reached or a timeout expires. Now I'm trying to figure out how many processes I should start and how to distribute the receipts among them. Do you think I should have one process per partition? Should I try to leverage any existing module from Broadway?
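The single-process approach described above could look roughly like the following. This is a sketch, not the code from the thread: the module name `ReceiptBatcher`, its options, and the injected `delete_fun` are all hypothetical. The delete call is passed in as a function so that the batching logic stays independent of ExAws.

```elixir
defmodule ReceiptBatcher do
  use GenServer

  # Hypothetical defaults. SQS DeleteMessageBatch accepts at most 10 entries per call.
  @batch_size 10
  @flush_after_ms 1_000

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  # Called (for example from the channel) when a client confirms a message.
  def ack(server \\ __MODULE__, message_id, receipt_handle) do
    GenServer.cast(server, {:ack, %{id: message_id, receipt_handle: receipt_handle}})
  end

  @impl true
  def init(opts) do
    {:ok,
     %{
       delete_fun: Keyword.fetch!(opts, :delete_fun),
       batch_size: Keyword.get(opts, :batch_size, @batch_size),
       flush_after_ms: Keyword.get(opts, :flush_after_ms, @flush_after_ms),
       receipts: [],
       timer: nil
     }}
  end

  @impl true
  def handle_cast({:ack, receipt}, state) do
    state = %{state | receipts: [receipt | state.receipts]}

    if length(state.receipts) >= state.batch_size do
      # Batch is full: delete immediately.
      {:noreply, flush(state)}
    else
      # Otherwise make sure a timeout flush is scheduled.
      {:noreply, ensure_timer(state)}
    end
  end

  @impl true
  def handle_info(:flush, state), do: {:noreply, flush(%{state | timer: nil})}

  defp ensure_timer(%{timer: nil} = state) do
    %{state | timer: Process.send_after(self(), :flush, state.flush_after_ms)}
  end

  defp ensure_timer(state), do: state

  defp flush(%{receipts: []} = state), do: cancel_timer(state)

  defp flush(state) do
    # Hand the receipts over in arrival order, then reset the buffer.
    state.delete_fun.(Enum.reverse(state.receipts))
    cancel_timer(%{state | receipts: []})
  end

  defp cancel_timer(%{timer: nil} = state), do: state

  defp cancel_timer(state) do
    Process.cancel_timer(state.timer)
    %{state | timer: nil}
  end
end
```

In an application, `delete_fun` could be something like `fn receipts -> queue_url |> ExAws.SQS.delete_message_batch(receipts) |> ExAws.request() end`, where `queue_url` is the queue the pipeline consumes from. The sketch does not handle partial failures reported in the batch response.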