zalando-nakadi / nakadi-producer-spring-boot-starter

Nakadi event producer as a Spring Boot starter

Support higher-volume cases (and potentially an ordering guarantee) by using pgq

ePaul opened this issue

Background

Inside Zalando, we are currently discussing how to implement reliable (transactional) event sending, which is basically what this library is trying to do.

When I mentioned this library (and that we are using a similar approach in another team, where we do a nightly full vacuum), it was pointed out (by @CyberDem0n):

That's actually the major problem of such homegrown solutions.

  1. Write amplification (you are not only inserting into the queue table, but also updating/deleting).
  2. Permanent table and index bloat due to 1.
  3. Regular heavy maintenance required due to 2.
  4. Maintenance always affects normal processes interacting with the events table.
  5. If the event flow is relatively high, doing a vacuum full/reindex only once a night quickly becomes insufficient.

In this regard, pgq is maintenance free. For every queue you create, it creates a few tables under the hood.
These tables are INSERT ONLY, and therefore explicitly excluded from autovacuum.
The tables are used in a round-robin manner. Since events are always processed strictly in one order, it is enough to keep a pointer to the latest row (event) that was processed, so no UPDATEs/DELETEs are required on the event table. Once all events from a specific table are processed, PgQ simply TRUNCATEs that table.
These tricks make PgQ very scalable. Ten years ago, when PostgreSQL didn't yet have built-in streaming replication, PgQ was used as the base for a logical replication solution, Londiste. Both were developed by Marko Kreen while working for Skype. IIRC, 3 or 4 years ago Skype was still relying on PgQ and Londiste, because they just work.

@a1exsh pointed me to the pgq SQL API and promised to help with code review if we want to integrate pgq support into this library.
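For reference, the core of that SQL API is quite small. A minimal round trip through a queue, written against Spring's JdbcTemplate, could look roughly like this (the queue and consumer names are made up for illustration; the pgq.* functions are from the PGQ extension):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import java.util.List;
import java.util.Map;

// Sketch of the raw PGQ SQL API round trip, assuming a JdbcTemplate wired to
// the application's data source. Queue and consumer names ("nakadi_events",
// "nakadi-submitter") are made up for illustration.
public class PgqRoundTrip {

    private final JdbcTemplate jdbc;

    public PgqRoundTrip(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    public void setupOnce() {
        // Both calls return 0 if the queue/consumer already exists, so they
        // are safe to run at every application start.
        jdbc.queryForObject("SELECT pgq.create_queue('nakadi_events')", Integer.class);
        jdbc.queryForObject("SELECT pgq.register_consumer('nakadi_events', 'nakadi-submitter')",
                Integer.class);
    }

    public void produce(String eventType, String payloadJson) {
        // Called inside the producing transaction, so the event becomes
        // visible atomically with the business data change.
        jdbc.queryForObject("SELECT pgq.insert_event('nakadi_events', ?, ?)",
                Long.class, eventType, payloadJson);
    }

    public void consumeOneBatch() {
        // next_batch returns NULL when no new events are available yet.
        Long batchId = jdbc.queryForObject(
                "SELECT pgq.next_batch('nakadi_events', 'nakadi-submitter')", Long.class);
        if (batchId == null) {
            return;
        }
        List<Map<String, Object>> events = jdbc.queryForList(
                "SELECT ev_id, ev_type, ev_data FROM pgq.get_batch_events(?)", batchId);
        // ... submit the events to Nakadi here ...
        jdbc.queryForObject("SELECT pgq.finish_batch(?)", Integer.class, batchId);
    }
}
```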

Goal

Find a way of using a pgq queue instead of the current event_log table for storing the events for later Nakadi submission.
This should be optional, as not every user of this library has pgq available, or the ability to install PostgreSQL extensions.

Implementation ideas/concerns

Abstracting queue access

This could mostly be done by providing a different implementation of the EventLogRepository interface, possibly with some adaptations (also in the code using it); a rough sketch follows the list below:

  • find/lock could be merged to return a batch
    • table: update + select
    • PGQ: next_batch_info / next_batch + get_batch_events or get_batch_cursor
  • delete could delete a whole batch (except for events which failed submission, see below)
    • table: delete
    • PGQ: finish_batch
  • persist for inserting an event (I think we don't use this for update, do we?)
    • table: insert
    • PGQ: insert_event
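Putting the mapping above together, a PGQ-backed repository could look roughly like the following. The class and method names here are assumptions for illustration, not the current EventLogRepository interface of this library:

```java
import org.springframework.jdbc.core.JdbcTemplate;
import java.util.List;
import java.util.Map;

// Hypothetical PGQ-backed counterpart to EventLogRepository, following the
// mapping above. The batch-oriented method shapes are assumptions.
public class PgqEventLogRepository {

    private final JdbcTemplate jdbc;
    private final String queueName;     // e.g. "nakadi_events", see Configuration below
    private final String consumerName;

    public PgqEventLogRepository(JdbcTemplate jdbc, String queueName, String consumerName) {
        this.jdbc = jdbc;
        this.queueName = queueName;
        this.consumerName = consumerName;
    }

    // persist: insert_event instead of INSERT INTO event_log
    public void persist(String eventType, String payloadJson) {
        jdbc.queryForObject("SELECT pgq.insert_event(?, ?, ?)",
                Long.class, queueName, eventType, payloadJson);
    }

    // find/lock merged into one batch operation: next_batch instead of UPDATE + SELECT;
    // returns null when there is nothing to process yet.
    public Long lockNextBatch() {
        return jdbc.queryForObject("SELECT pgq.next_batch(?, ?)",
                Long.class, queueName, consumerName);
    }

    public List<Map<String, Object>> getBatchEvents(long batchId) {
        return jdbc.queryForList(
                "SELECT ev_id, ev_retry, ev_type, ev_data FROM pgq.get_batch_events(?)", batchId);
    }

    // delete a whole batch: finish_batch instead of DELETE
    public void finishBatch(long batchId) {
        jdbc.queryForObject("SELECT pgq.finish_batch(?)", Integer.class, batchId);
    }
}
```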

PGQ API documentation

Strict ordering vs. retrying

We might need some separate functionality for events which failed their submission:

  • In the table-based implementation we have right now, we simply don't delete such events; when the locking time expires, they are automatically locked again by the next locking run and then tried again.
  • In the PGQ version, we would use pgq.event_retry. (This would also allow us to implement an exponential back-off.)

I'm not sure how this can be abstracted in a useful way.
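To illustrate the PGQ side at least, a back-off based on pgq.event_retry could look like this (the doubling policy and the helper class are made up; ev_retry is the per-event retry counter that PGQ keeps and returns from get_batch_events):

```java
import org.springframework.jdbc.core.JdbcTemplate;

// Sketch of per-event retry with exponential back-off in the PGQ case.
// pgq.event_retry(batch_id, event_id, retry_seconds) keeps the event for a
// later batch instead of letting finish_batch discard it.
public final class PgqRetry {

    private PgqRetry() {}

    public static void retryLater(JdbcTemplate jdbc, long batchId, long eventId, Integer evRetry) {
        int attempts = (evRetry == null) ? 0 : evRetry;
        // 10s, 20s, 40s, ... capped at one hour; the concrete policy is just an example.
        int delaySeconds = (int) Math.min(3600L, 10L * (1L << Math.min(attempts, 20)));
        jdbc.queryForObject("SELECT pgq.event_retry(?, ?, ?)",
                Integer.class, batchId, eventId, delaySeconds);
    }
}
```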

Consumption

  • We need to make sure to register a consumer before we start inserting events (a consumer only sees events that were inserted after it subscribed).
  • We should only have a single consumer, so we don't send out the same event multiple times.
  • It's not quite clear whether we can have multiple threads/application instances fetching events in parallel from the queue, using the same consumer id.
    • If so, we likely lose any ordering guarantees that might be there (but our library doesn't provide ordering guarantees anyway).
    • If not, we need to figure out how to keep just a single consuming thread/application instance active. A proposal was to have a separate single-pod deployment just for this purpose, but I'm not sure that fits well with the "just plug it in" philosophy of this library (one alternative is sketched below).
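One option that avoids a separate deployment would be to let every instance poll, but coordinate via a transaction-level Postgres advisory lock, so that only one instance actually fetches batches per round. A minimal sketch (the lock key 42 is arbitrary; in practice it would have to be derived from the queue name):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

// Sketch: at most one instance consumes per polling round, coordinated via a
// transaction-level Postgres advisory lock. Assumes Spring transaction
// management is configured for the same data source.
@Component
public class SingleConsumerPoller {

    private final JdbcTemplate jdbc;

    public SingleConsumerPoller(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    @Transactional
    public void pollOnce() {
        // Returns false immediately if another instance holds the lock;
        // the lock is released automatically when the transaction ends.
        Boolean gotLock = jdbc.queryForObject(
                "SELECT pg_try_advisory_xact_lock(42)", Boolean.class);
        if (!Boolean.TRUE.equals(gotLock)) {
            return; // another instance is consuming this round
        }
        // ... pgq.next_batch / get_batch_events / submit to Nakadi / finish_batch ...
    }
}
```

This deliberately gives up on parallel fetching, and the at-least-once semantics stay the same: if the transaction rolls back after a successful Nakadi submission, the batch is simply delivered again.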

Configuration

  • The Spring Boot autoconfiguration could use some property to decide whether to use pgq (a sketch of this follows the list).
  • We could also try to auto-detect the existence of the extension. Alternatively, we could (optionally) depend on https://github.com/BrandwatchLtd/pgq-consumer and use its presence on the classpath as the decider?
  • We might want to make the queue name configurable, so that multiple event queues (from different components using the same DB) can coexist, and so we don't conflict with other queues the application might have.
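The property-based variant could look like the following sketch. All property names (nakadi-producer.pgq.*) are made up, and PgqEventLogRepository refers to the hypothetical implementation sketched above; the classpath-based variant would use @ConditionalOnClass on a pgq-consumer class instead of @ConditionalOnProperty:

```java
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

// Sketch of a property-gated auto-configuration. The "nakadi-producer.pgq.*"
// property names are assumptions for illustration.
@Configuration
@ConditionalOnProperty(prefix = "nakadi-producer.pgq", name = "enabled", havingValue = "true")
public class PgqAutoConfiguration {

    @Bean
    @ConfigurationProperties(prefix = "nakadi-producer.pgq")
    public PgqProperties pgqProperties() {
        return new PgqProperties();
    }

    @Bean
    public PgqEventLogRepository pgqEventLogRepository(JdbcTemplate jdbc, PgqProperties props) {
        return new PgqEventLogRepository(jdbc, props.getQueueName(), props.getConsumerName());
    }

    public static class PgqProperties {
        private String queueName = "nakadi_events";      // configurable queue name
        private String consumerName = "nakadi-producer"; // single consumer per queue

        public String getQueueName() { return queueName; }
        public void setQueueName(String queueName) { this.queueName = queueName; }
        public String getConsumerName() { return consumerName; }
        public void setConsumerName(String consumerName) { this.consumerName = consumerName; }
    }
}
```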

I think it makes sense to consider the task of consuming from a PGQ queue and publishing batches of events to Nakadi separately from the task of producing events for the queue in the first place. This way the former component may be reused more widely, e.g. when the application producing the events isn't using Spring (or Java, for that matter).

This library can still provide building blocks for both components, but it could be done with the potential separate deployment in mind.

@a1exsh This is certainly something to consider. I just worry that having separate deployments will increase the configuration overhead (i.e. you can no longer just plug this library into your application and have everything work), but maybe it's the way to go.
Maybe we can find a way to have both the easy setup and an option to separate things for cases where that's needed.

If other libraries (or applications implementing this manually) want to interoperate, we also need to specify the format of the events in the queue, which limits our ability to evolve it in the future.