Betterment / delayed

a multi-threaded, SQL-driven ActiveJob backend used at Betterment to process millions of background jobs per day

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Open to PRs?

johnnyshields opened this issue · comments

Hi there, I've raised several PRs to DelayedJob which aren't getting merged :(

I'm wondering if the owners of this repo are open to me porting some of the work here:

  • Support for worker pools (in addition to queues)
  • Multi-process support (without Daemons gem -- see collectiveidea/delayed_job#1138)
  • Copy-on-write forking, to improve memory with multi-process

Let me know and I'll start raising PRs soon.

Hi Johnny! Welcome!

We can't always promise an immediate turnaround, but we're absolutely open to PRs! We were actually already considering introducing a Puma-style forking model, so consider my interest piqued 😄.

Given that it's not exactly a small addition, I'd love to dig in a little more and—ideally before you do a bunch of work (because I hear you about having PRs go unmerged 🥲)—discuss the proposed design/implementation, the use cases that it targets, and get your general thoughts on how to align the feature with the philosophy of this gem.

So to start the conversation, I'd like to share a few reasons why we didn't go with a forking model originally (when we added the existing concurrency support). Here are three in no particular order (expand them for more context):

1. The job claim/pickup query is usually the bottleneck.

We determined that the job claim/pickup query (and the strain it places on a DB with a high-throughput jobs table) is one of most common bottlenecks in determining an upper bound for worker counts, and so when exploring our options for concurrency, we wanted to avoid an implementation that would result in additional/concurrent pickup queries. That's why we first went with a multithreading approach (via our Worker.max_claims config, which allows workers to claim jobs in bulk and then run them in concurrent threads), because it allowed us to increase concurrency on job execution (within each worker) without introducing any additional pickup queries (because worker count can remain the same).

2. We observed a lack of CPU-bound jobs. (Most jobs are DB- or network- bound.)

Because of the GIL, a single Ruby process can only utilize one CPU core (no matter how many threads it spins out), so it seems like a forking model would be an obvious performance win. However, we generally weren't seeing very many CPU-bound jobs that would benefit from workers utilizing multiple cores. So this further lowered the priority of a forking approach for us. Obviously, this is going to vary from app to app, but knowing that Ractors are on their way as a potential path forward, we decided to keep things simpler and avoid introducing...

3. The need for process-management capabilities.

A forking process manager requires a whole new layer of process-management concerns (i.e. what should be done when a child process dies, or uses too much memory, or becomes otherwise unresponsive, etc). Thus far, we've avoided baking in our own process manager, and instead have been able to rely on our deployment infrastructure for dealing with process management/scaling (we use a kubernetes-backed platform, so it works well, but Heroku dynos work similarly). And so the philosophy of this gem has been to follow "The Process Model", meaning that some outside process manager (k8s, heroku, systemd, etc) should be in charge of scaling out the processes (in containers resourced to fit their needs), and restarting crashed or unhealthy processes (and notifying us, or logging the occurrence, accordingly).

Given the above (particularly point number 3), we decided to remove Delayed::Command in favor of a single entrypoint: rake delayed:work (aliased as rake jobs:work). So we have already diverged a fair bit from the ancestor repos. That doesn't mean we can't add forking/pooling to the rake task itsef, but I don't think you can port your changes over 1:1.

I'd also like to think more about what Puma-style forking could get us that we couldn't already get by scaling out additional K8s pods (or dynos, etc). I think there's a great case for better utilizing all of the resources available to larger containers -- e.g. if you've provisioned very large Dynos for your workers, a Puma-like forking model might help you make the best use of those resources. (Is that the kind of use case you had in mind for this?) Would love to get your thoughts on all of this! 🙏

Hi thanks for the response! There's a lot of ground to cover here...

First more background, in terms of Puma, we run on K8s and using Puma we like to be able to tune both threads and processes. We run large K8s pods, e.g. 16 cores with 1 process per core, 5-20 threads per process. AFAIK the GIL prevents Ruby from effectively using multiple cores in a multi-threaded app, so you really need both multi-process and multi-core.

I've run my multi-process fork of Delayed Job in production at TableCheck for 6+ months; we've served billions of jobs each month so its demostrably robust/stable.

To respond to your points:

  1. The job claim/pickup query is usually the bottleneck

I use MongoDB and I've managed to optimize this. The problem I observed when using many different queues was "whiffing" on queries, so for example, I'd have 1,000,000 jobs from queue A at-rest, and a worker queue B scans all 1,000,000 for any job B items and doesn't find any. The trick here was to shard my different job queues into different database tables. (*By the way, I'd need to add MongoDB support to your repo.)

  1. We observed a lack of CPU-bound jobs. (Most jobs are DB- or network- bound.)

I have many Ruby-heavy CPU-bound computation jobs.

  1. The need for process-management capabilities

Not needed, we can live without this in the first interation. Puma implements inter-process communication (IPC) via UNIX pipes and I started to port it to DelayedJob its pretty nifty.

We decided to remove Delayed::Command in favor of a single entrypoint rake delayed:work

I was going to propose to add this back :) If we get into process management etc. think you'd want to do it without rake, but can of course keep Rake. This is another thing I refactored in Puma (not merged).

Ah, makes sense. So your aim is to get as much as you can out of fewer, more heavily-resourced pods, and the pickup query is perhaps less of a bottleneck for you given your datastore & table sharding strategy. And given your CPU-bound jobs, you do benefit from utilizing multiple cores, beyond just the improved-on-net memory efficiency. Out of curiosity, in a perfect world where Ractors are stable everywhere, would you consider going back to a non-forking model?

*By the way, I'd need to add MongoDB support to your repo.

As it stands, we're not looking to support anything other than SQL RDBMs (and ActiveRecord). More broadly we'd like to keep this gem focused around workloads that we can test and maintain in-house, so at least for now that will probably mean a much stronger focus on non-compute-bound workloads, which deemphasizes the need for Puma-style forking. Our current aim is really to only introduce as much complexity as is necessary to meet our own scale needs, so I'm having trouble reconciling our goals for this project with the additions you're proposing. 🙁

That's not to say that we don't want to support multi-core worker processes (I think, at least eventually, we do), but your need for MongoDB support would be a non-starter for us at this time. I hope that makes sense.

In a perfect world where Ractors are stable everywhere, would you consider going back to a non-forking model?

Yes, if we had Ractors everywhere, then theoretically one Ruby process would be able to use multiple cores (one core per Ractor), and each Ractor having its own intepreter lock and multiple threads. We will have to benchmark to see if multi-process might still provide better perfomance. It is reasonable to consider multi-process as a stepping stone to multi-Ractor. It would be good to see Puma or other major server adopt Ractors first, then I could port it.

MongoDB support would be a non-starter for us at this time

It's actually easy to do. All the MongoDB (Mongoid) logic can be contained in one model file that has the same interface as the ActiveRecord.

I'm a maintainer of the current DelayedJob MongoDB plugin (as well as 20+ other MongoDB-related Ruby libs) and am happy to support the MongoDB code here. If you won't accept MongoDB PR and I can't find some way of patching the library, it won't be worth my time to contribute here. I may have to fork and release separately :(

I hear you on how compatible the interface is! Unfortunately, we removed the multi-backend config, and bringing it back does not align with the goals of this project. At the very least because we're looking to extend/change that very interface in the future, and we won't be able to support tech that we don't use in house, or vouch for its alignment with the guarantees we'd like to make.

We're also reshaping this gem's role primarily as a backend for ActiveJob, and will likely continue to remove features that aren't in support of that. (We want to make it easy to reach for the ActiveJob queue adapter that is right for the job, without any additional layers of configuration.)

When we did the work to pull in delayed_job_active_record, I did very much see and appreciate the parallel work that had gone into delayed_job_mongoid, and I could see your proposed improvements becoming part of an all-in-one package similar to delayed, either ActiveJob-focused (like delayed) or standalone. Given that there are very few backends for delayed_job that are actively maintained these days, and combined with ActiveJob's new role in the ecosystem, I question whether we can't shed the multi-backend functionality that DJ was originally built with (almost 15 years ago now). Perhaps a future major release of delayed_job_mongoid could merge in the rest of delayed_job's functionality and drop it as a dependency? If you look at our commit history, we actually merged the two repos and retained the commit history from both (for posterity, and to give proper credit to the decade+ of work that went into them).

We're also reshaping this gem's role primarily as a backend for ActiveJob

ActiveJob is backend agnostic. The decision to use ActiveJob's job interface is not related to using ActiveRecord vs. Mongoid for persistence.

Perhaps a future major release of delayed_job_mongoid could merge in the rest of delayed_job's functionality and drop it as a dependency?

This doesn't make sense. DelayedJob's functionality is very heavy, DelayedJob Mongoid is ~100 lines of code.

As a compromise, would you accept if I make a delayed_mongoid gem which is pinned to a certain version of Betterment/delayed? The ActiveRecord code can continue to live in the main repo, and all you'd need to do is let me make a simple config interface for pluggable backend. You are then free to refactor your code as you like, break interfaces, etc. and I will periodically raise PRs if anything breaks and upgrade my delayed_mongoid gem to work with the latest Betterment/delayed version. Sound good?

👋 I hear you, I really do. The Ruby background job ecosystem is something that I care a lot about, and I want to see continued development and iteration in this space, so I'm going to do my best to explain why I don't see this particular proposal working out for us in the long run.

The decision to use ActiveJob's job interface is not related to using ActiveRecord vs. Mongoid for persistence.

As a user of ActiveJob, the decision of which queue adapter to use absolutely relates to the datastore that will back it. For example, setting queue_adapter to :sidekiq or :resque means you will be using Redis, and this will inform the way that you then go about enqueuing jobs (e.g. with after_commit hooks) and managing your job queue (e.g. needing a dead letter queue, since Redis doesn't have the same row-locking semantics as SQL and jobs are instead dequeued when they are picked up).

As such, our focus with this gem is on providing a very targeted experience for folks who specifically want their queue adapter backed by their app's own SQL datastore, and we have provided examples of how this informs enqueuing & operating jobs as well as how to monitor the queue. The Continuous Monitoring feature (rake delayed:monitor) is also tightly coupled to ActiveRecord, as it will run aggregate queries at a regular interval. We have no plans to add non-ActiveRecord support to this feature.

This doesn't make sense. DelayedJob's functionality is very heavy, DelayedJob Mongoid is ~100 lines of code.

I completely understand that, regardless of your datastore, many of the pieces of delayed_job feel like an overlapping concern in any queueing system you'd want for your app. But that's the role of the delayed_job gem, and not delayed. We intentionally decided to move away from delayed_job's philosophy of sharing the generic queue concerns across multiple backends. Instead, delayed is a vertically-integrated, SQL-based adapter, full stop. And we knew going in that this would mean taking on the maintenance story for the more heavy functionality that delayed_job also provides.

As an example, resque is heavily inspired by delayed_job, yet takes on the full worker/queuing concerns instead of simply acting as another DJ backend. So that's where I was coming from with my suggestion. We have delayed, que, and good_job (and others) targeting flavors of SQL, plus sidekiq and resque (etc) targeting Redis, but I'm not yet aware of a queue adapter gem that specifically targets MongoDB in a fully-integrated fashion (potentially with MongoDB-focused performance improvements like that sharding strategy you mentioned). Seems like a gap in the ecosystem, and a potential opportunity for something new! 🤔

But if spinning out a standalone gem is not an appealing option for you, it makes me think that despite any challenges you've had in engaging with delayed_job's maintainers, that project is still by far the best fit in the ecosystem for slotting in a ~100-line MongoDB backend, because it's designed specifically to support that level of extensibility. I really do empathize with your desire to find an alternative, and you don't need our blessing to pin dependencies to this gem or fork it and patch it in ways that work for you. But for all of the reasons that we chose to produce this gem in the first place, I'm relatively certain that we can't provide you with the project maintenance story you're looking for.

@smudge

The decision of which queue adapter to use absolutely relates to the datastore that will back it. For example, setting queue_adapter to :sidekiq or :resque means you will be using Redis ...

MongoDB is not Redis or Redis-like. For all intents-and-purposes, MongoDB does everything that a SQL database does.

In addition, by design, the Mongoid gem has a 95% similar interface to ActiveRecord. delayed:monitor for example would trivial to add for Mongoid.

I completely understand why Redis would be out-of-scope, I understand well the limitations of Sidekiq/Redis, and I am not asking to have a Redis, Memcached, etc. queue adapter in this gem.

Would you be willing to have an open-mind if I submit a PR which creates a basic API for a pluggable "SQL/ActiveRecord-like" backend, and consider merging it if it doesn't significantly increase code complexity? The PR will not contain any MongoDB/Mongoid specific code.

I'd really like to collaborate on this gem and I feel I/my team at TableCheck can add a lot of value, given what we've already built on Delayed Job.

Hi Johnny, appreciate you dialing back some of the stronger typography in your initial version of your most recent comment. Please remember that Nathan and others have put a lot of thought into the philosophy behind delayed and the design decisions aren’t based on shallow misunderstandings of the differences between data stores, so if you want to engage on whether we should flex those decisions it’ll be important to work to understand them more deeply. If you read carefully you’ll note that Nathan is not making the argument that we chose not to support mongodb because mongodb is like redis.

Also just wanted to give you a heads up that Nathan is on vacation right now, and I don’t know how online he plans to be, but he may follow up with more thoughts.

@jmileham when the team forked DelayedJob, they copied in the ActiveRecord adapter Job class and removed ~50 lines of code related to pluggable backends. This removal specifically does not reduce the complexity of how the core of Delayed/DelayedJob works. They've simply gone from N adapters to 1 adapter.

Other improvements in Delayed over DelayedJob (e.g. multi-threading) are certainly great, but specifically the removal (or re-addition) of pluggable backends does not affect that good stuff.

As the structure has not fundamentally changed since DelayedJob, the work to add the pluggable backends code back is minimal. But fine if the team doesn't want to accept a PR for it, my team will just maintain our own fork.