OptimalBits / bull

Premium Queue package for handling distributed jobs and messages in NodeJS.

Help needed to use child processors in modern ES6 environment (require vs. import)

Twisterking opened this issue · comments

Hello everyone,

We use bull very heavily at Orderlion and absolutely love it. We now recently upgraded our worker AWS instance to a more powerful server and would now love to use the child processes setup that bull supports to really properly use our 4 CPU cores.

Until now, we just used the typical signature to process a job like this:

myQueue.process(concurrency, processor) // processor: (job) => Promise<any>

That worked beautifully! The function used to process the job of course needs all sorts of imports of all sorts of ES6 packages and all that.

To use child processors, I now wanted to do something like this (as outlined here):

myQueue.process(concurrency, path.join(__dirname, 'mySeparateProcessor.js'))

Inside of mySeparateProcessor.js I wanted to do something like this:

import foo from 'bar';
// all my other imports
// there are a lot!
// there are also imports from in-house and external ES6 only packages!

export default async function (job) {
  // process the job, return a Promise
}

I got the idea to do it like this also from here: #923
The code in the OP of that issue gave me the impression that something like this should work, but apparently it does not - at least not for me! It results in an error like this:

Exception while invoking method 'callOnWorker' Error: Error loading process file <path>/mySeparateProcessor.js. Must use import to load ES Module: <path>/mySeparateProcessor.js
require() of ES modules is not supported.
require() of <path>/mySeparateProcessor.js from <root-path>/node_modules/bull/lib/process/master.js is an ES module file as it is a .js file whose nearest parent package.json contains "type": "module" which defines all .js files in that package scope as ES modules.
Instead rename processImportQueueJob.js to end in .cjs, change the requiring code to use import(), or remove "type": "module" <root-path>/worker/package.json.

I have now tried all sorts of things: renaming the file to .cjs; using require() instead of import (which also fails, because many of my internal packages are ES modules themselves, so I cannot require() them); and await import()-ing my packages INSIDE the mySeparateProcessor.js function (which results in another weird error: Error: AbortController is not defined at Queue.onFailed <root-path>/app/node_modules/bull/lib/job.js:516:18) ...

everything I try does not work! 😢

--> in a nutshell:
What is the modern way to use the great child processors setup in a modern ES6 environment?

We are still on a somewhat outdated Node 14 environment, but will update to version 21 later this year. I don't think this is a Node issue per se, though; I just can't get the whole CJS vs. ES6 setup to work. :/

Really hope you can help, I am kinda stuck.

Sidenote: I also had a look into the docs of the bull-mq implementation and, to be honest, I am confused here too. Can an ES6 import statement AND a module.exports = statement inside the same .js file ever work, as outlined in the docs?

Thank you, best Patrick

Hello! Any updates on this? Any help is much appreciated. I am kinda stuck and really asking myself how this setup should be used in a modern ES6 project.

Bump ... is nobody using this package anymore?! 😢

You have been hit by the issues created by Node's bad interoperability between CommonJS and ES6 modules. In Bull, we are not going to open the can of worms of trying to solve this, as it is an old codebase and we are only fixing urgent bugs, not adding new functionality. I wrote a blog post a while ago on how I managed to use sandboxed processors on modern codebases (https://blog.taskforce.sh/using-typescript-with-bullmq/).
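For what it is worth, one pattern that tends to work in an otherwise ESM codebase is to keep the sandboxed processor file itself as CommonJS (a .cjs file, which Bull's master.js can require()) and load ESM-only dependencies with dynamic import(), which is legal inside CommonJS. A minimal sketch, with assumptions: the file name is made up, and node:timers/promises merely stands in for an ESM-only package; whether this works end to end also depends on the Node version in use.

```javascript
// mySeparateProcessor.cjs -- a sketch, not Bull's documented setup.
// Because the extension is .cjs, Node treats this file as CommonJS even
// when the nearest package.json contains "type": "module", so Bull's
// master.js can require() it without the ERR_REQUIRE_ESM error above.
const processor = async (job) => {
  // Dynamic import() is allowed inside CommonJS, so ESM-only packages
  // can still be loaded here. 'node:timers/promises' is only a
  // placeholder for such a package.
  const { setTimeout: sleep } = await import('node:timers/promises');
  await sleep(10); // simulate some work
  return { processed: job.data.value * 2 };
};

module.exports = processor;
```

The processor is still a plain async function of a job, so Bull can invoke it from the child process exactly as it would a required CommonJS module.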
For your case, however, if what you are interested in is using the 4 cores, then I think it is better to just spawn 4 workers; you can use pm2, for example. That way you will also be able to use a high concurrency setting if your jobs are IO bound: with a concurrency of 200 per worker, times 4 workers, you can process up to 800 jobs in parallel. That would be the simplest way to get a lot of value out of your extra cores.
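The multi-worker suggestion above can be sketched with a pm2 ecosystem file. This is only a sketch: the file and script names are assumptions, and each worker.js instance is assumed to call myQueue.process(200, ...) itself.

```javascript
// ecosystem.config.js -- pm2 configuration sketch (names are assumptions).
const config = {
  apps: [
    {
      name: 'bull-worker',
      script: './worker.js', // assumed entry point; registers the queue processor
      instances: 4,          // one worker process per CPU core
      exec_mode: 'fork',     // independent forks; Bull coordinates them via Redis
    },
  ],
};

module.exports = config;
```

With a concurrency of 200 inside each worker, the four instances allow up to 4 * 200 = 800 jobs in flight; `pm2 start ecosystem.config.js` would launch them.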

Thank you for your reply @manast ...
Indeed, I am aware that this is not really a bull issue per se; I was just asking for advice on how to make it work.

Since I am already running in a docker environment, it should be quite easy for me to spawn multiple identical workers!

Just to be sure: am I getting this right, that if I define a concurrency like this

myQueue.process(200, processor) // processor: (job) => Promise<any>

that, if I then spawn 4 identical docker containers, the ACTUAL concurrency will be 4 * 200 = 800, correct? So the defined concurrency is "per worker instance" and NOT "global", correct?

Yes, concurrency is per instance, not global.

Thank you!