Using async mutex across NodeJS processes

Question

lucas-rudd opened this issue 5 years ago

This is more of a question about how this package works and what limitations it may have for a specific use case.

I have written a command line utility which generates a vast amount of data, and then dumps the generated data into one large file.

I've noticed race conditions where one process reads the file and adds to it at the same time as another process is reading or writing the same contents. As a result, some data is dropped or overwritten in the final file, because what one process adds is not seen by the other process when it merges in the contents it has generated.

I'm attempting to use this library to add some read/write protection to the processes. I can also attempt to refactor the child processes into a single-process asynchronous task, but that will significantly slow down the operations, and requires a lot of extra work on my end.

However, this library seems to work only within one process, and the lock is not shared across the children (which makes sense), so each child process obtains its own lock.

I'm attempting to do the following.

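// mutex.ts
import { Mutex } from 'async-mutex';

// A single Mutex instance, exported so that other modules can import it.
export const mutex = new Mutex();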

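// child_process.ts
import { mutex } from './mutex';
import { v4 } from 'uuid';
import * as fs from 'fs';
import * as _ from 'lodash';
import * as yaml from 'js-yaml';

// baseFile, generateData, generateMoreData and moveData are defined elsewhere.
const uuid = v4(); // identifies this child process in the log output
let baseDefinition = {};

const definition = await generateData();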
await mutex.runExclusive(() => {
   console.log('LOCK ACQUIRED FOR', uuid);
   if (fs.existsSync(baseFile)) {
      baseDefinition = yaml.safeLoad(fs.readFileSync(baseFile, 'utf8'));
   }
   baseDefinition = _.merge(baseDefinition, definition);
   fs.writeFileSync(baseFile, yaml.safeDump(baseDefinition));
   console.log('RELEASING LOCK FOR', uuid);
});

const moreData = await generateMoreData();
await mutex.runExclusive(async () => {
   console.log('SECOND LOCK ACQUIRED FOR', uuid);
   await moveData(moreData, baseFile);
   console.log('SECOND LOCK RELEASED FOR', uuid);
});

The generateData and generateMoreData methods take some time, so I do not want to hold the lock while they run (which is why the single-threaded async/await approach takes significantly longer to complete, and why I'm spawning multiple child processes). The moveData method does some additional reading and writing on baseFile, so I simply acquire the lock before running it.

When running this, I get output that looks like this:

LOCK ACQUIRED FOR 526ac716-b135-4a2d-986d-75bee856411b
LOCK ACQUIRED FOR 104073b7-6d7f-4ca1-8513-eec103877898
RELEASING LOCK FOR 526ac716-b135-4a2d-986d-75bee856411b
SECOND LOCK ACQUIRED FOR 526ac716-b135-4a2d-986d-75bee856411b
RELEASING LOCK FOR 104073b7-6d7f-4ca1-8513-eec103877898
SECOND LOCK ACQUIRED FOR 104073b7-6d7f-4ca1-8513-eec103877898
SECOND LOCK RELEASED FOR 526ac716-b135-4a2d-986d-75bee856411b
LOCK ACQUIRED FOR 155eb578-bc91-4bee-830f-bab2e1363bb8
SECOND LOCK RELEASED FOR 104073b7-6d7f-4ca1-8513-eec103877898
LOCK ACQUIRED FOR 8e6e08cf-8b3b-4040-af88-86670ad5cbb1
RELEASING LOCK FOR 155eb578-bc91-4bee-830f-bab2e1363bb8
SECOND LOCK ACQUIRED FOR 155eb578-bc91-4bee-830f-bab2e1363bb8
LOCK ACQUIRED FOR cb79f948-e4a4-4f75-ba31-fa2b743ce886
RELEASING LOCK FOR 8e6e08cf-8b3b-4040-af88-86670ad5cbb1
SECOND LOCK ACQUIRED FOR 8e6e08cf-8b3b-4040-af88-86670ad5cbb1
SECOND LOCK RELEASED FOR 155eb578-bc91-4bee-830f-bab2e1363bb8
RELEASING LOCK FOR cb79f948-e4a4-4f75-ba31-fa2b743ce886
SECOND LOCK ACQUIRED FOR cb79f948-e4a4-4f75-ba31-fa2b743ce886
SECOND LOCK RELEASED FOR 8e6e08cf-8b3b-4040-af88-86670ad5cbb1
SECOND LOCK RELEASED FOR cb79f948-e4a4-4f75-ba31-fa2b743ce886

As you can see, the locks clearly are not shared across processes, as the locks are acquired and released independently of one another.

Is this a limitation of the library, or of NodeJS child processes?

Christian Speckner · Answer 1 · Fri Dec 13 2019 03:50:57 GMT+0800 (China Standard Time)

Hi Lucas!

What happens is that every child process creates its own instance of Mutex when it imports mutex.ts. In order to synchronise separate processes you will have to use IPC mechanisms like file locks or POSIX semaphores. async-mutex does not provide this; it only allows you to synchronise asynchronous tasks within a single JavaScript VM.
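
To make the file-lock suggestion concrete, here is a minimal sketch of a lock file built only on Node's fs module. It relies on the fact that opening a file with the 'wx' flag fails with EEXIST when the file already exists, so only one process at a time can create the lock file. The names withFileLock, acquire and sleep, and the retry and timeout values, are hypothetical and are not part of async-mutex. The sketch also assumes that every process uses the same lock path and that a crashed process leaving a stale lock file behind is dealt with separately.

```typescript
import * as fs from 'fs';

// Hypothetical helper: resolve after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: keep trying to create the lock file exclusively.
async function acquire(
  lockPath: string,
  retryDelayMs: number,
  timeoutMs: number,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      // 'wx' creates the file and fails with EEXIST if it already exists,
      // so at most one process can hold the lock at a time.
      return fs.openSync(lockPath, 'wx');
    } catch (err) {
      if ((err as { code?: string }).code !== 'EEXIST') throw err;
      if (Date.now() > deadline) {
        throw new Error(`Timed out waiting for lock ${lockPath}`);
      }
      await sleep(retryDelayMs); // another process holds the lock; retry later
    }
  }
}

// Hypothetical helper: run `task` while holding the lock file, then release it.
async function withFileLock<T>(
  lockPath: string,
  task: () => T | Promise<T>,
  retryDelayMs = 25,
  timeoutMs = 10000,
): Promise<T> {
  const fd = await acquire(lockPath, retryDelayMs, timeoutMs);
  try {
    return await task();
  } finally {
    // Release the lock even if the task throws.
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}
```

In the child process from the question, the read, merge and write of baseFile would then run inside something like withFileLock(baseFile + '.lock', ...) in place of mutex.runExclusive(...), while generateData and generateMoreData stay outside the lock as before. Polling with a delay is the simplest approach, not the most efficient one; a lock based on OS-level primitives would avoid both the polling and the stale-lock problem.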