OptimalBits / bull

Premium Queue package for handling distributed jobs and messages in NodeJS.

[Bug] removeRepeatableByKey does not remove the last delayed job

annieTechno opened this issue · comments

Description

When a repeatable job is removed by key with removeRepeatableByKey, one delayed job is still left in the queue. Similar issues were reported before and appeared to be fixed, but the problem persists in version 4.1.5. It happens when the job has run at least once before removal: the delayed job scheduled for the next run stays in the queue even after the repeatable job is removed.

Code:
const job = await queue.add(
  { identifier },
  {
    // Backticks are required for the template literal to interpolate.
    jobId: `${identifier}:${pattern}`,
    repeat: { tz: scheduling.tz, every: 3000 },
  },
);

Then, once the job has run at least once, call the removal:

await queue.removeRepeatableByKey(job.opts.repeat.key);

Bull version

4.1.5

Additional information

This is critical for us: it is happening in production and causes unneeded job runs.
The only workaround we have now is to fetch all delayed jobs with getDelayed and look for the jobId that corresponds to our job. That performs very poorly and should be fixed on the library side. We cannot simply clean all delayed jobs, because other jobs are running in the same queue.
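
For reference, a minimal sketch of that workaround. It assumes the leftover delayed job carries the same `opts.repeat.key` as the repeatable job (the same field used in the reproduction code in this thread). `removeRepeatableAndDelayed`, `QueueLike`, and `DelayedJobLike` are hypothetical names introduced here for illustration; they are not part of Bull. The interfaces list only the methods the helper calls, which a Bull queue and its jobs are expected to provide.

```typescript
// Hypothetical minimal shapes: only the members this sketch relies on.
interface DelayedJobLike {
  opts: { repeat?: { key?: string } };
  remove(): Promise<void>;
}

interface QueueLike {
  removeRepeatableByKey(key: string): Promise<void>;
  getDelayed(): Promise<DelayedJobLike[]>;
}

// Hypothetical helper: removes the repeatable definition, then removes any
// delayed job that still carries the same repeat key.
// Returns the number of leftover delayed jobs that were removed.
async function removeRepeatableAndDelayed(
  queue: QueueLike,
  repeatKey: string,
): Promise<number> {
  await queue.removeRepeatableByKey(repeatKey);

  // Scans every delayed job in the queue, so the cost grows with queue size.
  const delayed = await queue.getDelayed();
  const leftovers = delayed.filter(
    (job) => job.opts.repeat?.key === repeatKey,
  );

  await Promise.all(leftovers.map((job) => job.remove()));
  return leftovers.length;
}
```

Usage would look like `await removeRepeatableAndDelayed(queue, job.opts.repeat.key)`. Delayed jobs that belong to other repeatable jobs, or that are not repeatable at all, are left untouched.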

@manast Tagging here as we are experiencing this problem after the fix.

It would be great if you could produce the complete test case that reproduces the issue, as I have not been able to reproduce this in the past.

The key to reproducing it is to let the repeatable job run at least once, so that it creates the next delayed job, and only then remove it with removeRepeatableByKey.
The complete steps are:

Step 1: const job = await queue.add({ identifier: '1' }, { jobId: '1', repeat: { every: 60000 } });
Step 2: Store job.opts.repeat.key.
Step 3: Let the job run at least once (this is important).
Step 4: On user action, call
await queue.removeRepeatableByKey(job.opts.repeat.key);

Expected behaviour: Once the repeatable job is removed, all jobs related to it are removed and no further processing happens.
Current behaviour: The job is still processed once more, because one delayed job remains in the queue.

@annieTechno thanks for the information. But can you write the actual code that reproduces the issue? (Otherwise we have to write it ourselves, with the risk of not being able to reproduce it.)

Here is the code

// bull.service.ts

import { Injectable } from '@nestjs/common';
import { InjectQueue } from '@nestjs/bull';
import { Queue, JobOptions, Job } from 'bull';

@Injectable()
export class BullService {
  constructor(@InjectQueue('your_queue_name') private readonly queue: Queue) {}

  async scheduleJob(): Promise<Job> {
    const jobOptions: JobOptions = {
      repeat: { every: 60000 },
    };

    const job = await this.queue.add(
      { identifier: '1' },
      { jobId: '1', ...jobOptions },
    );

    return job;
  }

  async descheduleJob(job: Job): Promise<void> {
    await this.queue.removeRepeatableByKey(job.opts?.repeat?.key);
  }
}

And here is the calling code; the deschedule has to happen after the job has run at least once.

// scheduled-task.service.ts

import { Injectable, OnModuleDestroy, OnModuleInit } from '@nestjs/common';
import { Job } from 'bull'; // needed for the scheduledJob field type below
import { BullService } from './bull.service';

@Injectable()
export class ScheduledTaskService implements OnModuleInit, OnModuleDestroy {
  private scheduledJob: Job;

  constructor(private readonly bullService: BullService) {}

  async onModuleInit(): Promise<void> {
    // Schedule the job when the module is initialized
    this.scheduledJob = await this.bullService.scheduleJob();
  }

  async onModuleDestroy(): Promise<void> {
    // Deschedule the job when the module is destroyed
    if (this.scheduledJob) {
      await this.bullService.descheduleJob(this.scheduledJob);
    }
  }
}

@manast I've also switched to the bullmq package at its latest version, and the bug exists there too, so it affects both bull and bullmq.

Since you mentioned that you have tried with BullMQ too, here we have a test that checks precisely that the delayed job is also removed:
https://github.com/taskforcesh/bullmq/blob/master/tests/test_repeat.ts#L1179-L1180

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.