crawlab-team / crawlab

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

Home Page: https://www.crawlab.cn


After a Crawlab restart, tasks that were pending before the restart cannot continue to execute

ma-pony opened this issue · comments


Describe the bug
When I restarted Crawlab, tasks that were pending before the restart could not be scheduled and remained in the pending state.

To Reproduce
Steps to reproduce the behavior:

  1. Ensure that the current tasks list contains a pending task whose node is empty.
  2. Restart the master and worker nodes.
  3. After a successful restart, the pending tasks will remain pending.

Expected behavior
After restarting Crawlab, tasks that were previously pending should continue to be scheduled and executed.
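The expected behavior above amounts to a startup-recovery step: when the master comes back up, it should re-enqueue any task still marked pending in the persistent store, rather than only dispatching newly submitted tasks. Here is a minimal sketch of that idea; the `Scheduler` class, its field names, and the task schema are illustrative, not Crawlab's actual code or data model.

```python
# Hypothetical sketch of startup recovery for pending tasks.
# Assumption: the task store (MongoDB in Crawlab) survives a restart,
# while the dispatch queue (in memory or Redis) does not.

from collections import deque


class Scheduler:
    def __init__(self, task_store):
        self.task_store = task_store  # persistent store, survives restarts
        self.queue = deque()          # dispatch queue, empty after a restart

    def startup_recovery(self):
        """Re-enqueue tasks left in 'pending' state before the restart."""
        for task in self.task_store:
            if task["status"] == "pending":
                self.queue.append(task["id"])

    def run_next(self):
        """Dispatch the next queued task, marking it running."""
        if not self.queue:
            return None
        task_id = self.queue.popleft()
        for task in self.task_store:
            if task["id"] == task_id:
                task["status"] = "running"
                return task_id
        return None


# Simulate a restart: the store survives, the queue starts empty.
store = [{"id": "t1", "status": "pending"}, {"id": "t2", "status": "finished"}]
sched = Scheduler(store)   # fresh process with an empty queue
sched.startup_recovery()   # without this step, t1 would stay pending forever
print(sched.run_next())    # t1
```

Without the `startup_recovery()` call, `run_next()` returns `None` and `t1` never leaves the pending state, which matches the reported symptom.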

@ma-pony hey!
I would appreciate it if you could provide an English translation of your issue.
I'm not sure Google Translate handles it properly.
Thanks :)


Thanks for the suggestion!

@ma-pony thanks for your translation.
I have the same issue. In my case I have > 400 scheduled tasks, and a lot of them stay in the pending state and are never executed.
It seems the Redis queues aren't cleared properly.
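If the hypothesis above is right, the failure mode is a mismatch between two sources of truth: the task store still says pending, but the broker queue (Redis here) no longer holds the corresponding entries after the restart. A quick way to reason about it is to list tasks that are pending in the store yet absent from the queue; the function and field names below are illustrative, not Crawlab's actual schema.

```python
# Hypothetical diagnostic sketch: find tasks stranded by a restart.
# Assumption: "pending" status lives in the task store, while dispatch
# happens only for IDs present in the broker queue.

def find_orphaned(tasks, queued_ids):
    """Return IDs of tasks marked pending in the store but missing from the queue."""
    return [t["id"] for t in tasks
            if t["status"] == "pending" and t["id"] not in queued_ids]


tasks = [
    {"id": "a", "status": "pending"},
    {"id": "b", "status": "pending"},
    {"id": "c", "status": "finished"},
]

# After a restart the queue no longer contains the old entries:
print(find_orphaned(tasks, queued_ids=set()))  # ['a', 'b'] are stuck
```

Any ID this returns is a task that no scheduler pass will ever pick up, which would explain tasks sitting in pending indefinitely at 400+ scale.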


I have 6k+ scheduled tasks...

@ma-pony 6k+ scheduled tasks, that's impressive.
What kind of Crawlab installation are you using, Docker-based or Kubernetes, and what infrastructure are you running on?
Could you please share some details?


https://docs.crawlab.cn/en/guide/installation/kubernetes.html

Four 8C16G machines, deployed using k8s, with one master node and six worker nodes; each worker is configured with 4C16G and the master with 6C12G.