Ziinc / crawldis-old

jobber: crawl jobs should hold config state across nodes

Ziinc opened this issue

commented

Crawling management should be centralized and connected to all nodes. Each node runs a process that interfaces with the management node(s).

Each job starts a certain number of Requestors & Processors.

v1

  • Run on the same node as the cluster

v2

  • should broadcast stats to the management node
  • scaling requestors/processors based on monitoring (linear increase / buffer)
  • warning alerts
  • web api
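
The "linear increase / buffer" scaling idea in the v2 list could look roughly like the sketch below. It is written in Python purely for illustration. The names `ScalingPolicy` and `next_worker_count`, the default values, and the use of queue depth as the monitored signal are all assumptions, not part of crawldis.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; field names and defaults are assumptions."""
    step: int = 1          # workers added or removed per decision (linear change)
    buffer: int = 10       # queued items tolerated per worker before scaling up
    min_workers: int = 1
    max_workers: int = 20


def next_worker_count(current: int, queue_depth: int, policy: ScalingPolicy) -> int:
    """Return the requestor/processor count to use after one monitoring tick."""
    if queue_depth > current * policy.buffer:
        # Backlog exceeds the buffer of the current workers: add one step.
        target = current + policy.step
    elif queue_depth <= (current - 1) * policy.buffer:
        # Backlog would fit within the buffer of one fewer worker: remove one step.
        target = current - policy.step
    else:
        target = current
    # Never leave the configured bounds.
    return max(policy.min_workers, min(policy.max_workers, target))
```

The gap between the scale-up and scale-down thresholds keeps the count stable while the backlog hovers near capacity, which is one possible reading of "buffer" here.
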

commented

Implemented with Jobber, using a delta CRDT.

Jobber provides crawl job management functionality.
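
To illustrate how job config state can converge across nodes, here is a minimal last-writer-wins map sketch in Python. It is not Jobber's code and not the API of any CRDT library. `JobConfigReplica`, its methods, and the config keys are hypothetical, and the merge rule is a deliberately simplified stand-in for a real delta CRDT.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Each entry is (logical_clock, node_id, value); (clock, node_id) orders writes.
Entry = Tuple[int, str, Any]


@dataclass
class JobConfigReplica:
    """Hypothetical per-node replica of crawl job configs."""
    node_id: str
    entries: Dict[str, Entry] = field(default_factory=dict)
    clock: int = 0

    def put(self, key: str, value: Any) -> Dict[str, Entry]:
        """Write locally and return the delta to send to other nodes."""
        self.clock += 1
        delta = {key: (self.clock, self.node_id, value)}
        self.merge(delta)
        return delta

    def merge(self, delta: Dict[str, Entry]) -> None:
        """Join a delta into local state; safe to apply repeatedly or out of order."""
        for key, incoming in delta.items():
            existing = self.entries.get(key)
            if existing is None or incoming[:2] > existing[:2]:
                self.entries[key] = incoming
            # Keep the local clock ahead of every write seen so far.
            self.clock = max(self.clock, incoming[0])

    def get(self, key: str, default: Any = None) -> Any:
        entry = self.entries.get(key)
        return entry[2] if entry else default
```

A possible usage: node `a` calls `put("job-1", {"requestors": 2})`, sends the returned delta to node `b`, and `b.merge(delta)` leaves both replicas with the same config for `job-1`.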

To connect with a management interface and listen for management commands, a separate module namespace should be used. Connector? Bridger? Bridge? Attacher? Listener? I'm favouring Listener.
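
A rough sketch of what such a Listener might do, again in Python for illustration only: it receives management commands and dispatches them to registered handlers. The message shape (`command`, `payload`) and the method names `on` and `handle` are assumptions.

```python
from typing import Any, Callable, Dict


class Listener:
    """Hypothetical dispatcher for commands sent by a management interface."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, Any]], None]] = {}

    def on(self, command: str, handler: Callable[[Dict[str, Any]], None]) -> None:
        """Register the handler to call when `command` arrives."""
        self._handlers[command] = handler

    def handle(self, message: Dict[str, Any]) -> bool:
        """Dispatch one message; return False if no handler is registered."""
        handler = self._handlers.get(message.get("command"))
        if handler is None:
            return False
        handler(message.get("payload", {}))
        return True
```

Keeping this in its own namespace means the job management code never needs to know how commands arrive, only which handlers it registers.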