druid-io / druid-operator

Druid Kubernetes Operator

Automated Cleanup of 'druid_supervisors' metadata table not working

mustafakmal opened this issue

Affected Version

I am using druid-0.0.8

Description

I have a 5-node cluster:

  • 1 master node (overlord/coordinator)
  • 1 query node (router/broker)
  • 3 data nodes (historical/middlemanager)

I am trying to enable 'Automated Cleanup' of the metadata tables. Druid is configured to use Postgres as its metadata store. Following the documentation, I set the properties below on the coordinator to enable automated cleanup.

druid.service=druid/coordinator
# HTTP server threads
druid.coordinator.startDelay=PT30S
druid.coordinator.period=PT30S
# Configure this coordinator to also run as Overlord
druid.coordinator.asOverlord.enabled=true
druid.coordinator.asOverlord.overlordService=druid/overlord
druid.indexer.queue.startDelay=PT30S
druid.indexer.runner.type=remote
#Metadata Autocleanup
druid.coordinator.period.indexingPeriod=PT5M
druid.coordinator.kill.on=true
druid.coordinator.kill.period=PT10M
druid.coordinator.kill.durationToRetain=PT10M
druid.coordinator.kill.maxSegments=1000
druid.coordinator.period.metadataStoreManagementPeriod=PT10M
killAllDataSources=true

druid.coordinator.kill.supervisor.on=true
druid.coordinator.kill.audit.on=true
druid.coordinator.kill.rule.on=true
druid.coordinator.kill.compaction.on=true
druid.coordinator.kill.datasource.on=true

druid.coordinator.kill.supervisor.period=PT10M
druid.coordinator.kill.audit.period=PT10M
druid.coordinator.kill.rule.period=PT10M
druid.coordinator.kill.compaction.period=PT10M
druid.coordinator.kill.datasource.period=PT10M

druid.coordinator.kill.supervisor.durationToRetain=PT10M
druid.coordinator.kill.audit.durationToRetain=PT10M
druid.coordinator.kill.rule.durationToRetain=PT10M
druid.coordinator.kill.datasource.durationToRetain=PT10M
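
As a quick sanity check on a listing like the one above, the following is a minimal Python sketch that parses the properties and reports any cleanup type that is switched on but lacks a companion setting. The helper names (`parse_properties`, `missing_cleanup_keys`) and the `EXPECTED_SUFFIXES` table are hypothetical. The table simply mirrors the keys that appear in the configuration above (note that it lists no `durationToRetain` for compaction); it is an assumption, not an authoritative list of Druid properties.

```python
from typing import Dict, List

PREFIX = "druid.coordinator.kill."

# Assumption: companion keys per cleanup type, mirroring the posted config.
# Compaction appears there with a period but no durationToRetain.
EXPECTED_SUFFIXES = {
    "supervisor": ("period", "durationToRetain"),
    "audit": ("period", "durationToRetain"),
    "rule": ("period", "durationToRetain"),
    "compaction": ("period",),
    "datasource": ("period", "durationToRetain"),
}


def parse_properties(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_cleanup_keys(props: Dict[str, str]) -> List[str]:
    """List expected companion keys that are absent for each enabled type."""
    missing: List[str] = []
    for kind, suffixes in EXPECTED_SUFFIXES.items():
        # Only check types whose '.on' flag is explicitly true.
        if props.get(f"{PREFIX}{kind}.on", "false").lower() != "true":
            continue
        for suffix in suffixes:
            key = f"{PREFIX}{kind}.{suffix}"
            if key not in props:
                missing.append(key)
    return missing


if __name__ == "__main__":
    # Hypothetical excerpt: supervisor cleanup enabled without a retention.
    sample = """
    druid.coordinator.kill.supervisor.on=true
    druid.coordinator.kill.supervisor.period=PT10M
    """
    for key in missing_cleanup_keys(parse_properties(sample)):
        print("missing:", key)
```

Run against the full listing above, this sketch should report nothing missing, which suggests the supervisor cleanup properties themselves are all present.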

The 'druid_segments' and 'druid_pendingsegments' tables do get cleaned according to the configured 'durationToRetain' once the segments are in an unused state.

However, the 'druid_supervisors' table retains all records even though I have terminated the supervisor and the 'durationToRetain' has passed. The coordinator logs don't show any errors. If any further details are needed, let me know.

I am attaching a screenshot of the 'druid_supervisors' table.
[image: screenshot of the 'druid_supervisors' table]

This doesn't appear to be an issue with the operator. It might be a bug in the Druid version you are using. Can you provide the Druid version?