Automated Cleanup of 'druid_supervisors' metadata table not working

Question

mustafakmal opened this issue 2 years ago

Mustafa Akmal commented 2 years ago

Affected Version

I am using druid-0.0.8

Description

I have a 5 node cluster

1 master node (overlord/coordinator)
1 query node (router/broker)
3 data nodes (historical/middlemanager)

I am trying to enable 'Automated Cleanup' of the metadata tables. Druid is configured to use Postgres as the metadata store. Following the documentation, I set the properties below on the coordinator to enable automated cleanup.

druid.service=druid/coordinator
# HTTP server threads
druid.coordinator.startDelay=PT30S
druid.coordinator.period=PT30S
# Configure this coordinator to also run as Overlord
druid.coordinator.asOverlord.enabled=true
druid.coordinator.asOverlord.overlordService=druid/overlord
druid.indexer.queue.startDelay=PT30S
druid.indexer.runner.type=remote
#Metadata Autocleanup
druid.coordinator.period.indexingPeriod=PT5M
druid.coordinator.kill.on=true
druid.coordinator.kill.period=PT10M
druid.coordinator.kill.durationToRetain=PT10M
druid.coordinator.kill.maxSegments=1000
druid.coordinator.period.metadataStoreManagementPeriod=PT10M
killAllDataSources=true

druid.coordinator.kill.supervisor.on=true
druid.coordinator.kill.audit.on=true
druid.coordinator.kill.rule.on=true
druid.coordinator.kill.compaction.on=true
druid.coordinator.kill.datasource.on=true

druid.coordinator.kill.supervisor.period=PT10M
druid.coordinator.kill.audit.period=PT10M
druid.coordinator.kill.rule.period=PT10M
druid.coordinator.kill.compaction.period=PT10M
druid.coordinator.kill.datasource.period=PT10M

druid.coordinator.kill.supervisor.durationToRetain=PT10M
druid.coordinator.kill.audit.durationToRetain=PT10M
druid.coordinator.kill.rule.durationToRetain=PT10M
druid.coordinator.kill.datasource.durationToRetain=PT10M
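
One relationship in a configuration like this can be checked mechanically. The sketch below assumes, based on a reading of the Druid configuration reference that should be verified for the version in use, that each `druid.coordinator.kill.<type>.period` is expected to be at least `druid.coordinator.period.metadataStoreManagementPeriod`. The helper names (`duration_seconds`, `parse_properties`, `short_kill_periods`) are hypothetical, and the duration parser only handles the simple `PnDTnHnMnS` subset used above.

```python
import re

# Minimal ISO-8601 duration parser; covers only the PnDTnHnMnS subset.
_DURATION = re.compile(
    r"^P(?:(?P<d>\d+)D)?(?:T(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?)?$"
)


def duration_seconds(text):
    """Convert a duration such as 'PT10M' or 'P1DT2H' to seconds."""
    text = text.strip()
    match = _DURATION.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: int(v) if v else 0 for k, v in match.groupdict().items()}
    return parts["d"] * 86400 + parts["h"] * 3600 + parts["m"] * 60 + parts["s"]


def parse_properties(text):
    """Read key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def short_kill_periods(props):
    """Return kill.<type>.period keys shorter than the management period.

    Assumption: the per-type kill periods should not be shorter than
    druid.coordinator.period.metadataStoreManagementPeriod.
    """
    base = duration_seconds(
        props["druid.coordinator.period.metadataStoreManagementPeriod"]
    )
    pattern = re.compile(r"^druid\.coordinator\.kill\.(\w+)\.period$")
    return sorted(
        key
        for key, value in props.items()
        if pattern.match(key) and duration_seconds(value) < base
    )
```

For the configuration shown above, every per-type period and the management period are all `PT10M`, so `short_kill_periods` would return an empty list. This check would therefore not explain the behavior on its own.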

The 'druid_segments' and 'druid_pendingsegments' tables do get cleaned according to the configured 'durationToRetain' once the segments are in an unused state.

However, the 'druid_supervisors' table retains all records even though I have terminated the supervisor and the 'durationToRetain' has passed. The coordinator logs don't show any errors. If any further details are needed, let me know.
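
To confirm that rows really are past the retention window, the cutoff can be computed by hand from an export of the table. This is a minimal sketch, not Druid's actual cleanup logic. It assumes the table has `spec_id` and `created_date` columns holding ISO-8601 timestamps (verify this against your schema), and it does not model whether a supervisor is terminated. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_created_date(text):
    # Replace a trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def specs_past_retention(rows, retain, now):
    """Return spec ids whose newest row is older than now - retain.

    rows: iterable of (spec_id, created_date) pairs, for example exported
    from the 'druid_supervisors' table (column names are an assumption).
    """
    newest = {}
    for spec_id, created in rows:
        when = parse_created_date(created)
        if spec_id not in newest or when > newest[spec_id]:
            newest[spec_id] = when
    cutoff = now - retain
    return sorted(spec for spec, when in newest.items() if when < cutoff)
```

With `retain=timedelta(minutes=10)`, matching the `PT10M` setting above, any terminated supervisor that the function lists but that still has rows in the table after a few coordinator cycles would be a concrete case to include in a bug report.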

I am attaching a screenshot of the 'druid_supervisors' table.

Arun Cherla · Answer 1 · Sun Jun 19 2022 02:37:00 GMT+0800 (China Standard Time)

This doesn't appear to be an issue with the operator. It might be a bug in the Druid version you are using. Can you provide that version?