clean_consumed not functioning
pbygrave-lucid opened this issue · comments
Logstash information:
Please include the following information:
- Logstash version: logstash 8.7.0
- Logstash installation source: Ansible install via geerlingguy roles
- How is Logstash being run (e.g. as a service/service manager): systemd
- How was the Logstash Plugin installed: Via ansible roles install
- Plugin version: logstash-input-dead_letter_queue **(2.0.0)**
JVM (e.g. java -version):
- Java version:
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1, mixed mode, sharing)
- JVM installation source (e.g. from the Operating System's package manager, from source, etc): Geerlingguy ansible role
OS version (uname -a if on a Unix-like system):
20.04.1-Ubuntu
Description of the problem including expected versus actual behavior:
The clean_consumed option does not remove consumed segments as described in the documentation.
ll data/dead_letter_queue/main/
total 84
drwxr-xr-x 2 logstash logstash 4096 Apr 4 19:07 ./
drwxr-xr-x 4 logstash logstash 4096 Apr 4 18:46 ../
-rw-r--r-- 1 logstash logstash 0 Apr 4 19:06 .lock
-rw-r--r-- 1 logstash logstash 23578 Apr 4 18:55 1.log
-rw-r--r-- 1 logstash logstash 23576 Apr 4 18:57 2.log
-rw-r--r-- 1 logstash logstash 23578 Apr 4 19:07 3.log
-rw-r--r-- 1 logstash logstash 1 Apr 4 19:07 4.log.tmp
-rw-r--r-- 1 logstash logstash 0 Apr 4 18:46 dlq_reader.lock
These segments each contain 3 log messages that were rejected by Elasticsearch with a 400 mapping error. They were correctly passed to the DLQ, edited by a filter and re-processed, and I can now see them in the intended index. But the consumed segments have not been removed from disk, which is the intended purpose of clean_consumed.
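To make the before/after comparison repeatable, here is a minimal sketch that lists the sealed segments in a pipeline's DLQ directory. It assumes only the layout visible in the listing above (sealed segments named `<number>.log`, alongside `.lock`, `dlq_reader.lock` and an in-progress `<number>.log.tmp`); the function name `list_dlq_segments` is hypothetical and not part of Logstash.

```python
from pathlib import Path


def list_dlq_segments(dlq_dir):
    """Return sorted (segment_number, size_in_bytes) pairs for sealed segments.

    Assumption: sealed segments are regular files named <number>.log, as in
    the directory listing above. Lock files and the <number>.log.tmp head
    segment are skipped.
    """
    segments = []
    for entry in Path(dlq_dir).iterdir():
        # "4.log.tmp" has suffix ".tmp" and ".lock" has no suffix, so both are ignored.
        if entry.is_file() and entry.suffix == ".log" and entry.stem.isdigit():
            segments.append((int(entry.stem), entry.stat().st_size))
    # Sort numerically so that 10.log comes after 2.log.
    return sorted(segments)
```

Calling `list_dlq_segments("data/dead_letter_queue/main")` before and after the dlq pipeline has processed the events gives two lists to compare. If clean_consumed were working as documented, I would expect fully consumed segments to be missing from the second list.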
Steps to reproduce:
logstash.yml DLQ settings:
dead_letter_queue.enable: true
dead_letter_queue.max_bytes: 1024mb
dead_letter_queue.flush_interval: 5000
dead_letter_queue.storage_policy: drop_newer
path.dead_letter_queue: /usr/share/logstash/data/dead_letter_queue
pipelines.yml file:
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 3
  dead_letter_queue.enable: true
- pipeline.id: dlq
  path.config: "/etc/logstash/dlq_conf.d/*.conf"
  pipeline.workers: 1
DLQ input file:
input {
  dead_letter_queue {
    path => "/usr/share/logstash/data/dead_letter_queue"
    pipeline_id => "main"
    clean_consumed => true
    commit_offsets => true
  }
}
Looking at the documentation here:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-dead_letter_queue.html
It says that when clean_consumed is set to true, commit_offsets must also be set to true, which I've done. It also states that sincedb tracks the checkpoint of the DLQ, but I cannot find any trace of checkpoint files being written in <path.data>/plugins/inputs/dead_letter_queue:
$ ll /usr/share/logstash/data/
total 12
drwxr-xr-x 3 logstash logstash 4096 Apr 3 21:21 ./
drwxr-xr-x 12 root root 4096 Apr 3 20:50 ../
drwxr-xr-x 4 logstash logstash 4096 Apr 4 18:46 dead_letter_queue/
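One thing worth ruling out is that path.data points somewhere other than the directory listed above, since it is configured separately from path.dead_letter_queue. The following is a rough sketch for searching any candidate directory for checkpoint files. It assumes only that the file name contains the string "sincedb"; the exact file name and subdirectory are not confirmed here, and the helper name `find_sincedb_files` is hypothetical.

```python
import os


def find_sincedb_files(path_data):
    """Recursively collect files under path_data whose name contains 'sincedb'.

    Assumption: the plugin's checkpoint file has 'sincedb' somewhere in its
    name. Hidden files (leading dot) are included because os.walk does not
    filter them out. A missing directory simply yields an empty list.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(path_data):
        for name in filenames:
            if "sincedb" in name:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Running `find_sincedb_files` against each directory that could be path.data (the value set in logstash.yml, or the one passed on the command line) shows whether a checkpoint is being written anywhere at all. An empty result everywhere would support the observation that no offsets are being committed.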
The DLQ is functioning correctly, but without cleaning up the consumed log segments it is not fit to be released into my production environment. The documentation here:
https://www.elastic.co/guide/en/logstash/current/dead-letter-queues.html#auto-clean
suggests there may be a formatting issue, but I'm not sure whether that's an error in the docs.
Provide logs (if relevant):
[2023-04-04T19:31:50,788][INFO ][logstash.javapipeline ][dlq] Pipeline started {"pipeline.id"=>"dlq"}
[2023-04-04T19:31:50,800][DEBUG][logstash.javapipeline ] Pipeline started successfully {:pipeline_id=>"dlq", :thread=>"#<Thread:0x5874ce7d run>"}
[2023-04-04T19:31:50,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:31:51,200][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,235][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,238][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,243][DEBUG][logstash.filters.mutate ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,572][DEBUG][logstash.outputs.opensearch][dlq][570ab0ae3d602ac4e8f85245ef9b1b948e46d98a72b514ee9bd097e60bd9ea6d] Sending final bulk request for batch. {:action_count=>6, :payload_size=>17759, :content_length=>17759, :batch_offset=>0}
[2023-04-04T19:31:54,371][INFO ][logstash.agent ] Pipelines running {:count=>2, :running_pipelines=>[:dlq, :main], :non_running_pipelines=>[]}
[2023-04-04T19:31:55,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:00,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:10,832][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:15,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:25,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:30,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.