logstash-plugins / logstash-input-dead_letter_queue

Logstash's Dead Letter Queue Input

clean_consumed not functioning

pbygrave-lucid opened this issue

Logstash information:

Please include the following information:

  1. Logstash version : logstash 8.7.0
  2. Logstash installation source: Ansible install via geerlingguy roles
  3. How is Logstash being run (e.g. as a service/service manager): systemd
  4. How was the Logstash Plugin installed: Via ansible roles install
  5. Plugin version: logstash-input-dead_letter_queue **(2.0.0)**

JVM (e.g. java -version):

  1. Java version:
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1, mixed mode, sharing)
  2. JVM installation source (e.g. from the Operating System's package manager, from source, etc): Geerlingguy ansible role

OS version (uname -a if on a Unix-like system):

20.04.1-Ubuntu

Description of the problem including expected versus actual behavior:

The clean_consumed option does not delete consumed segments as the documentation describes.

ll data/dead_letter_queue/main/
total 84
drwxr-xr-x 2 logstash logstash  4096 Apr  4 19:07 ./
drwxr-xr-x 4 logstash logstash  4096 Apr  4 18:46 ../
-rw-r--r-- 1 logstash logstash     0 Apr  4 19:06 .lock
-rw-r--r-- 1 logstash logstash 23578 Apr  4 18:55 1.log
-rw-r--r-- 1 logstash logstash 23576 Apr  4 18:57 2.log
-rw-r--r-- 1 logstash logstash 23578 Apr  4 19:07 3.log
-rw-r--r-- 1 logstash logstash     1 Apr  4 19:07 4.log.tmp
-rw-r--r-- 1 logstash logstash     0 Apr  4 18:46 dlq_reader.lock

These segments each contain 3 log messages that were rejected by Elasticsearch with a 400 mapping error. They were correctly passed to the DLQ, edited by a filter, and re-processed, and I can now see them in the intended index. However, the consumed segments have not been removed from disk, which is the intended purpose of clean_consumed.
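
To make the state of the queue directory easier to check after each run, here is a minimal Python sketch that sorts the file names from the listing above into groups. It assumes, based on that listing and my understanding of the DLQ writer, that `<n>.log` is a finished segment and `<n>.log.tmp` is the segment still being written; the function names are hypothetical and not part of Logstash.

```python
import re
from pathlib import Path

# Assumption from the listing above: "<n>.log" is a finished segment and
# "<n>.log.tmp" is the segment the writer is still appending to.
SEALED = re.compile(r"^(\d+)\.log$")
IN_PROGRESS = re.compile(r"^(\d+)\.log\.tmp$")


def _segment_number(name):
    """Return the leading integer of a segment file name."""
    return int(name.split(".", 1)[0])


def classify_segments(names):
    """Split file names into (sealed, in_progress, other) lists."""
    sealed, in_progress, other = [], [], []
    for name in names:
        if SEALED.match(name):
            sealed.append(name)
        elif IN_PROGRESS.match(name):
            in_progress.append(name)
        else:
            other.append(name)
    # Sort numerically so that 10.log comes after 9.log.
    sealed.sort(key=_segment_number)
    in_progress.sort(key=_segment_number)
    return sealed, in_progress, other


def classify_queue_dir(queue_dir):
    """Classify the files found directly inside a DLQ pipeline directory."""
    return classify_segments(p.name for p in Path(queue_dir).iterdir())


if __name__ == "__main__":
    # File names copied from the directory listing in this report.
    names = [".lock", "1.log", "2.log", "3.log", "4.log.tmp", "dlq_reader.lock"]
    sealed, in_progress, other = classify_segments(names)
    print("sealed:", sealed)
    print("in progress:", in_progress)
    print("other:", other)
```

If clean_consumed worked as I expect, the sealed list should shrink once the reader has fully consumed a segment.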

Steps to reproduce:

logstash.yml DLQ settings:

dead_letter_queue.enable: true
dead_letter_queue.max_bytes: 1024mb
dead_letter_queue.flush_interval: 5000
dead_letter_queue.storage_policy: drop_newer
path.dead_letter_queue: /usr/share/logstash/data/dead_letter_queue

pipelines.yml file:

- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 3
  dead_letter_queue.enable: true
- pipeline.id: dlq
  path.config: "/etc/logstash/dlq_conf.d/*.conf"
  pipeline.workers: 1

DLQ input file:

input {
  dead_letter_queue {
    path => "/usr/share/logstash/data/dead_letter_queue"
    pipeline_id => "main"
    clean_consumed => true
    commit_offsets => true
  }
}

Looking at the documentation here:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-dead_letter_queue.html

It says that when clean_consumed is set to true, then commit_offsets must also be set to true, which I've done. It also states that sincedb tracks the checkpoint of the DLQ, but I cannot find any trace of it writing any checkpointing files in <path.data>/plugins/inputs/dead_letter_queue:

$ ll /usr/share/logstash/data/
total 12
drwxr-xr-x  3 logstash logstash 4096 Apr  3 21:21 ./
drwxr-xr-x 12 root     root     4096 Apr  3 20:50 ../
drwxr-xr-x  4 logstash logstash 4096 Apr  4 18:46 dead_letter_queue/
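
To rule out the checkpoint file being written somewhere I did not look, a small Python sketch like the following can walk the whole data directory. The `.sincedb` file name prefix is an assumption on my part (I have not confirmed how the plugin names the file), and `find_sincedb_files` is a hypothetical helper, not part of Logstash.

```python
import os


def find_sincedb_files(data_dir, prefix=".sincedb"):
    """Return sorted paths of all files under data_dir whose name starts with prefix.

    The default prefix is an assumption about how the plugin names its
    checkpoint file; pass a different one if that turns out to be wrong.
    """
    matches = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.startswith(prefix):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Data directory taken from the listing in this report.
    for path in find_sincedb_files("/usr/share/logstash/data"):
        print(path)
```

An empty result would match what the listing shows: no checkpoint file anywhere under the data directory.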

The DLQ is functioning correctly, but without cleanup of the consumed log segments it is not fit for release into my production environment. The documentation here:

https://www.elastic.co/guide/en/logstash/current/dead-letter-queues.html#auto-clean

does suggest there may be a formatting issue, but I'm not sure whether that is an error in the docs.

Provide logs (if relevant):

[2023-04-04T19:31:50,788][INFO ][logstash.javapipeline    ][dlq] Pipeline started {"pipeline.id"=>"dlq"}
[2023-04-04T19:31:50,800][DEBUG][logstash.javapipeline    ] Pipeline started successfully {:pipeline_id=>"dlq", :thread=>"#<Thread:0x5874ce7d run>"}
[2023-04-04T19:31:50,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:31:51,200][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,235][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,238][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,243][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,572][DEBUG][logstash.outputs.opensearch][dlq][570ab0ae3d602ac4e8f85245ef9b1b948e46d98a72b514ee9bd097e60bd9ea6d] Sending final bulk request for batch. {:action_count=>6, :payload_size=>17759, :content_length=>17759, :batch_offset=>0}
[2023-04-04T19:31:54,371][INFO ][logstash.agent           ] Pipelines running {:count=>2, :running_pipelines=>[:dlq, :main], :non_running_pipelines=>[]}
[2023-04-04T19:31:55,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:00,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:10,832][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:15,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:25,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:30,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
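
For a longer log file, a rough Python sketch like this one counts lines per logger, which makes it easier to see whether the DLQ input logs anything at all. The regular expression is an assumption derived only from the line format shown above, and `count_by_logger` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt above:
# [timestamp][LEVEL ][logger.name   ] ...
LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>[A-Z]+)\s*\]\[(?P<logger>[^\]\s]+)\s*\]"
)


def count_by_logger(lines):
    """Count log lines per logger name, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[match.group("logger")] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the excerpt above.
    sample = [
        '[2023-04-04T19:31:50,788][INFO ][logstash.javapipeline    ][dlq] Pipeline started {"pipeline.id"=>"dlq"}',
        "[2023-04-04T19:31:55,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.",
    ]
    for logger, count in count_by_logger(sample).most_common():
        print(count, logger)
```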