logstash-plugins / logstash-input-s3

Files being unprocessed with the same last modified timestamp

kaisecheng opened this issue · comments

S3's last_modified timestamp has one-second precision, so files sharing the same last_modified can end up being processed across two iterations. For every processed file, its timestamp is written to sincedb, and in the next iteration any file whose timestamp is smaller than or equal to (<=) the sincedb value is skipped. As a result, users occasionally see files with the same timestamp left unprocessed.
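To illustrate the behaviour described above, here is a minimal, hypothetical Ruby sketch (not the plugin's actual code): `sincedb_time` stands for the value stored in sincedb, and the object hashes stand for S3 listing entries. The strict greater-than comparison is what causes a file that shares the same second-precision `last_modified` as an already-processed file to be skipped.

```ruby
require "time"

# Hypothetical value read back from sincedb after a.gz was processed
# in a previous iteration.
sincedb_time = Time.parse("2021-05-01 10:00:00 UTC")

objects = [
  { key: "logs/a.gz", last_modified: Time.parse("2021-05-01 10:00:00 UTC") }, # already processed
  { key: "logs/b.gz", last_modified: Time.parse("2021-05-01 10:00:00 UTC") }, # never processed
  { key: "logs/c.gz", last_modified: Time.parse("2021-05-01 10:00:01 UTC") },
]

# Only objects strictly newer than the sincedb timestamp are picked up,
# so b.gz is skipped even though it was never processed, because it shares
# a.gz's second-precision last_modified.
new_files = objects.select { |obj| obj[:last_modified] > sincedb_time }
new_files.each { |obj| puts "processing #{obj[:key]}" }
# => only logs/c.gz is processed; logs/b.gz is silently dropped
```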

Related PR
#57
#189
#61
#192

Fixed in v3.6.0

@kaisecheng This is great, but is it merged into Logstash? Meaning, which version of Logstash uses the new plugin?

@yogevyuval Logstash 7.13

The problem still exists in Logstash 8.2.3 with logstash-input-s3 3.8.3. Files with the same last modified timestamp may be ignored, leading to data loss.

@sunfriendli Could you create a new issue in this repo with reproducing steps and log for further investigation?

Hello @kaisecheng, I created a new issue at #244