Kibana in `dev` returns duplicate hits for a single log record
hannes-ucsc opened this issue · comments
@hannes-ucsc are you still seeing this behavior? I can't seem to replicate this.
Yes, occasionally. How did you attempt to reproduce it?
@hannes-ucsc I dug through some log lines similar to the ones you showed and tried to find duplicates.
@mweiden I saw this at considerable scale yesterday. I was expecting 130k log messages for a particular Azul operation but got 160k. Here is a query showing such a message that was indexed in quadruplicate:
The key takeaway is that we have four distinct values for `_id` (the ES document identifier) but the same value for `@id` (the CloudWatch log record identifier IIRC). Can we use `@id` to populate `_id`?
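To illustrate the suggestion, here is a minimal sketch of how an indexer Lambda could reuse the CloudWatch record id as the Elasticsearch `_id`. It is not the actual indexer code. The function names, the index name, and every document field other than `@id` are assumptions made for the example. It relies only on the CloudWatch Logs subscription payload format (base64-encoded, gzip-compressed JSON with a `logEvents` list whose entries carry `id`, `timestamp` and `message`) and on the bulk API's `index` action, which overwrites an existing document with the same `_id` instead of adding a new one.

```python
import base64
import gzip
import json


def decode_subscription_payload(event):
    """Decode the base64-encoded, gzip-compressed payload that
    CloudWatch Logs delivers to a subscribed Lambda function."""
    raw = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(raw))


def bulk_actions(payload, index_name):
    """Yield (action, document) pairs for the Elasticsearch bulk API.

    The CloudWatch record id is used as `_id`, so re-sending the same
    record overwrites the earlier copy rather than creating a duplicate.
    """
    for log_event in payload["logEvents"]:
        action = {"index": {"_index": index_name, "_id": log_event["id"]}}
        document = {
            "@id": log_event["id"],
            # The field names below are hypothetical, not the real mapping.
            "@timestamp": log_event["timestamp"],
            "@message": log_event["message"],
            "@log_group": payload["logGroup"],
            "@log_stream": payload["logStream"],
        }
        yield action, document


def to_ndjson(pairs):
    """Serialize the pairs into the newline-delimited body the bulk API expects."""
    lines = []
    for action, document in pairs:
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    return "\n".join(lines) + "\n"
```

With ids assigned this way, a retried invocation that re-sends records it had already indexed would bump the version of the existing documents instead of inflating the hit count.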
I think this may have something to do with an OOM issue we're seeing in the dev deployment...
OOM report
REPORT RequestId: 04f76c37-fca4-11e8-a7ec-e52492b51ce1 Duration: 5983.41 ms Billed Duration: 6000 ms Memory Size: 1024 MB Max Memory Used: 1024 MB
This is retried
RequestId: 04f76c37-fca4-11e8-a7ec-e52492b51ce1 Process exited before completing request
...
RequestId: 04f76c37-fca4-11e8-a7ec-e52492b51ce1 Process exited before completing request
...
RequestId: 04f76c37-fca4-11e8-a7ec-e52492b51ce1 Process exited before completing request
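If the OOM hypothesis holds, one plausible mechanism is that each crashed invocation had already sent part of its batch before dying, and the retry then sends the same records again. The toy simulation below sketches that effect under stated assumptions. It is not the real indexer: the in-memory `store` dict stands in for the index, `OutOfMemory` stands in for the process being killed, and the batch size and crash point are made up.

```python
import uuid


class OutOfMemory(Exception):
    """Stands in for the Lambda process being killed mid-invocation."""


def index_batch(store, records, crash_after=None, use_record_id=False):
    """Index records one by one, optionally 'crashing' part-way through."""
    for position, record in enumerate(records):
        if crash_after is not None and position == crash_after:
            raise OutOfMemory()
        # Either reuse the record's own id or let the store invent a new one.
        doc_id = record["@id"] if use_record_id else uuid.uuid4().hex
        store[doc_id] = record


def invoke_with_retries(store, records, crashes, use_record_id):
    """Run the batch, retrying from the start after each simulated crash."""
    attempts = [2] * crashes + [None]  # crash after 2 records, then succeed
    for crash_after in attempts:
        try:
            index_batch(store, records, crash_after, use_record_id)
        except OutOfMemory:
            continue


records = [{"@id": "a"}, {"@id": "b"}, {"@id": "c"}]

auto_ids = {}
invoke_with_retries(auto_ids, records, crashes=3, use_record_id=False)

record_ids = {}
invoke_with_retries(record_ids, records, crashes=3, use_record_id=True)
```

In this toy run, three crashes followed by one successful attempt leave four copies of record `a` in `auto_ids`, while `record_ids` ends up with exactly one document per record.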