abstools / logreplay

A tool to replay a log file as a series of queries onto an HTTP endpoint.

Log processing is slow if the log file is large

nobeh opened this issue

Currently, we load each log file in one go and sort its entries by timestamp. Processing time therefore depends on the size of the file and gets slower as the file grows.

If we can assume that the logs are already sorted (or even roughly sorted),
then we can load the log file as a stream, in batches.
Processing then consists of loading the next N lines from the file,
handling them, and repeating until the whole log file has been processed.
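
A rough sketch of what this could look like, written in Python purely for illustration (it is not taken from the logreplay code base). The names `read_batches`, `replay_in_order`, `timestamp_of`, `batch_size`, and `window` are all hypothetical. It assumes that "roughly sorted" means every line is at most `window` positions away from where it would be in a fully sorted file; under that assumption a small heap is enough to restore the order without ever holding the whole file in memory.

```python
import heapq
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def read_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` lines from a lazy source."""
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def replay_in_order(
    lines: Iterable[str],
    timestamp_of: Callable[[str], float],
    batch_size: int = 1000,
    window: int = 1000,
) -> Iterator[str]:
    """Yield lines in timestamp order.

    Assumption (hypothetical): each line is at most `window` positions
    away from its position in the fully sorted file.
    """
    heap: list = []
    seq = 0  # tie-breaker so lines with equal timestamps keep their file order
    for batch in read_batches(lines, batch_size):
        for line in batch:
            heapq.heappush(heap, (timestamp_of(line), seq, line))
            seq += 1
            # Once the heap holds window + 1 entries, its smallest entry
            # cannot be preceded by any line that has not been read yet.
            if len(heap) > window:
                yield heapq.heappop(heap)[2]
    # Drain whatever is left after the last batch.
    while heap:
        yield heapq.heappop(heap)[2]
```

Since a file object is itself a lazy iterator over lines, it could be passed in directly, for example `replay_in_order(open(path), timestamp_of)`, so memory use would be bounded by `batch_size` plus `window` rather than by the file size. If the logs are known to be fully sorted, the heap can be dropped and `read_batches` alone is sufficient.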