The difference in number of records between raw data and parsed data
dino-chiio opened this issue
NGUYEN, Van Tuan commented
Hi. I am studying your implementation for the Drain demo with the BGL dataset (full version).
However, the parsed dataset has fewer samples than the raw dataset: the raw dataset has 4,747,963 records, but the parsed dataset has only 4,713,493.
Could you please explain to me the reason for this issue?
zhujiem commented
Some lines are skipped because they do not match the log format specified in the config.
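To see which lines account for the gap (4,747,963 - 4,713,493 = 34,470 records), one option is to test each raw line against the format yourself. The following is a minimal sketch, not the parser's actual code: `format_to_regex` and `split_lines` are hypothetical helpers, the `<Date> <Time> <Level> <Content>` template is an illustrative format rather than the BGL one, and it assumes field names are valid Python identifiers and that lines are stripped before matching.

```python
import re


def format_to_regex(log_format):
    """Build a regex from a '<Field> <Field> ...' template (hypothetical helper)."""
    # Splitting on a capturing group keeps the '<Field>' tokens at odd indices
    # and the literal text between them at even indices.
    parts = re.split(r"(<[^<>]+>)", log_format)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text: escape it, but let any whitespace run match flexibly.
            pattern += r"\s+".join(re.escape(p) for p in re.split(r"\s+", part))
        else:
            # Field token: capture lazily under the field's name.
            pattern += "(?P<%s>.*?)" % part.strip("<>")
    return re.compile("^" + pattern + "$")


def split_lines(lines, log_format):
    """Return (matched field dicts, skipped raw lines) for the given template."""
    regex = format_to_regex(log_format)
    matched, skipped = [], []
    for line in lines:
        m = regex.match(line.strip())
        if m is None:
            skipped.append(line)  # this line would not appear in parsed output
        else:
            matched.append(m.groupdict())
    return matched, skipped


# Illustrative template and made-up sample lines (assumptions, not BGL data).
sample_format = "<Date> <Time> <Level> <Content>"
sample_lines = [
    "2005-06-03 15:42:50 INFO instruction cache parity error corrected",
    "truncated",
    "only two fields",
]
matched, skipped = split_lines(sample_lines, sample_format)
print(len(sample_lines), "raw,", len(matched), "parsed,", len(skipped), "skipped")
```

Running the same check over the full raw file with the format from your config, and printing a few entries of `skipped`, should show what the unmatched lines look like (for example truncated or empty lines).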