logpai / logparser

A machine learning toolkit for log parsing [ICSE'19, DSN'16]

Home Page: https://logparser.readthedocs.io

The difference in number of records between raw data and parsed data

dino-chiio opened this issue

Hi. I am studying your implementation for the Drain demo with the BGL dataset (full version).

However, the parsed dataset has fewer records than the raw dataset: the raw dataset has 4,747,963 records, while the parsed dataset has only 4,713,493.

Could you please explain to me the reason for this issue?

Some lines are skipped because they do not match the log format specified in the config.
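To see which lines account for the gap (4,747,963 - 4,713,493 = 34,470 records), one option is to count the lines that fail to match the format before parsing. The following is a minimal sketch, not the toolkit's own code: the helper names `format_to_regex` and `split_by_format` are hypothetical, and the format string and sample lines are made up for illustration. It assumes a format written as `<Field>` tokens separated by spaces, each of which becomes a named regex group.

```python
import re


def format_to_regex(log_format):
    """Build a regex from a '<Field> <Field> ...' style format string (hypothetical helper)."""
    pattern = ""
    # A capturing group in re.split keeps the <Field> tokens at odd indices.
    parts = re.split(r"(<[^<>]+>)", log_format)
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text between fields: runs of spaces match any whitespace.
            pattern += r"\s+".join(re.escape(chunk) for chunk in re.split(r" +", part))
        else:
            # A <Field> token becomes a lazy named group.
            pattern += "(?P<%s>.*?)" % part.strip("<>")
    return re.compile("^" + pattern + "$")


def split_by_format(lines, log_format):
    """Return (matched, skipped) lists; skipped lines would be missing from parsed output."""
    regex = format_to_regex(log_format)
    matched, skipped = [], []
    for line in lines:
        if regex.search(line.strip()):
            matched.append(line)
        else:
            skipped.append(line)
    return matched, skipped


# Illustrative format and lines only; not taken from the BGL dataset.
example_format = "<Date> <Level> <Content>"
example_lines = [
    "2005-06-03 INFO instruction cache parity error corrected",
    "2005-06-03 FATAL data storage interrupt",
    "truncated",
    "",
]
matched, skipped = split_by_format(example_lines, example_format)
print(len(matched), "matched,", len(skipped), "skipped")
```

Running the same check over the raw log with the format string from the demo config, and printing a few entries of `skipped`, should show whether the missing records are blank, truncated, or simply have fewer fields than the format expects.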