logpai / logparser

A machine learning toolkit for log parsing [ICSE'19, DSN'16]

parsed templates

HankKung opened this issue

As mentioned in LogAnomaly:
The front 50% (according to the timestamps of logs) of the BGL dataset is used as the training set, which includes 257 log templates, and the rest 50% involving 503 templates is used as the testing set.

However, I got 1834 templates from Drain and 3000+ from Spell. Did I do something wrong here? Or should I filter out templates that occur only once?

Anomaly detection papers usually use the ground-truth templates (the correct parsing results). Existing parsers, however, can make parsing errors, and the inflated template counts you mention are one example. They could be caused by the parser misinterpreting a few constants.
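
To check how much of the gap comes from rare templates, one option is to count distinct templates in each half of the parsed log and see how many survive a minimum-occurrence filter. The following is only a rough sketch, not the LogAnomaly authors' procedure: `template_stats`, `min_count`, and the toy messages are hypothetical, and it assumes the input holds one parsed template per log line, already sorted by timestamp, with the 50% split taken by line count.

```python
from collections import Counter


def template_stats(event_templates, min_count=2):
    """Count distinct templates overall, per half, and after a frequency filter.

    event_templates: one template string per log line, in timestamp order
    (assumption: this is how the parser output has been loaded).
    """
    half = len(event_templates) // 2
    train, test = event_templates[:half], event_templates[half:]

    counts = Counter(event_templates)
    # Templates seen at least `min_count` times across the whole log.
    frequent = {t for t, c in counts.items() if c >= min_count}

    return {
        "train_templates": len(set(train)),
        "test_templates": len(set(test)),
        "all_templates": len(counts),
        "frequent_templates": len(frequent),
    }


# Toy data, made up for illustration only.
parsed = [
    "generating core.<*>",
    "generating core.<*>",
    "instruction cache parity error corrected",
    "generating core.2275",  # a parameter value left in as constant text
    "generating core.<*>",
    "ciod: failed to read message prefix <*>",
]

stats = template_stats(parsed)
print(stats)
```

In this toy input, the line where a parameter value was kept as constant text becomes its own template, which is one way a parser can end up with many more templates than the ground truth. Filtering by occurrence hides such cases but also drops genuinely rare templates, so the filtered numbers are not directly comparable to those reported in the paper.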