center-for-threat-informed-defense / tram

TRAM is an open-source platform designed to advance research into automating the mapping of cyber threat intelligence reports to MITRE ATT&CK®.

Home Page: https://ctid.mitre-engenuity.org/our-work/tram/


Where is the corpus of input classified threat reports?

Radu3000 opened this issue

The MITRE ATT&CK knowledge base has already mapped TTPs (and other ATT&CK objects) to threat reports via so-called Citations. Can such a corpus of "classified" threat report texts be made available as part of this repository?

  • Otherwise, how can we measure the accuracy of TRAM?

Regards,
Radu

The MITRE ATT&CK knowledge base has already mapped TTPs (and other ATT&CK objects) to threat reports via so-called Citations. Can such a corpus of "classified" threat report texts be made available as part of this repository?

The repository contains more than 10,000 sentences of labeled training data, which is used to train the models built into TRAM. However, that data is not derived from ATT&CK.

(The "procedure examples" data from ATT&CK has not been used in TRAM due to concern that it's not representative of real-world CTI reports, but we are open to feedback on this.)

Otherwise, how can we measure the accuracy of TRAM?

Click the "ML Admin" button to browse through each of the models. Model performance is reported as an F1 score computed on a train/test split. (F1 is a bit more useful than accuracy when the class distribution is imbalanced.) I'm interested in proposals/pull requests to improve the model evaluation, such as reporting precision and recall separately, producing confusion matrices of ATT&CK techniques, etc.
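To make the accuracy vs. F1 point concrete, here is a rough sketch in plain Python. It is not TRAM's evaluation code: the function names are made up for illustration, and the labels are toy data that merely borrow two ATT&CK technique IDs. It counts a confusion matrix and reports precision, recall, and F1 per technique for a degenerate model that always predicts the majority class.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    pairs = confusion_counts(y_true, y_pred)
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = pairs[(label, label)]
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up test split: 9 sentences labeled T1059 and 1 labeled T1566.
y_true = ["T1059"] * 9 + ["T1566"]
# Hypothetical model that always predicts the majority technique.
y_pred = ["T1059"] * 10

print("accuracy:", accuracy(y_true, y_pred))
for label, (p, r, f1) in per_class_scores(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

On this toy split the overall accuracy looks high even though the rare technique is never detected; its per-class F1 of zero exposes that, which is the motivation for reporting F1 (and ideally precision and recall separately) rather than accuracy alone.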

Closing due to inactivity. Please reopen if needed.