center-for-threat-informed-defense / tram

TRAM is an open-source platform designed to advance research into automating the mapping of cyber threat intelligence reports to MITRE ATT&CK®.

Home Page: https://ctid.mitre-engenuity.org/our-work/tram/

How to add minimum sentence threshold for prediction of single-label SciBERT in Colab notebook?

hoangcuongnguyen2001 opened this issue

I am currently evaluating the SciBERT model used in the TRAM project, which I have fine-tuned on my own CTI dataset, in this notebook: Predict single-label SciBERT. However, I have noticed that for each report I feed to SciBERT, I get far more techniques, and therefore more false positives, than I get from the TRAM website.
As far as I can tell, the notebook has no minimum-accepted-sentences threshold (ML_ACCEPT_THRESHOLD, which is set to 4) like the TRAM website has, as shown in the screenshot below. I suspect that is why I get so many techniques supported by only 1-2 sentences, and hence a higher false positive rate.

[Screenshot: TRAM settings showing ML_ACCEPT_THRESHOLD = 4]

Could you give me some guidance on adding this threshold to the Colab notebook, so that I can reproduce the website's behavior? More importantly, could you add this feature to the notebook for later users, so that they have a more reliable way to test the performance of their models without having to load the model into TRAM's website?
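
For illustration, here is a rough sketch of the kind of post-filter I have in mind (all names here are hypothetical, not taken from the notebook):

```python
# Hypothetical post-filter: drop techniques supported by fewer than
# ML_ACCEPT_THRESHOLD sentences, mirroring the TRAM website's behavior.
from collections import defaultdict

ML_ACCEPT_THRESHOLD = 4  # minimum sentences per technique, as on the website

def filter_predictions(sentence_predictions, threshold=ML_ACCEPT_THRESHOLD):
    """sentence_predictions: list of (sentence, technique_id) pairs,
    where technique_id may be None if no technique was predicted."""
    by_technique = defaultdict(list)
    for sentence, technique in sentence_predictions:
        if technique is not None:
            by_technique[technique].append(sentence)
    # Keep only techniques with at least `threshold` supporting sentences.
    return {t: s for t, s in by_technique.items() if len(s) >= threshold}
```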

Yes, as you noticed, there is no ML_ACCEPT_THRESHOLD for the SciBERT models. The original models for TRAM 1 were lightweight and could be trained at runtime, and so the threshold value was used to decide which techniques to train on. Since SciBERT is pre-trained, we don't have a similar feature for pruning the training data. But you could achieve the same effect by removing labels from your training data that don't have many examples.
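
A minimal sketch of that pruning step (assuming your training data is a list of (sentence, label) pairs; the names are illustrative):

```python
# Drop labels with too few training examples before fine-tuning.
from collections import Counter

MIN_EXAMPLES = 4  # analogous in spirit to TRAM 1's ML_ACCEPT_THRESHOLD

def prune_rare_labels(train_data, min_examples=MIN_EXAMPLES):
    counts = Counter(label for _, label in train_data)
    return [(text, label) for text, label in train_data
            if counts[label] >= min_examples]
```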

One limitation of the SciBERT approach is that it does not have a "null" label, i.e. the absence of any technique. To address this, the notebook includes the "probability" field, which is a threshold on the prediction probability. The model chooses the label with the highest probability, and if that probability is less than the threshold, it predicts no label.

So if you are getting too many false positives, try increasing the probability threshold, e.g. try 0.95, 0.99, etc.
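
As a rough sketch of how that thresholding works (assuming a Hugging Face-style sequence-classification model; the helper below is illustrative, not the notebook's actual code):

```python
# Predict a label only when the model's top probability clears the threshold.
import torch

def predict_with_threshold(model, tokenizer, sentence, threshold=0.95):
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    prob, idx = probs.max(dim=-1)
    # Below the threshold, treat the sentence as having no technique.
    if prob.item() < threshold:
        return None
    return model.config.id2label[idx.item()]
```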