center-for-threat-informed-defense / tram

TRAM is an open-source platform designed to advance research into automating the mapping of cyber threat intelligence reports to MITRE ATT&CK®.

Home Page: https://ctid.mitre-engenuity.org/our-work/tram/


'LogisticRegression' object has no attribute 'classes_'

Dovakin94 opened this issue

I followed the "Developer setup" instructions for Windows.
After uploading a PDF report, I ran the pipeline command and got the following error:
[image: screenshot of the error]

@Dovakin94 I was able to reproduce your issue on a Windows system, but the problem is broader than Windows: the documentation was missing steps. I updated the notes for developers, and that should fix your pipeline commands.

Hi @emmanvg ,

I had to reopen this after more than a year.

I ran into the same issue. I may have missed the part of the updated notes that should fix the pipeline commands. I'm developing on WSL.

Here are my screenshots.
[image: two screenshots attached]

@Dovakin94, was your issue resolved a year ago? If it was, how did you resolve it?

I also feel that the documentation is slightly lacking between step 11 and step 12.

11. Open the application in your web browser.
    1. Navigate to <http://localhost:8000> and use the superuser to log in
12. In a separate terminal window, run the ML pipeline

     ```sh
     cd tram/
     source venv/bin/activate
     tram pipeline run
     ```

I am not sure what the pipeline is running.
Are we supposed to upload an article into the web UI before we run the pipeline?

Cheers!

Hi @wei-ann-Github, I think the cause is that it cannot load the serialized models from disk. Did you run these commands during setup?

tram attackdata load
tram pipeline load-training-data
tram pipeline train --model nb
tram pipeline train --model logreg
tram pipeline train --model nn_cls
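
The setup sequence above can also be scripted so that it stops at the first failing step. This is a minimal sketch and not part of TRAM: `SETUP_STEPS` copies the commands listed above, while `run_steps` and its `runner` parameter are hypothetical names introduced for illustration. It assumes the `tram` command is available in an activated virtual environment.

```python
import subprocess

# Commands copied from the comment above, in the order given.
SETUP_STEPS = [
    ["tram", "attackdata", "load"],
    ["tram", "pipeline", "load-training-data"],
    ["tram", "pipeline", "train", "--model", "nb"],
    ["tram", "pipeline", "train", "--model", "logreg"],
    ["tram", "pipeline", "train", "--model", "nn_cls"],
]


def run_steps(steps, runner=subprocess.call):
    """Run each command in order and stop at the first failure.

    `runner` is a hypothetical hook: it takes an argument list and returns
    an exit code. The default, subprocess.call, runs the real command.
    Returns the first command whose exit code is non-zero, or None if all
    commands succeeded.
    """
    for step in steps:
        if runner(step) != 0:
            return step
    return None
```

Calling `run_steps(SETUP_STEPS)` from the `tram/` directory would run the real commands; if it returns a command instead of `None`, that step is the one to investigate before running the pipeline.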

After training you should see some serialized models saved to disk. Can you run this next command and tell me if you see these .pkl files?

$ ls -lah data/ml-models
total 12776
drwxr-xr-x   7 mhaase  staff   224B May  6  2022 ./
drwxr-xr-x  11 mhaase  staff   352B Oct  7  2022 ../
-rw-r--r--   1 mhaase  staff     0B May  6  2022 .gitkeep
-rw-r--r--   1 mhaase  staff    12K Oct  7  2022 DummyModel.pkl
-rw-r--r--   1 mhaase  staff   917K May  5  2022 LogisticRegressionModel.pkl
-rw-r--r--   1 mhaase  staff   3.7M Mar  1  2022 MLPClassifierModel.pkl
-rw-r--r--   1 mhaase  staff   1.6M Mar  1  2022 NaiveBayesModel.pkl

That should help us figure out what the root cause is.
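
The same check can be done from Python. The sketch below is an assumption-laden helper, not TRAM code: the file names come from the listing above, the default directory is the one passed to `ls`, and `missing_models` is a hypothetical function name. It only checks that the files exist and are non-empty; it does not verify that the models inside were actually trained.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_MODELS = [
    "DummyModel.pkl",
    "LogisticRegressionModel.pkl",
    "MLPClassifierModel.pkl",
    "NaiveBayesModel.pkl",
]


def missing_models(model_dir="data/ml-models", expected=EXPECTED_MODELS):
    """Return the expected file names that are absent or empty in model_dir."""
    directory = Path(model_dir)
    missing = []
    for name in expected:
        path = directory / name
        # A zero-byte file counts as missing: it cannot hold a serialized model.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

An empty list from `missing_models()` means all four files are present; any names it returns point at a training step that likely did not complete.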

> I am not sure what the pipeline is running.
> Are we supposed to upload an article into the web UI before we run the pipeline?

The tram pipeline run command is an infinite loop that checks for new reports and submits them for labeling. So you need to run that command and then upload articles in the web UI. After uploading, wait a few seconds; the report should then be analyzed, and the results will be visible in the web UI.

Thank you @mehaase , tram pipeline run did not run in an infinite loop for me. I have attached what I see in my terminal here:

[image: screenshot of the terminal output]

Hi @wei-ann-Github, my previous comment was inaccurate. You are correct: tram pipeline run runs any jobs that are currently queued, and then it quits. You can run tram pipeline run --run-forever to put it in an infinite loop.
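
The difference between the two modes can be illustrated with a small queue-draining loop. This is only a sketch of the behavior described above, not TRAM's implementation: `run_pipeline`, `process`, `poll_seconds`, and `max_polls` are all hypothetical names, and `max_polls` exists only so that the "forever" mode can terminate in a demo.

```python
import time
from collections import deque


def run_pipeline(queue, process, run_forever=False, poll_seconds=1.0, max_polls=None):
    """Process queued reports and return how many were processed.

    With run_forever=False, drain whatever is queued right now and return.
    With run_forever=True, keep polling for new reports. max_polls is a
    hypothetical cap on the number of polls so the loop can end in a demo.
    """
    processed = 0
    polls = 0
    while True:
        # Drain everything that is currently queued.
        while queue:
            process(queue.popleft())
            processed += 1
        polls += 1
        if not run_forever or (max_polls is not None and polls >= max_polls):
            return processed
        # Wait before checking the queue again.
        time.sleep(poll_seconds)
```

With an empty queue and `run_forever=False`, the function returns immediately after processing nothing, which matches a run that exits right away when no report has been uploaded yet.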

The output you are seeing suggests that there are no reports in the queue. Click the upload report button and upload a document (e.g. PDF format). It should show queued status.

[image: screenshot taken 2023-07-26 at 9:32:09 AM]

Once you see queued status, you can run the pipeline to process that report. When processing is complete, refresh the UI and click "Analyze" to see the results. If you are still having problems after this, please open a new issue.

Cheers @mehaase , thank you for verifying and thank you for the resolution :) looking forward to tram 2!