'LogisticRegression' object has no attribute 'classes_'

Question

Dovakin94 opened this issue 2 years ago

I followed the "Developer setup" instructions for Windows.
After uploading a PDF report, I ran the pipeline command and got the error shown in the title.

Emmanuelle Vargas Gonzalez · Answer 1 · Tue Feb 08 2022 22:55:02 GMT+0800 (China Standard Time)

@Dovakin94 I was able to reproduce your issue on a Windows system, but the problem is broader than Windows: it comes from missing documentation. I updated the notes for developers, and that should fix your pipeline commands.

Wei Ann · Answer 2 · Sat Jul 15 2023 19:36:00 GMT+0800 (China Standard Time)

Hi @emmanvg ,

Had to open this after more than a year.

I ran into the same issue. I may have missed the part of the updated notes that should fix the pipeline commands. I'm developing on WSL.

Here are my screenshots.

@Dovakin94 , was your issue resolved a year ago? If it was, how did you resolve it?

I also feel that the documentation is slightly lacking between step 11 and step 12.

11. Open the application in your web browser.
    1. Navigate to <http://localhost:8000> and use the superuser to log in
12. In a separate terminal window, run the ML pipeline

     ```sh
     cd tram/
     source venv/bin/activate
     tram pipeline run
     ```

I am not sure what the pipeline is running.
Are we supposed to upload an article into the web UI before we run the pipeline?

Cheers!

Mark E. Haase · Answer 3 · Tue Jul 18 2023 02:06:15 GMT+0800 (China Standard Time)

Hi @wei-ann-Github, I think the cause is that TRAM cannot load the serialized models from disk. Did you run these commands during setup?

tram attackdata load
tram pipeline load-training-data
tram pipeline train --model nb
tram pipeline train --model logreg
tram pipeline train --model nn_cls

After training you should see some serialized models saved to disk. Can you run this next command and tell me if you see these .pkl files?

$ ls -lah data/ml-models
total 12776
drwxr-xr-x   7 mhaase  staff   224B May  6  2022 ./
drwxr-xr-x  11 mhaase  staff   352B Oct  7  2022 ../
-rw-r--r--   1 mhaase  staff     0B May  6  2022 .gitkeep
-rw-r--r--   1 mhaase  staff    12K Oct  7  2022 DummyModel.pkl
-rw-r--r--   1 mhaase  staff   917K May  5  2022 LogisticRegressionModel.pkl
-rw-r--r--   1 mhaase  staff   3.7M Mar  1  2022 MLPClassifierModel.pkl
-rw-r--r--   1 mhaase  staff   1.6M Mar  1  2022 NaiveBayesModel.pkl

That should help us figure out what the root cause is.
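
For anyone who prefers a scripted check over reading the `ls` output, here is a minimal sketch in Python. It only verifies that the three trained model files from the listing above exist and are not empty. The function name `missing_models` is made up for this example and is not part of TRAM, and the default path assumes you run it from the same directory as the `ls` command.

```python
from pathlib import Path

# File names copied from the directory listing above. DummyModel.pkl is left
# out on the assumption that only the three trained models matter here.
EXPECTED_MODELS = (
    "NaiveBayesModel.pkl",
    "LogisticRegressionModel.pkl",
    "MLPClassifierModel.pkl",
)


def missing_models(model_dir="data/ml-models", expected=EXPECTED_MODELS):
    """Return the expected model files that are absent or empty in model_dir."""
    directory = Path(model_dir)
    missing = []
    for name in expected:
        path = directory / name
        # A zero-byte file counts as missing: it cannot hold a trained model.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_models()
    if problems:
        print("Missing or empty model files:", ", ".join(problems))
    else:
        print("All expected model files are present.")
```

If the script reports a missing file, rerunning the matching `tram pipeline train --model ...` command is the likely next step. Note that this only checks that the files exist; it does not confirm that the pickled models load correctly.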

> I am not sure what the pipeline is running.
> Are we supposed to upload an article into the web UI before we run the pipeline?

The `tram pipeline run` command is an infinite loop that checks for new reports and submits them for labeling. So you need to run that command and then upload articles in the web UI. After uploading, wait a few seconds; the report should then be analyzed and the results should be visible in the web UI.

Wei Ann · Answer 4 · Sat Jul 22 2023 15:57:44 GMT+0800 (China Standard Time)

Thank you @mehaase. `tram pipeline run` did not run in an infinite loop for me. I have attached what I see in my terminal here:

Mark E. Haase · Answer 5 · Wed Jul 26 2023 21:35:19 GMT+0800 (China Standard Time)

Hi @wei-ann-Github, my previous comment was inaccurate. You are correct: `tram pipeline run` runs any jobs that are currently queued, and then it quits. You can run `tram pipeline run --run-forever` to put it in an infinite loop.

The output you are seeing suggests that there are no reports in the queue. Click the upload report button and upload a document (e.g. PDF format). It should show queued status.

Once you see queued status, you can run the pipeline to process that report. When processing is complete, refresh the UI and click "Analyze" to see the results. If you are still having problems after this, please open a new issue.
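
To make the difference between the default mode and `--run-forever` concrete, here is a rough sketch of the two behaviors in Python. It is an illustration only and not TRAM's actual implementation: `run_pipeline`, `process`, `poll_seconds`, and `max_polls` are all hypothetical names, and `max_polls` exists purely so the looping mode can be stopped in a demo.

```python
import time
from collections import deque


def run_pipeline(queue, process, run_forever=False, poll_seconds=1.0, max_polls=None):
    """Drain queued reports; optionally keep polling for new ones.

    Hypothetical illustration only: these names are not TRAM's API.
    Returns the number of times the queue was checked.
    """
    polls = 0
    while True:
        # Process everything that is queued right now.
        while queue:
            process(queue.popleft())
        polls += 1
        if not run_forever:
            return polls  # default mode: queue drained, so quit
        if max_polls is not None and polls >= max_polls:
            return polls  # demo-only exit from the looping mode
        time.sleep(poll_seconds)  # wait before checking for new reports


if __name__ == "__main__":
    processed = []
    reports = deque(["report-1.pdf", "report-2.pdf"])
    run_pipeline(reports, processed.append)
    print("Processed:", processed)
```

With an empty queue the default mode checks once and returns immediately, which matches the "nothing happened" experience of running the pipeline before uploading a report.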

Wei Ann · Answer 6 · Fri Jul 28 2023 21:30:07 GMT+0800 (China Standard Time)

Cheers @mehaase , thank you for verifying and thank you for the resolution :) looking forward to tram 2!