marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

Invariance test cannot be run using test.run_from_file()

farigys opened this issue

Is there any way I can run Invariance tests from predictions saved in files? When I try to use test.run_from_file(), I get the following error:
AttributeError: 'INV' object has no attribute 'result_indexes'

Yes, Invariance tests should work with predictions saved in files. Are you using one of our release suites? If not, can you possibly share the test / suite in question?

I am not using the release suites, as my task does not resemble any of the presented tasks. I am trying to create my own tests (as you've shown in tutorial 3). The first approach works (wrapping the prediction model in a PredictionWrapper), but the second (running from a file) does not. I can share my code here (which is pretty much the exact code in the tutorial):

t = Perturb.perturb(dataset, Perturb.add_typos)
test = INV(**t)
test.run_from_file('/tmp/softmax_preds.txt', file_format='softmax', overwrite=True)

You did not call test.to_raw_file. I'm guessing your softmax_preds.txt file contains predictions for the original dataset, but not for the test cases (which include examples with typos).

My softmax_preds.txt contains predictions for the test cases (original examples + typo-perturbed data). I did not save the test cases in /tmp/raw_file.txt. Maybe that's the problem?

I am saving the raw text using test.to_raw_file, and my softmax_preds.txt contains the class probabilities separated by spaces. I am getting this error:

  File "run_checklist_from_saved_prediction.py", line 116, in <module>
    main()
  File "run_checklist_from_saved_prediction.py", line 112, in main
    test.run_from_file('/tmp/softmax_preds.txt', file_format='softmax', overwrite=True)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 324, in run_from_file
    self.run_from_preds_confs(preds, confs, overwrite=overwrite)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 293, in run_from_preds_confs
    self.update_expect()
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 129, in update_expect
    self.results.expect_results = self.expect(self)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 78, in expect
    return [fn(x, pred, confs, labels, meta) for x, pred, confs, labels, meta in zipped]
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 78, in <listcomp>
    return [fn(x, pred, confs, labels, meta) for x, pred, confs, labels, meta in zipped]
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 120, in expect_fn
    orig_pred = preds[0]
TypeError: 'int' object is not subscriptable

Edit: this is what my softmax_preds.txt looks like:

0.026281951 0.3031675 0.64833623 0.01582296 0.006391341
0.022482133 0.20569728 0.74865955 0.017401822 0.0057589416
0.011106909 0.058008775 0.46179605 0.4471355 0.021952866
0.01102794 0.037073947 0.41210234 0.5221132 0.017682495
0.5810922 0.1305551 0.12312923 0.07375503 0.09146844
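
For reference, a file in this shape can be read as one test example per line, with the predicted label taken as the index of the largest probability. The following is only a minimal sketch of that interpretation, not CheckList's own parser; `parse_softmax_lines` is a hypothetical helper name.

```python
def parse_softmax_lines(lines):
    """Turn space-separated probability rows into (preds, confs)."""
    preds, confs = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        probs = [float(p) for p in line.split()]
        # Predicted label = index of the largest probability in the row.
        preds.append(max(range(len(probs)), key=probs.__getitem__))
        confs.append(probs)
    return preds, confs


# Three of the rows shown above.
rows = [
    "0.026281951 0.3031675 0.64833623 0.01582296 0.006391341",
    "0.01102794 0.037073947 0.41210234 0.5221132 0.017682495",
    "0.5810922 0.1305551 0.12312923 0.07375503 0.09146844",
]
preds, confs = parse_softmax_lines(rows)
print(preds)  # [2, 3, 0]
```

The file needs one row for every text that was written out for the test, in the same order, so that predictions can be matched back to examples.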

Ugh, would it be possible at all for you to share the first few examples from your dataset? i.e. print(dataset[:2])

Sure. This is the output of t.data[:3]:

['they will be different because the water will slow the sound waves down of the pitch making it queiter', 'they will be different ebcause the water will slow the sound waves down of the pitch making it queiter', "The more water there is the more sound will be absorbed by the water's density."]

I am really sorry! Solved it: each original example and its typo version should come as a pair in t, and I wasn't saving them like that. I am closing this issue. Thanks a lot for your help!
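
To illustrate the fix: the traceback above is consistent with each test case being a bare string rather than a list of [original, perturbed] texts, so each case ends up with a single integer prediction and `preds[0]` fails. The sketch below is an assumption about how flat predictions get regrouped per test case, not CheckList's actual code; `regroup` and `inv_check` are hypothetical names, and the example strings are shortened stand-ins for the real data.

```python
def regroup(data, flat_preds):
    """Split a flat prediction list back into one group per test case.

    A test case that is a list of texts gets a list of predictions;
    a test case that is a bare string gets a single prediction.
    """
    grouped, i = [], 0
    for case in data:
        if isinstance(case, str):
            grouped.append(flat_preds[i])
            i += 1
        else:
            grouped.append(flat_preds[i:i + len(case)])
            i += len(case)
    return grouped


def inv_check(preds):
    """Invariance: every perturbed prediction equals the original one."""
    orig_pred = preds[0]  # raises TypeError if preds is a bare int
    return all(p == orig_pred for p in preds[1:])


# Paired layout: one test case holding [original, typo version].
paired = [["the water will slow the sound", "the water will slow teh sound"]]
# Flat layout: every string treated as its own test case.
flat = ["the water will slow the sound", "the water will slow teh sound"]

print([inv_check(p) for p in regroup(paired, [2, 2])])  # [True]

try:
    [inv_check(p) for p in regroup(flat, [2, 2])]
except TypeError as err:
    print("flat data fails:", err)
```

With the paired layout, each test case carries its own original prediction to compare against, which is what an invariance check needs.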