comic / evalutils

evalutils helps users create extensions for grand-challenge.org

Home Page: https://grand-challenge.org


Ground-truth prediction matching

mattmagic149 opened this issue

Hi,

I am currently working on a segmentation evaluation for grand-challenge using evalutils.
Locally, the evaluation works perfectly fine. Once uploaded to grand-challenge, I get this error:

Output:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/opt/evaluation/evaluation.py", line 152, in <module>
        CadaSegmentation().evaluate()
      File "/home/evaluator/.local/lib/python3.7/site-packages/evalutils/evalutils.py", line 415, in evaluate
        self.cross_validate()
      File "/home/evaluator/.local/lib/python3.7/site-packages/evalutils/evalutils.py", line 575, in cross_validate
        self._raise_missing_predictions_error(missing=missing)
      File "/home/evaluator/.local/lib/python3.7/site-packages/evalutils/evalutils.py", line 481, in _raise_missing_predictions_error
        raise ValidationError(message)
    evalutils.exceptions.ValidationError: Predictions missing: you did not submit predictions for
    [hash_ground_truth                                     -8.08165e+17
     path_ground_truth    /opt/evaluation/ground-truth/A151_labeledMasks...
     hash_prediction                                                 NaN
     path_prediction                                                 NaN
     _merge                                                     left_only
     Name: 22, dtype: object]. Please try again.

Apparently, my ground truth is not matched by a prediction. However, this is messy to debug, since everything works locally when I run ./test.sh.

Here is some further information on my code and data:
The dataset consists of compressed NIfTI images:

  • A022_labeledMasks.nii.gz
  • A144_M_labeledMasks.nii.gz
  • ...
  • A151_labeledMasks.nii.gz

These files are located in my ground-truth folder, as described above. For the submission upload, I zip all of the files.

My implementation inherits from the ClassificationEvaluation class, as described in the basic example. Additionally, I use the following file_sorter_key:

    lambda fname: fname.stem.split('.')[0]
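
For context, this is roughly how my subclass is wired up (a minimal sketch; evalutils' built-in SimpleITKLoader and UniqueImagesValidator stand in here for my actual loader and validators, and the scoring details are omitted):

    from evalutils import ClassificationEvaluation
    from evalutils.io import SimpleITKLoader
    from evalutils.validators import UniqueImagesValidator


    class CadaSegmentation(ClassificationEvaluation):
        def __init__(self):
            super().__init__(
                # SimpleITKLoader stands in here for my custom loader
                file_loader=SimpleITKLoader(),
                # "A151_labeledMasks.nii.gz" has the double suffix ".nii.gz",
                # so fname.stem is "A151_labeledMasks.nii" and the split
                # reduces it to "A151_labeledMasks" for sorting/matching
                file_sorter_key=lambda fname: fname.stem.split('.')[0],
                # Validators trimmed down for the sketch
                validators=(UniqueImagesValidator(),),
            )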

Cheers,

Matt

I've taken a look at your submission and it only contained 1 image (https://cada-as.grand-challenge.org/evaluation/submissions/f89901c8-9962-4380-8e9a-4b85ce3e9603/), so the error seems to be correct. You need to include all images in your submission zip file.

That was one test run containing only the file for which the exception is thrown. My other test submissions contain all files, e.g. https://cada-as.grand-challenge.org/evaluation/submissions/487deb7a-dec6-4370-b46b-ecff76d86712/

Ah, ok, I will look into it.

Ok - this is a problem with your NiftiLoader. If a Loader cannot load an image, it needs to raise a FileLoaderError rather than return None (line 27 of your evaluation.py); otherwise evalutils still thinks that this is a valid input file.
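
For example, something along these lines would behave correctly (a sketch assuming your NiftiLoader subclasses evalutils.io.ImageLoader; the suffix check and the hash are illustrative):

    from pathlib import Path

    import SimpleITK
    from evalutils.exceptions import FileLoaderError
    from evalutils.io import ImageLoader


    class NiftiLoader(ImageLoader):
        @staticmethod
        def load_image(fname: Path) -> SimpleITK.Image:
            if fname.suffixes not in ([".nii"], [".nii", ".gz"]):
                # Raising makes evalutils skip this file entirely;
                # returning None lets it through as a "valid" case
                raise FileLoaderError(f"Could not load {fname} as NIfTI")
            return SimpleITK.ReadImage(str(fname))

        @staticmethod
        def hash_image(image: SimpleITK.Image) -> int:
            # Illustrative content hash over the voxel data
            return hash(SimpleITK.GetArrayFromImage(image).tobytes())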

Why is this occurring on grand challenge and not locally? You're on a Mac, so you have .DS_Store files everywhere. They are notorious for causing problems with submissions, so on grand challenge we remove them from submissions when unzipping. When you run locally, you have .DS_Store files in both the input and the ground truth, but when you run on grand challenge the .DS_Store file is only in your ground-truth folder, as it was included at docker build time.

So, when you run locally everything matches up, but you have an extra row in your _cases data table, which I think you exclude by doing:

    if gt_path.suffix != '.gz' and gt_path.suffix != '.nii':
        return None

in score_case on line ~55. This should not be necessary, as the FileLoader should be doing the validation, so that if block can be removed too; see the sketch below.
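
With validation in the loader, score_case can simply trust its inputs. Something like this (a sketch; the Dice computation is an illustrative stand-in for your real metric):

    import SimpleITK


    # Method of the evaluation class (your ClassificationEvaluation subclass)
    def score_case(self, *, idx, case):
        gt_path = case["path_ground_truth"]
        pred_path = case["path_prediction"]

        # The loader has already rejected anything that isn't NIfTI,
        # so no suffix check is needed here
        gt = self._file_loader.load_image(gt_path)
        pred = self._file_loader.load_image(pred_path)

        # Illustrative metric: Dice overlap of the binarised masks
        overlap = SimpleITK.LabelOverlapMeasuresImageFilter()
        overlap.Execute(gt > 0, pred > 0)

        return {
            "DiceCoefficient": overlap.GetDiceCoefficient(),
            "pred_fname": pred_path.name,
            "gt_fname": gt_path.name,
        }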

When you run on grand challenge, the .DS_Store file is not there, so the predictions cannot be matched up:

                                      path_ground_truth  hash_prediction                    path_prediction     _merge
0                /opt/evaluation/ground-truth/.DS_Store           -1.+18    /input/A022_labeledMasks.nii.gz       both
1     /opt/evaluation/ground-truth/A022_labeledMasks...           -8.+18    /input/A030_labeledMasks.nii.gz       both
2     /opt/evaluation/ground-truth/A030_labeledMasks...           -3.+18    /input/A036_labeledMasks.nii.gz       both
3     /opt/evaluation/ground-truth/A036_labeledMasks...            8.+18    /input/A065_labeledMasks.nii.gz       both
4     /opt/evaluation/ground-truth/A065_labeledMasks...           -4.+17    /input/A075_labeledMasks.nii.gz       both
5     /opt/evaluation/ground-truth/A075_labeledMasks...            7.+18    /input/A099_labeledMasks.nii.gz       both
6     /opt/evaluation/ground-truth/A099_labeledMasks...            2.+18    /input/A101_labeledMasks.nii.gz       both
7     /opt/evaluation/ground-truth/A101_labeledMasks...           -5.+18    /input/A104_labeledMasks.nii.gz       both
8     /opt/evaluation/ground-truth/A104_labeledMasks...           -3.+18    /input/A107_labeledMasks.nii.gz       both
9     /opt/evaluation/ground-truth/A107_labeledMasks...            6.+18    /input/A109_labeledMasks.nii.gz       both
10    /opt/evaluation/ground-truth/A109_labeledMasks...           -1.+18    /input/A131_labeledMasks.nii.gz       both
11    /opt/evaluation/ground-truth/A131_labeledMasks...            6.+18    /input/A139_labeledMasks.nii.gz       both
12    /opt/evaluation/ground-truth/A139_labeledMasks...            3.+17    /input/A141_labeledMasks.nii.gz       both
13    /opt/evaluation/ground-truth/A141_labeledMasks...            1.+18  /input/A144_L_labeledMasks.nii.gz       both
14    /opt/evaluation/ground-truth/A144_L_labeledMas...           -4.+18  /input/A144_M_labeledMasks.nii.gz       both
15    /opt/evaluation/ground-truth/A144_M_labeledMas...            7.+18    /input/A145_labeledMasks.nii.gz       both
16    /opt/evaluation/ground-truth/A145_labeledMasks...           -1.+18    /input/A146_labeledMasks.nii.gz       both
17    /opt/evaluation/ground-truth/A146_labeledMasks...           -3.+18    /input/A147_labeledMasks.nii.gz       both
18    /opt/evaluation/ground-truth/A147_labeledMasks...            6.+18    /input/A148_labeledMasks.nii.gz       both
19    /opt/evaluation/ground-truth/A148_labeledMasks...            4.+18    /input/A149_labeledMasks.nii.gz       both
20    /opt/evaluation/ground-truth/A149_labeledMasks...            8.+18    /input/A150_labeledMasks.nii.gz       both
21    /opt/evaluation/ground-truth/A150_labeledMasks...           -9.+18    /input/A151_labeledMasks.nii.gz       both
22    /opt/evaluation/ground-truth/A151_labeledMasks...              NaN                                NaN  left_only

Even if we didn't remove the .DS_Store files, you would have hit this problem with any submission from a non-OSX user.
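
If it helps to see the mechanics, here is a small pandas sketch (not evalutils' actual code) of how one stray ground-truth entry shifts the row-wise match-up and leaves a left_only row:

    import pandas as pd

    # Hypothetical miniature of the _cases table: the ground truth
    # picked up a stray .DS_Store entry, the predictions did not
    gt = pd.DataFrame(
        {"path_ground_truth": [".DS_Store", "A022.nii.gz", "A030.nii.gz"]}
    )
    preds = pd.DataFrame({"path_prediction": ["A022.nii.gz", "A030.nii.gz"]})

    # Joining row by row leaves the final ground-truth row unmatched,
    # which shows up as _merge == "left_only"
    cases = gt.merge(
        preds, left_index=True, right_index=True, how="outer", indicator=True
    )
    print(cases)
    #   path_ground_truth path_prediction     _merge
    # 0         .DS_Store     A022.nii.gz       both
    # 1       A022.nii.gz     A030.nii.gz       both
    # 2       A030.nii.gz             NaN  left_only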

Hi James,

I suspected an issue with .DS_Store files; this makes it all clear to me now.
Thanks for your help, I really appreciate it!

No problem! Best of luck with your challenge.