Nan values for certain classes while testing
ashnair1 opened this issue · comments
I tried out the Mask RCNN framework and observed pretty good results. But when I calculate the mAP score via the test_net.py script, most of my classes output nan. Would you happen to know why this is the case? The detections the model produces on the test images look pretty good, hence my confusion about this error.
INFO json_dataset_evaluator.py: 222: ~~~~ Mean and per-category AP @ IoU=[0.50,0.95] ~~~~
INFO json_dataset_evaluator.py: 223: 12.0
INFO json_dataset_evaluator.py: 231: 44.6
/usr/local/onnx/numpy/core/fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/usr/local/onnx/numpy/core/_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 64.1
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 36.2
INFO json_dataset_evaluator.py: 231: 0.0
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 6.0
INFO json_dataset_evaluator.py: 231: 1.4
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 19.8
INFO json_dataset_evaluator.py: 231: 3.2
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 5.9
INFO json_dataset_evaluator.py: 231: 4.6
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 5.1
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 28.6
INFO json_dataset_evaluator.py: 231: 0.0
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 32.0
....................................................................
INFO json_dataset_evaluator.py: 232: ~~~~ Summary metrics ~~~~
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.120
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.180
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.129
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.068
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.232
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.207
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.059
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.135
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.166
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.082
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.318
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.246
Edit: To verify whether it was an issue with the code, I tried another dataset of mine that has only one class (excluding background). The evaluation works fine there, as can be seen below:
INFO json_dataset_evaluator.py: 222: ~~~~ Mean and per-category AP @ IoU=[0.50,0.95] ~~~~
INFO json_dataset_evaluator.py: 223: 38.6
INFO json_dataset_evaluator.py: 231: 38.6
INFO json_dataset_evaluator.py: 232: ~~~~ Summary metrics ~~~~
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.386
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.647
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.420
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.341
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.607
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.761
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.011
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.092
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.428
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.381
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.652
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.799
The issue is that I can't pin down the cause of the problem. If the problem were due to the dataset, it would have shown up in the detection results on the test images. Training and testing the same code on another dataset suggests the problem is not in the code either. I would appreciate any insight into this problem.
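One quick sanity check is to count ground-truth annotations per category in the validation split and look for classes with zero (or very few) instances. This is a minimal sketch assuming COCO-format annotations; the sample dict below is a toy stand-in for the real annotation file:

```python
import json
from collections import Counter

def category_counts(coco):
    """Count ground-truth annotations per category name in a
    COCO-format dict (e.g. the result of json.load on the
    validation annotation file)."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(a["category_id"] for a in coco["annotations"])
    # Report zeros explicitly so absent classes are visible.
    return {name: counts.get(cid, 0) for cid, name in names.items()}

# Toy example; a real check would load the actual annotation JSON,
# e.g. category_counts(json.load(open("instances_val.json"))).
sample = {
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "boat"}],
    "annotations": [{"category_id": 1}, {"category_id": 1}],
}
print(category_counts(sample))  # {'car': 2, 'boat': 0}
```

Any category reported as 0 here will have no ground truth to evaluate against, which is a likely source of nan per-category AP.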
System information
- Operating system: Ubuntu 16.04
- CUDA version: 10.0
- python version: 3.6.5
- pytorch version: 0.4.0
- numpy version: 1.14.0
This was due to the absence of certain classes in my validation data, and to the extremely low number of occurrences of others. The most likely explanation is that, for these classes, TP and FP were both zero, so computing precision (TP / (TP + FP)) divides by zero and yields nan.