Incorrect Evaluation Metrics for a Dataset Containing Images with No Text Instances

Question

ShahJahanIshaq opened this issue a year ago

Currently, the DetDataset class skips images with no text instances when building the dataset. As a result, precision (and therefore F-score) is overestimated: any detection a model makes on an empty image is a false positive, but those false positives are never counted because the empty images are dropped before evaluation. No images should be removed from the test set.
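To make the effect concrete, here is a small sketch with made-up counts (they are illustrative assumptions, not results from any real model or dataset). It computes precision, recall and F-score from aggregated true positives, false positives and false negatives, once with the empty images skipped and once with them included. The function name `precision_recall_fscore` is my own and is not part of the library.

```python
def precision_recall_fscore(tp, fp, fn):
    """Return (precision, recall, f-score) from aggregated detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    fscore = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, fscore


# Hypothetical counts summed over the images that contain text.
tp, fp, fn = 8, 2, 2
# Hypothetical detections on empty images: every one is a false positive.
fp_on_empty = 3

skipped = precision_recall_fscore(tp, fp, fn)
included = precision_recall_fscore(tp, fp + fp_on_empty, fn)

print("empty images skipped: ", skipped)
print("empty images included:", included)
```

With these numbers recall is unchanged at 0.80, while precision drops from 0.80 to about 0.62 and F-score from 0.80 to about 0.70 once the empty images are counted.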

To fix this issue, we should modify the DetDataset class so that it keeps images with no text instances, and then evaluate the model on the complete test set.
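A rough sketch of what I mean, written as a standalone function rather than against the actual DetDataset code. The function name `select_samples`, the `keep_empty` flag, and the annotation layout (a list of dicts with `img_path` and `instances` keys) are all assumptions for illustration and may not match the real implementation.

```python
def select_samples(annotations, keep_empty=True):
    """Pick the samples to evaluate on.

    `annotations` is assumed to be a list of dicts, each with an "img_path"
    string and an "instances" list (empty for images without text).
    """
    if keep_empty:
        # Keep every image so detections on empty images count as false positives.
        return list(annotations)
    # Behaviour described in this issue: images without text are dropped.
    return [ann for ann in annotations if ann["instances"]]


# Hypothetical annotations: two images with text, one without.
annotations = [
    {"img_path": "img_1.jpg", "instances": [{"polygon": [0, 0, 10, 0, 10, 5, 0, 5]}]},
    {"img_path": "img_2.jpg", "instances": []},
    {"img_path": "img_3.jpg", "instances": [{"polygon": [2, 2, 8, 2, 8, 6, 2, 6]}]},
]

test_samples = select_samples(annotations, keep_empty=True)
filtered_samples = select_samples(annotations, keep_empty=False)
print(len(test_samples), len(filtered_samples))
```

Keeping the filter behind a flag would let training code keep skipping empty images if that is wanted, while evaluation always sees the full test set.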