Incorrect Evaluation Metrics for a Dataset Containing Images with No Text Instances
ShahJahanIshaq opened this issue
Currently, the DetDataset class skips images with no text instances when building the dataset. As a result, precision, and therefore f-score, is overestimated: any detection the model produces on an empty image is a false positive, but because those images are never evaluated, these false positives are never counted. No images should be removed from the test set.
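To illustrate the effect, here is a minimal sketch of how precision changes when empty images are dropped. The function name `detection_scores` and the per-image counts are made up for illustration; they are not taken from the library or from any real evaluation run.

```python
def detection_scores(images, include_empty):
    """Compute (precision, recall, f-score) over a list of images.

    Each image is a hypothetical tuple (num_gt, num_pred, num_matched):
    ground-truth boxes, predicted boxes, and predictions matched to a
    ground-truth box.
    """
    tp = fp = fn = 0
    for num_gt, num_pred, num_matched in images:
        if num_gt == 0 and not include_empty:
            continue  # mimics skipping images with no text instances
        tp += num_matched
        fp += num_pred - num_matched  # unmatched predictions
        fn += num_gt - num_matched    # missed ground-truth boxes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    fscore = 2 * precision * recall / total if total else 0.0
    return precision, recall, fscore


# Illustrative counts only: two images with text, two empty images
# on which the model still predicts boxes.
images = [(4, 4, 4), (2, 3, 2), (0, 2, 0), (0, 1, 0)]

skipped = detection_scores(images, include_empty=False)
kept = detection_scores(images, include_empty=True)
print("empty images skipped:", skipped)
print("empty images kept:   ", kept)
```

With these made-up counts, recall is the same in both cases, while precision is lower once the predictions on empty images are counted as false positives.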
To fix this issue, we should modify the DetDataset class to keep images with no text instances in the dataset, so that the model is evaluated on the complete test sets.
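A rough sketch of what the change could look like is below. This is not the actual DetDataset code: `build_samples`, the `keep_empty` flag, and the annotation layout are all hypothetical and only show the idea of keeping an image with an empty list of boxes instead of dropping it.

```python
def build_samples(annotations, keep_empty=True):
    """Build a list of samples from a {image_path: boxes} mapping.

    Hypothetical helper, not the library's API. With keep_empty=True,
    images without text instances are kept with an empty box list.
    """
    samples = []
    for image_path, boxes in annotations.items():
        if not boxes and not keep_empty:
            continue  # old behaviour: drop images with no text instances
        samples.append({"image": image_path, "boxes": list(boxes)})
    return samples


# Hypothetical annotations: one of the three images has no text.
annotations = {
    "img_001.jpg": [[0, 0, 10, 10]],
    "img_002.jpg": [],
    "img_003.jpg": [[5, 5, 20, 20], [30, 30, 40, 40]],
}

print(len(build_samples(annotations, keep_empty=False)))
print(len(build_samples(annotations, keep_empty=True)))
```

Keeping the flag rather than removing the filter outright would let training keep the old behaviour if desired, while evaluation uses every image.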