visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

[Bug]: Invalid bounding boxes

dnth opened this issue

What happened?

I ran the Roboflow notebook and found that fastdup flagged some of the bounding boxes invalid.

Here are a few of them

image

From my understanding, boxes are flagged as invalid when they extend beyond the image. In this dataset, all images are 416x416. Clearly, none of the bounding boxes above extend beyond the image.
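One way to check that understanding independently of fastdup is a small bounds test. This is only a sketch: the function name and the (x, y, width, height) pixel format with a top-left origin are assumptions made for illustration, not fastdup's internal check.

```python
def box_within_image(x, y, w, h, img_w=416, img_h=416):
    """Return True if a box lies fully inside the image.

    Assumes (x, y) is the top-left corner and (w, h) the box size,
    all in pixels. The 416x416 default matches the dataset in this issue.
    """
    has_area = w > 0 and h > 0
    starts_inside = x >= 0 and y >= 0
    ends_inside = x + w <= img_w and y + h <= img_h
    return has_area and starts_inside and ends_inside


# Example usage with made-up boxes
print(box_within_image(100, 100, 50, 50))   # box well inside the image
print(box_within_image(400, 400, 50, 50))   # box crossing the right/bottom edge
```

A box that passes this test can still be rejected for other reasons, such as being too small, so a passing result does not by itself rule out an "invalid" flag.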

Why are they flagged as invalid boxes in fastdup?

What did you expect to see?

An explanation of why the boxes are flagged as invalid.

What version of fastdup were you running on?

1.38

What version of Python were you running on?

Python 3.10

Operating System

Ubuntu 22.04

Reproduction steps

Run the Roboflow notebook.

Relevant log output

No response

Attach a screenshot [Optional]

No response

Contact Details [Optional]

No response

@dnth please run with verbose=1 and send me the full trace. Thanks.

Hi @dnth, I see the following:

2023-09-12 13:48:13 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/untitled-2-1-_jpg.rf.2cc0804ad0a8d457291b39d72b32ea1f.jpg 0 265 0 61, bounding box is too small, 9 10 skipping.
2023-09-12 13:48:15 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/IMG_1282_jpg.rf.e5d454a356160bd444b71e19c9350f72.jpg 0 368 0 47, bounding box is too small, 3 10 skipping.
2023-09-12 13:48:15 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/IMG_1282_jpg.rf.f669ecf808bd1721e724ef14c7ede206.jpg 395 406 395 9, bounding box is too small, 21 10 skipping.
2023-09-12 13:48:30 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Almond_13-Original-Size-_jpg.rf.61e38e0da9a2aa99b20ff801c44a47dd.jpg 409 285 409 40, bounding box is too small, 6 10 skipping.
2023-09-12 13:48:45 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/carrot_93_jpg.rf.a242faf4608f26c230f476b5feb9f27c.jpg 0 324 0 45, bounding box is too small, 2 10 skipping.
2023-09-12 13:49:16 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/pistachios-1570x1047_jpg.rf.9b4218a77ea0530cac6db0abb3491392.jpg 0 295 0 77, bounding box is too small, 8 10 skipping.
2023-09-12 13:49:32 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/cabbage-green-50lb-case_jpg.rf.9b8d2677a0a8c7c0d4ceb961cc91aaab.jpg 333 346 333 1, bounding box is too small, 2 10 skipping.
2023-09-12 13:49:32 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/cabbage-green-50lb-case_jpg.rf.5459cd9f2b964dc5e84fb29905f95353.jpg 361 384 361 1, bounding box is too small, 2 10 skipping.
2023-09-12 13:49:32 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/cabbage-green-50lb-case_jpg.rf.546e265a2b83cd03e7d5d7c98c2d51ef.jpg 28 331 28 2, bounding box is too small, 1 10 skipping.
2023-09-12 13:49:32 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/cabbage-green-50lb-case_jpg.rf.6887156a700058674a259ce271ea74fe.jpg 40 333 40 2, bounding box is too small, 1 10 skipping.
2023-09-12 13:49:33 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/cabbage-green-50lb-case_jpg.rf.f83f582b75b3ba9ed396004997edc7a2.jpg 364 92 364 2, bounding box is too small, 1 10 skipping.
2023-09-12 13:49:45 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Pomegranates-shutterstock_152360780_jpg.rf.8c5c9586aaa67739a99211a76fc21359.jpg 409 39 409 30, bounding box is too small, 6 10 skipping.
2023-09-12 13:49:47 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Pomegranates-shutterstock_152360780_jpg.rf.71c53b5feee5f6677bddb176380f2b6c.jpg 44 0 44 9, bounding box is too small, 30 10 skipping.
2023-09-12 13:50:17 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/OIP-24-_jpg.rf.23fc31cbd35c1de823b99968a19e6d2c.jpg 411 290 411 99, bounding box is too small, 4 10 skipping.
2023-09-12 13:50:19 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/90d098b212fc9b6d28ae069e150f8f75_jpg.rf.fadbb69263e413a39ed153caf9d23e0c.jpg 412 259 412 75, bounding box is too small, 4 10 skipping.
2023-09-12 13:50:39 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Potato_38_jpg.rf.2ba6902291f27de4faa2b53291f7a873.jpg 0 197 0 46, bounding box is too small, 9 10 skipping.
2023-09-12 13:50:39 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Potato_38_jpg.rf.3315d9c2535e1b2eb1e1b9045e4979f1.jpg 0 196 0 46, bounding box is too small, 7 10 skipping.
2023-09-12 13:50:54 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Single-Blackberries_28_jpg.rf.465b276617c3780db8a717ce58c49ba9.jpg 122 0 122 9, bounding box is too small, 38 10 skipping.
2023-09-12 13:50:55 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Egg-images_43_jpg.rf.28d73a7161b833d02ef40017086a9948.jpg 195 27 195 1, bounding box is too small, 1 10 skipping.
2023-09-12 13:50:56 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Egg-images_43_jpg.rf.0c240a42a1d5a122bf0df56c6737ce1f.jpg 195 394 195 1, bounding box is too small, 1 10 skipping.
2023-09-12 13:50:56 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Egg-images_43_jpg.rf.0d7f52ff01b12da4a0e1860cef3755fa.jpg 190 403 190 1, bounding box is too small, 1 10 skipping.
2023-09-12 13:50:57 [DEBUG] Found bad bounding box2 for image DASH-DIET-101-4/train/Egg-images_43_jpg.rf.76de2b114b9b114cfd8262ba6a9b6697.jpg 26 220 26 1, bounding box is too small, 1 10 skipping.
$ grep "bounding box is too small" ~/Downloads/output.txt | wc
      22     521    4606

There are 22 bounding boxes whose size is 10 or less. If you would like to process all bounding boxes even though they are too small, you can run with augmentation_additive_margin=10.
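For anyone who wants to inspect these log lines beyond a raw count, here is a minimal Python sketch that parses them. The regular expression is written against the DEBUG lines quoted above; the helper names are hypothetical, and the meaning of the four box numbers and the two trailing integers is not confirmed here, so the code only captures them without interpreting them.

```python
import re
from collections import Counter

# Pattern for the "bounding box is too small" DEBUG lines quoted in this thread.
# Assumption: image paths contain no spaces, as in the lines above.
PATTERN = re.compile(
    r"Found bad bounding box\d* for image (?P<image>\S+) "
    r"(?P<box>\d+ \d+ \d+ \d+), bounding box is too small, "
    r"(?P<a>\d+) (?P<b>\d+) skipping"
)


def parse_small_box_lines(lines):
    """Return one record per matching log line; non-matching lines are ignored."""
    records = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            records.append({
                "image": m["image"],
                "box": tuple(int(v) for v in m["box"].split()),
                # The two integers printed before "skipping", kept as-is.
                "reported": (int(m["a"]), int(m["b"])),
            })
    return records


def count_per_image(records):
    """Count how many skipped boxes each image has."""
    return Counter(r["image"] for r in records)


# Example usage: replace the path with your own verbose log file.
# with open("output.txt") as f:
#     records = parse_small_box_lines(f)
# print(len(records), count_per_image(records).most_common(5))
```

Grouping by image makes it easier to open the affected files and compare the skipped boxes with what the invalid instances display shows.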

P.S. Let me know if there is a bug with the invalid instances display.

If anyone encounters this issue in the future, it might be the case that the filename is too long. Try to shorten the filename and see if the issue persists.