Lightning-Universe / lightning-flash

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains

Home Page: https://lightning-flash.readthedocs.io

`ObjectDetectionData.from_coco`: `transform_kwargs` and `image_size` error

just-eoghan opened this issue

🐛 Bug

Using `ObjectDetectionData.from_coco` with the argument `transform_kwargs=dict(image_size=(256, 256))` results in an error after training has run for some time.

Without the transform, training does not error.

I am using custom datasets but these don't have any issues with labels or images. I have checked all images and labels for correctness.

This seems to be an intermittent issue (it does happen, but the epoch at which it occurs varies), so it is hard to reproduce.

I would be interested to know whether anyone else has experienced this issue.

Error

cv2.error: OpenCV(4.5.4) /tmp/pip-req-build-khv2fx3p/opencv/modules/imgproc/src/resize.cpp:4051: error: (-215:Assertion failed) !ssize.empty() in function 'resize' 
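For context, OpenCV raises the `!ssize.empty()` assertion when the source array passed to `resize` is empty, i.e. has zero width or height. The following is a minimal sketch of a pre-resize guard, assuming image shapes are `(height, width, ...)` tuples such as a NumPy array's `.shape`. The helper name `has_empty_spatial_dims` is hypothetical and is not part of Flash, IceVision, or Albumentations.

```python
from typing import Sequence


def has_empty_spatial_dims(shape: Sequence[int]) -> bool:
    """Return True if an image shape (height, width, ...) contains no pixels."""
    if len(shape) < 2:
        # Fewer than two dimensions: not a usable image for resizing.
        return True
    height, width = shape[0], shape[1]
    return height == 0 or width == 0


# Hypothetical shapes: a normal image and a degenerate zero-height crop.
print(has_empty_spatial_dims((256, 256, 3)))
print(has_empty_spatial_dims((0, 256, 3)))
```

Running such a check on an image's shape before it reaches `cv2.resize` could help identify which sample triggers the assertion, though it would not by itself explain why the input is empty.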

To Reproduce

Run the code sample

Stack Trace

[2023-02-06 21:42:04,482][pytorch_lightning.accelerators.cuda][INFO] - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

Trainable params: 3.8 M
Non-trainable params: 0
Total params: 3.8 M
Total estimated model params size (MB): 15
Epoch 0/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:03 • 0:00:49 6.09it/s loss: 0.858 v_num: wq06 train_loss: 0.692 train_class_loss: 0.456 train_box_loss: 0.005
Epoch 1/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:05 • 0:00:51 5.79it/s loss: 0.684 v_num: wq06 train_loss: 0.829 train_class_loss: 0.508 train_box_loss: 0.006
Epoch 2/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:05 • 0:00:52 5.69it/s loss: 0.612 v_num: wq06 train_loss: 0.562 train_class_loss: 0.349 train_box_loss: 0.004
Epoch 3/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:06 • 0:00:53 5.62it/s loss: 0.593 v_num: wq06 train_loss: 0.556 train_class_loss: 0.327 train_box_loss: 0.005
Epoch 4/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:05 • 0:00:52 5.73it/s loss: 0.611 v_num: wq06 train_loss: 0.762 train_class_loss: 0.341 train_box_loss: 0.008
Epoch 5/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:05 • 0:00:51 5.80it/s loss: 0.579 v_num: wq06 train_loss: 0.49 train_class_loss: 0.332 train_box_loss: 0.003
Epoch 6/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:04 • 0:00:50 5.92it/s loss: 0.554 v_num: wq06 train_loss: 0.73 train_class_loss: 0.463 train_box_loss: 0.005
Epoch 7/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:55 5.45it/s loss: 0.486 v_num: wq06 train_loss: 0.418 train_class_loss: 0.275 train_box_loss: 0.003
Epoch 8/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:06 • 0:00:52 5.69it/s loss: 0.507 v_num: wq06 train_loss: 0.473 train_class_loss: 0.306 train_box_loss: 0.003
Epoch 9/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:04 • 0:00:51 5.90it/s loss: 0.573 v_num: wq06 train_loss: 0.424 train_class_loss: 0.272 train_box_loss: 0.003
Epoch 9/-2 ━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━ 242/687 0:00:39 • 0:01:13 6.12it/s loss: 0.538 v_num: wq06 train_loss: 0.572 train_class_loss: 0.317 train_box_loss: 0.005
Epoch 10/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:06 • 0:00:52 5.70it/s loss: 0.499 v_num: wq06 train_loss: 0.457 train_class_loss: 0.29 train_box_loss: 0.003
Epoch 11/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:06 • 0:00:52 5.69it/s loss: 0.497 v_num: wq06 train_loss: 0.397 train_class_loss: 0.247 train_box_loss: 0.003
Epoch 12/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:06 • 0:00:52 5.68it/s loss: 0.594 v_num: wq06 train_loss: 0.418 train_class_loss: 0.27 train_box_loss: 0.003
Epoch 13/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:08 • 0:00:54 5.49it/s loss: 0.525 v_num: wq06 train_loss: 0.465 train_class_loss: 0.304 train_box_loss: 0.003
Epoch 13/-2 ━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━ 419/687 0:01:35 • 0:04:36 0.97it/s loss: 0.525 v_num: wq06 train_loss: 0.465 train_class_loss: 0.304 train_box_loss: 0.003
Epoch 14/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 393/687 0:01:08 • 0:00:54 5.45it/s loss: 0.489 v_num: wq06 train_loss: 0.512 train_class_loss: 0.35 train_box_loss: 0.003
Epoch 15/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.73it/s loss: 0.503 v_num: wq06 train_loss: 0.362 train_class_loss: 0.254 train_box_loss: 0.002
Epoch 16/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:05 • 0:00:52 5.78it/s loss: 0.508 v_num: wq06 train_loss: 0.475 train_class_loss: 0.293 train_box_loss: 0.004
Epoch 17/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:53 5.62it/s loss: 0.499 v_num: wq06 train_loss: 0.432 train_class_loss: 0.252 train_box_loss: 0.004
Epoch 18/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:07 • 0:00:52 5.76it/s loss: 0.475 v_num: wq06 train_loss: 0.471 train_class_loss: 0.298 train_box_loss: 0.003
Epoch 19/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.54it/s loss: 0.486 v_num: wq06 train_loss: 0.534 train_class_loss: 0.349 train_box_loss: 0.004
Epoch 20/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.76it/s loss: 0.465 v_num: wq06 train_loss: 0.402 train_class_loss: 0.282 train_box_loss: 0.002
Epoch 21/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.77it/s loss: 0.463 v_num: wq06 train_loss: 0.446 train_class_loss: 0.262 train_box_loss: 0.004
Epoch 22/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:53 5.63it/s loss: 0.482 v_num: wq06 train_loss: 0.474 train_class_loss: 0.309 train_box_loss: 0.003
Epoch 23/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:05 • 0:00:52 5.78it/s loss: 0.493 v_num: wq06 train_loss: 0.451 train_class_loss: 0.272 train_box_loss: 0.004
Epoch 24/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:53 5.60it/s loss: 0.451 v_num: wq06 train_loss: 0.388 train_class_loss: 0.254 train_box_loss: 0.003
Epoch 24/-2 ━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176/687 0:00:29 • 0:01:25 6.03it/s loss: 0.458 v_num: wq06 train_loss: 0.508 train_class_loss: 0.303 train_box_loss: 0.004
Epoch 25/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.76it/s loss: 0.452 v_num: wq06 train_loss: 0.43 train_class_loss: 0.276 train_box_loss: 0.003
Epoch 26/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.78it/s loss: 0.453 v_num: wq06 train_loss: 0.377 train_class_loss: 0.247 train_box_loss: 0.003
Epoch 27/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.78it/s loss: 0.472 v_num: wq06 train_loss: 0.417 train_class_loss: 0.265 train_box_loss: 0.003
Epoch 28/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.70it/s loss: 0.459 v_num: wq06 train_loss: 0.439 train_class_loss: 0.27 train_box_loss: 0.003
Epoch 29/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:53 5.58it/s loss: 0.474 v_num: wq06 train_loss: 0.585 train_class_loss: 0.346 train_box_loss: 0.005
Epoch 30/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:12 • 0:00:52 5.70it/s loss: 0.478 v_num: wq06 train_loss: 0.56 train_class_loss: 0.221 train_box_loss: 0.007
Epoch 31/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.50it/s loss: 0.417 v_num: wq06 train_loss: 0.552 train_class_loss: 0.334 train_box_loss: 0.004
Epoch 32/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.56it/s loss: 0.416 v_num: wq06 train_loss: 0.296 train_class_loss: 0.218 train_box_loss: 0.002
Epoch 32/-2 ━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━ 337/687 0:00:56 • 0:00:59 6.01it/s loss: 0.494 v_num: wq06 train_loss: 0.408 train_class_loss: 0.267 train_box_loss: 0.003
Epoch 33/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.55it/s loss: 0.431 v_num: wq06 train_loss: 0.548 train_class_loss: 0.405 train_box_loss: 0.003
Epoch 34/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 391/687 0:01:06 • 0:00:52 5.72it/s loss: 0.413 v_num: wq06 train_loss: 0.454 train_class_loss: 0.297 train_box_loss: 0.003
Epoch 35/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.52it/s loss: 0.439 v_num: wq06 train_loss: 0.314 train_class_loss: 0.194 train_box_loss: 0.002
Epoch 36/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:07 • 0:00:54 5.51it/s loss: 0.445 v_num: wq06 train_loss: 0.6 train_class_loss: 0.365 train_box_loss: 0.005
Epoch 37/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:08 • 0:00:55 5.45it/s loss: 0.432 v_num: wq06 train_loss: 0.374 train_class_loss: 0.239 train_box_loss: 0.003
Epoch 38/-2 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 392/687 0:01:08 • 0:00:54 5.50it/s loss: 0.423 v_num: wq06 train_loss: 0.531 train_class_loss: 0.312 train_box_loss: 0.004
Epoch 39/-2 ━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━ 338/687 0:00:56 • 0:00:59 5.97it/s loss: 0.468 v_num: wq06 train_loss: 0.434 train_class_loss: 0.257 train_box_loss: 0.004

  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1205, in _run_train
    self.fit_loop.run()
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 187, in advance
    batch = next(data_fetcher)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
    return self.fetching_function()
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 265, in fetching_function
    self._fetch_next_batch(self.dataloader_iter)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 280, in _fetch_next_batch
    batch = next(iterator)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/trainer/supporters.py", line 569, in __next__
    return self.request_next_batch(self.loader_iters)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/pytorch_lightning/trainer/supporters.py", line 581, in request_next_batch
    return apply_to_collection(loader_iters, Iterator, next)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 47, in apply_to_collection
    return function(data, *args, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/flash/core/data/io/input_transform.py", line 801, in __call__
    transformed_samples = [self.per_sample_transform(sample, self.stage) for sample in list_samples]
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/flash/core/data/io/input_transform.py", line 801, in <listcomp>
    transformed_samples = [self.per_sample_transform(sample, self.stage) for sample in list_samples]
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/flash/core/data/io/input_transform.py", line 619, in _per_sample_transform
    return fn(sample)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/flash/core/integrations/icevision/transforms.py", line 280, in forward
    record = self.transform(record)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/icevision/tfms/transform.py", line 11, in __call__
    return self.apply(record)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/icevision/tfms/albumentations/albumentations_adapter.py", line 282, in apply
    self._albu_out = tfms(**self._albu_in)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/core/composition.py", line 191, in __call__
    data = t(force_apply=force_apply, **data)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/core/composition.py", line 341, in __call__
    return self.transforms[0](force_apply=True, **data)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 90, in __call__
    return self.apply_with_params(params, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 103, in apply_with_params
    res[key] = target_function(arg, **dict(params, **target_dependencies))
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/augmentations/crops/transforms.py", line 469, in apply
    return FGeometric.resize(crop, self.height, self.width, interpolation)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/augmentations/functional.py", line 70, in wrapped_function
    result = func(img, *args, **kwargs)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/augmentations/geometric/functional.py", line 269, in resize
    return resize_fn(img)
  File "/home/eoghan/miniconda3/envs/flash-detectors/lib/python3.8/site-packages/albumentations/augmentations/functional.py", line 189, in __process_fn
    img = process_fn(img, **kwargs)
cv2.error: OpenCV(4.5.4) /tmp/pip-req-build-khv2fx3p/opencv/modules/imgproc/src/resize.cpp:4051: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Code sample

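  import flash
  from flash.image import ObjectDetectionData, ObjectDetector

  # 1. Create the DataModule (train/val arguments omitted, as in the original report)
  datamodule = ObjectDetectionData.from_coco(
      # assume train, val specified
      transform_kwargs=dict(image_size=(256, 256)),
      batch_size=X,  # X: placeholder for the batch size
  )

  # 2. Build the task
  model = ObjectDetector(head="efficientdet", backbone="d0", num_classes=2, image_size=256)

  # 3. Create the trainer and fine-tune the model
  trainer = flash.Trainer(max_epochs=N)  # N: placeholder for the number of epochs
  trainer.fit(model=model, datamodule=datamodule)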

Expected behavior

Training should complete without raising the OpenCV resize error.

Environment

  • OS (e.g., Linux): Ubuntu 20.04.5 LTS
  • Python version: 3.8.16
  • PyTorch/Lightning/Flash Version: 1.10.0/1.90/0.8.1.post0
  • Albumentations/CV2 Version: 1.3.0/4.5.5.64
  • GPU models and configuration: RTX 2080 Super

This error appears to have been fixed by upgrading OpenCV.

From

opencv-python-headless==4.5.4.60

To

opencv-python-headless==4.7.0.68
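To check whether an environment already has the newer build, something like the following could be used. This is a minimal sketch using only the standard library. The helper names `version_tuple` and `installed_version` are hypothetical, and the comparison assumes purely numeric dotted versions such as the two listed above.

```python
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Convert a purely numeric dotted version such as '4.7.0.68' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


MIN_OPENCV = "4.7.0.68"  # the version reported above as working

found = installed_version("opencv-python-headless")
if found is None:
    print("opencv-python-headless is not installed")
elif version_tuple(found) < version_tuple(MIN_OPENCV):
    print(f"opencv-python-headless {found} is older than {MIN_OPENCV}")
else:
    print(f"opencv-python-headless {found} is at least {MIN_OPENCV}")
```

Tuples of integers compare element by element, so `4.5.4.60` sorts before `4.7.0.68` without relying on string ordering.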

Nice finding! Could you please send a PR adding this opencv-python requirement for the image domain? 🐿️