albumentations-team / autoalbument

AutoML for image augmentation. AutoAlbument uses the Faster AutoAugment algorithm to find optimal augmentation policies. Documentation - https://albumentations.ai/docs/autoalbument/

Home Page: https://albumentations.ai/docs/autoalbument/

Fix examples and input shapes

jwitos opened this issue

Hi, I noticed two issues with the docs / comments:

  1. The PascalVOC example (https://albumentations.ai/docs/autoalbument/examples/pascal_voc/) is missing the `_target_: autoalbument.faster_autoaugment.models.SemanticSegmentationModel` line in the `semantic_segmentation_model` section. The search fails without it.
  2. When you generate a new dataset.py file, the comments say that "mask should be a NumPy array with the shape [height, width, num_classes]" and "image should be a NumPy array with the shape [height, width, num_channels]". However, it looks like channels should come first, i.e. [channels, height, width]; that was the only combination that worked for me (see the snippet after this list).
    Also, I think the comment "If an image contains three color channels" could be rephrased -- it suggests that e.g. single-channel images are accepted, but in fact the input currently seems to always require 3 channels.
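
To make point 2 concrete, here is roughly the shape situation I ran into (illustrative arrays, not my actual loading code):

```python
import numpy as np

# Shapes the generated dataset.py comments describe:
image_hwc = np.zeros((256, 256, 3), dtype=np.uint8)    # [height, width, num_channels]
mask_hwc = np.zeros((256, 256, 21), dtype=np.float32)  # [height, width, num_classes]

# What actually worked for me was returning channels-first arrays:
image_chw = image_hwc.transpose(2, 0, 1)  # [num_channels, height, width]
mask_chw = mask_hwc.transpose(2, 0, 1)    # [num_classes, height, width]
```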

Thanks

Hey, @jwitos, thanks for the report!

  1. Yes, the docs are currently outdated. I am planning to rework them soon and automatically publish the actual configs from the repo.
  2. Could you please provide an example of your dataset.py? AutoAlbument expects that images and masks returned by the dataset have the shape [height, width, num_channels]. AutoAlbument then creates a transformation function using this method. That function contains the ToTensorV2 transform from Albumentations, whose purpose is to change the NumPy array dimensions from [height, width, num_channels] to [num_channels, height, width] and convert the array to a PyTorch Tensor (basically, to convert a regular NumPy image or mask array into the format expected by PyTorch). The dataset implementation should apply that transform function to all images and masks it returns (e.g., https://github.com/albumentations-team/autoalbument/blob/master/examples/pascal_voc/dataset.py#L86); a minimal sketch follows below.
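
For illustration, a dataset along those lines could look like this (a minimal sketch only; the shapes and the placeholder loading code are not from the real example, see the linked PascalVOC dataset.py for a complete implementation):

```python
import numpy as np
import torch.utils.data


class SearchDataset(torch.utils.data.Dataset):
    def __init__(self, transform=None):
        # `transform` is created by AutoAlbument and already contains ToTensorV2.
        self.transform = transform

    def __len__(self):
        return 1  # replace with the real dataset size

    def __getitem__(self, index):
        # Return plain NumPy arrays with the shape [height, width, num_channels];
        # placeholders are used here instead of real image/mask loading.
        image = np.zeros((256, 256, 3), dtype=np.uint8)
        mask = np.zeros((256, 256, 21), dtype=np.float32)

        if self.transform is not None:
            # ToTensorV2 inside `transform` reorders the image to
            # [num_channels, height, width] and converts it to a PyTorch tensor.
            transformed = self.transform(image=image, mask=mask)
            image = transformed["image"]
            mask = transformed["mask"]
        return image, mask
```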

> Also, I think the comment "If an image contains three color channels" could be rephrased -- it suggests that e.g. single-channel images are accepted, but in fact the input currently seems to always require 3 channels.

Yes, I will rephrase it, thanks. In fact, it is possible to use single-channel images, but then you need to define a custom model that works with those single-channel images. I am planning to document such an option.
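
Roughly, such a custom model could look like the sketch below (assuming segmentation_models_pytorch, which the bundled segmentation model uses; the class name and constructor arguments here are illustrative, not a documented API). The idea would be to point `_target_` in the config at a class like this instead of the built-in one.

```python
import segmentation_models_pytorch as smp
import torch.nn as nn


class SingleChannelSegmentationModel(nn.Module):
    """Illustrative model that accepts single-channel (e.g. grayscale) input."""

    def __init__(self, num_classes, encoder_architecture="resnet18", pretrained=True):
        super().__init__()
        # in_channels=1 makes the encoder expect single-channel images.
        self.model = smp.Unet(
            encoder_name=encoder_architecture,
            encoder_weights="imagenet" if pretrained else None,
            in_channels=1,
            classes=num_classes,
        )

    def forward(self, x):
        return self.model(x)
```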

  1. Fixed. The documentation at https://albumentations.ai/docs/autoalbument/examples/list/ now contains the latest versions of the configs from the repository.