Lyken17 / Efficient-PyTorch

My best practice of training large dataset using PyTorch.

add attributes like imgs and classes in ImageFolder()

flyingmrwang opened this issue · comments

This seems really helpful for improving I/O for PyTorch datasets. But I also noticed that the return value is not in quite the same format as ImageFolder() when I simply swap it into a training script. Do you plan to make ImageFolderLMDB expose the following attributes as well?

    classes (list): List of the class names.
    class_to_idx (dict): Dict with items (class_name, class_index).
    imgs (list): List of (image path, class_index) tuples

https://pytorch.org/docs/stable/_modules/torchvision/datasets/folder.html#ImageFolder

You mean add documentation? That is a good suggestion, but I currently do not have the bandwidth since I am traveling. It would be great if you could submit a PR.

No, I am just replacing my ImageFolder with ImageFolderLMDB. My upcoming DataLoader requires a sampler, which analyzes the distribution of classes. The original ImageFolder exposes the whole list of samples in addition to the iterator, while yours only provides the iterator. So I was wondering whether it would be possible to add those attributes. I have already found a lazy way to fit it into my pipeline, so this is just a suggestion for improvement if you have time in the future.
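For anyone with the same need, one possible workaround is sketched below. It is not the repository's implementation and not necessarily the "lazy way" mentioned above. It assumes only that the dataset is map-style (supports `len()` and integer indexing) and returns `(sample, label)` pairs with integer labels. The names `LabelIndexedDataset`, `class_counts`, and `sample_weights` are hypothetical, and `imgs` is left out because the original file paths may not be stored in the LMDB file.

```python
from collections import Counter


class LabelIndexedDataset:
    """Hypothetical wrapper adding ImageFolder-style metadata to a map-style dataset."""

    def __init__(self, dataset, class_names=None):
        self.dataset = dataset
        # One full pass to collect labels. This reads every sample, so for a
        # large dataset the result should be cached rather than recomputed.
        self.targets = [int(dataset[i][1]) for i in range(len(dataset))]
        n_classes = max(self.targets) + 1 if self.targets else 0
        # Without the original folder names, fall back to stringified indices.
        if class_names is not None:
            self.classes = list(class_names)
        else:
            self.classes = [str(i) for i in range(n_classes)]
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        return self.dataset[index]

    def class_counts(self):
        # Number of samples per class index.
        return Counter(self.targets)

    def sample_weights(self):
        # Inverse class frequency: rarer classes get larger per-sample weights.
        counts = self.class_counts()
        return [1.0 / counts[t] for t in self.targets]


# Toy stand-in for an LMDB-backed dataset: three samples of class 0, one of class 1.
toy = [("img0", 0), ("img1", 0), ("img2", 0), ("img3", 1)]
wrapped = LabelIndexedDataset(toy, class_names=["cat", "dog"])
weights = wrapped.sample_weights()
```

The resulting `weights` list could then be handed to `torch.utils.data.WeightedRandomSampler(weights, num_samples=len(weights))` to rebalance classes during training. The label scan is the expensive part, since it decodes every image once; storing the labels under a separate key when the LMDB file is built would avoid it.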