Request for Sampler analogue like `tf.keras.utils.image_dataset_from_directory`
HandcartCactus opened this issue
`tf.keras.utils.image_dataset_from_directory` takes a directory structured like this:
main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
and returns a `tf.data.Dataset` object. I have a similar dataset and I'd love to be able to use a Sampler on it.
I've already written most of my own implementation, but I'm still optimizing some of the code because I'm running into a tradeoff between memory usage and disk read speed. So with permission, I'd love to claim this issue so I can contribute what I've written and maybe get someone more experienced to look it over.
Hi Ejjaffe,
PR #307 from AminHP just added something similar to this as `MultiShotFileSampler`. It uses the same base as the in-memory `MultiShotMemorySampler` but accepts a custom function for loading the images. Here we assume that your x examples are paths to the images and that the loading function will read and prepare the images.
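To make the "x examples are paths" part concrete, here is a minimal sketch of how the class-per-subdirectory layout from the original post could be flattened into a list of paths and a matching list of integer labels. The helper name `list_image_paths_and_labels` and the `IMAGE_EXTENSIONS` set are hypothetical and not part of the library; it only uses the standard library.

```python
from pathlib import Path

# Hypothetical set of extensions to treat as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def list_image_paths_and_labels(main_directory):
    """Return (paths, labels, class_names) for a class-per-subdirectory layout.

    This is an illustrative helper, not a library function.
    """
    root = Path(main_directory)
    # Sort so that the class-name-to-index mapping is deterministic.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())

    paths, labels = [], []
    for label, class_name in enumerate(class_names):
        for file in sorted((root / class_name).iterdir()):
            # Skip non-image files such as notes or hidden metadata.
            if file.is_file() and file.suffix.lower() in IMAGE_EXTENSIONS:
                paths.append(str(file))
                labels.append(label)
    return paths, labels, class_names
```

The resulting `paths` and `labels` would then play the role of x and y, with the loading function responsible for turning each path into an image.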
NOTE: loading images this way may slow down batch generation, depending on how much preprocessing you do within each call to the loading function.
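One possible way to trade some memory for fewer disk reads is to put a bounded cache in front of the loading function. The sketch below is only an illustration under a few assumptions: `make_cached_loader` and `max_cached_items` are hypothetical names, the loading function is assumed to receive each path as a plain hashable value such as a `str`, and it is assumed to be deterministic (caching would freeze any random augmentation done inside it).

```python
from functools import lru_cache


def make_cached_loader(load_fn, max_cached_items=1024):
    """Wrap load_fn so repeated requests for the same path skip the disk read.

    Illustrative helper, not a library function. At most `max_cached_items`
    loaded examples are kept in memory; the least recently used are evicted.
    """

    @lru_cache(maxsize=max_cached_items)
    def cached_load(path):
        # Only runs on a cache miss; hits return the stored result.
        return load_fn(path)

    return cached_load
```

A larger `max_cached_items` means more memory used and fewer reads from disk. Cached results are shared between calls, so they should not be modified in place by later processing steps.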
See here for the new sampler