milesial / Pytorch-UNet

PyTorch implementation of the U-Net for image semantic segmentation with high quality images

treat "glob" as an expensive operation ( Pytorch-UNet/utils/data_loading.py )

sb22bs opened this issue

On our setup we see the program doing directory-entry lookups all the time,
and I suspect this happens in data_loading.py.

Depending on the filesystem, this can be a very expensive operation
(on network filesystems new files might appear at any point in time, so a listing may have to be fetched again each time it is requested).

So I would suggest:
Do the lookup once (when the program starts), and then just walk through the collected list of entries (which can
all be done in user space, with no context switches needed).

At least when the files are on a network filesystem, performance might benefit dramatically from this change; local filesystems might benefit from it as well.
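
To make the suggestion concrete, here is a minimal sketch of the "look up once, then walk the list" idea. It is not the current code in data_loading.py; the class name `IndexedImageFolder`, its parameters, and the image/mask pairing rule are hypothetical and only illustrate the approach.

```python
import os
from pathlib import Path


def scan_once(directory):
    """List a directory a single time and return {file stem: Path}.

    Hypothetical helper. Files sharing a stem (e.g. a.jpg and a.png)
    would overwrite each other here; a real loader must decide how to
    handle that case.
    """
    with os.scandir(directory) as entries:
        return {
            Path(entry.name).stem: Path(entry.path)
            for entry in entries
            if entry.is_file() and not entry.name.startswith(".")
        }


class IndexedImageFolder:
    """Hypothetical dataset-like class that never touches the directory
    listing again after construction."""

    def __init__(self, images_dir, masks_dir, mask_suffix=""):
        self.mask_suffix = mask_suffix
        # The only two directory scans: one per folder, at start-up.
        self.images = scan_once(images_dir)
        self.masks = scan_once(masks_dir)
        # Keep only the ids that have both an image and a mask.
        self.ids = sorted(
            stem for stem in self.images if stem + mask_suffix in self.masks
        )

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, idx):
        # Pure in-memory dictionary lookups; no glob and no listdir here.
        name = self.ids[idx]
        return self.images[name], self.masks[name + self.mask_suffix]
```

With something like this, `IndexedImageFolder("data/imgs", "data/masks", mask_suffix="_mask")` (paths are placeholders) would scan each folder once, and every later `ds[i]` would only return the cached paths. The trade-off is that files added after start-up are not picked up until the index is rebuilt.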