Documentation: writing custom samplers compatible with multi GPU training
fteufel opened this issue · comments
📚 Documentation
Hi,
I'm trying to run distributed training with a custom sampler for the first time. The idea is rather simple (a fixed budget for each class) and works fine on a single GPU. When moving to multi-GPU, I unsurprisingly get an error message telling me that I should subclass `BatchSampler`:
```
TypeError: Lightning can't inject a (distributed) sampler into your batch sampler, because it doesn't subclass PyTorch's `BatchSampler`. To mitigate this, either follow the API of `BatchSampler` or set `Trainer(use_distributed_sampler=False)`. If you choose the latter, you will be responsible for handling the distributed sampling within your batch sampler.
```
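For context, a minimal sketch of what I understand "follow the API of `BatchSampler`" to mean: a subclass that keeps the `(sampler, batch_size, drop_last)` constructor, so the inner single-sample `sampler` attribute remains visible and swappable. The class name here is my own; this is my reading of the error, not confirmed behavior.

```python
from torch.utils.data import BatchSampler

class MyBatchSampler(BatchSampler):
    """Sketch of a batch sampler that keeps BatchSampler's constructor
    signature (sampler, batch_size, drop_last), so the inner
    single-sample sampler stays a plain attribute."""

    def __init__(self, sampler, batch_size, drop_last):
        super().__init__(sampler, batch_size, drop_last)

    def __iter__(self):
        # Standard batching: fill a list from the inner sampler,
        # yield it once it reaches batch_size.
        batch = []
        for idx in self.sampler:
            batch.append(idx)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        if batch and not self.drop_last:
            yield batch
```

Any custom batching logic would go into `__iter__`, while the constructor stays untouched.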
It is my understanding that torch's `BatchSampler` takes one (single-sample) `Sampler` and samples from it repeatedly to fill up the batch size. Are there any guidelines for how samplers should be built to be compatible with the sampler injection? I can't seem to find them in the docs.
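To make my use case concrete, here is a sketch of the fixed-budget-per-class idea written as a plain single-sample `Sampler` (the class name and `labels`/`budget` parameters are mine, for illustration only); my question is essentially whether something in this shape survives the distributed-sampler injection.

```python
import random
from collections import defaultdict
from torch.utils.data import Sampler

class FixedBudgetSampler(Sampler):
    """Yields at most `budget` dataset indices per class each epoch.

    Sketch only: `labels` maps dataset index -> class label. Written as
    a single-sample Sampler so the default BatchSampler can wrap it.
    """

    def __init__(self, labels, budget):
        self.labels = labels
        self.budget = budget

    def __iter__(self):
        # Group dataset indices by class label.
        by_class = defaultdict(list)
        for idx, label in enumerate(self.labels):
            by_class[label].append(idx)
        # Take up to `budget` random indices from each class.
        chosen = []
        for indices in by_class.values():
            random.shuffle(indices)
            chosen.extend(indices[: self.budget])
        random.shuffle(chosen)
        return iter(chosen)

    def __len__(self):
        counts = defaultdict(int)
        for label in self.labels:
            counts[label] += 1
        return sum(min(c, self.budget) for c in counts.values())
```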
cc @Borda