ufoym / imbalanced-dataset-sampler

A (PyTorch) imbalanced dataset sampler for oversampling low-frequency classes and undersampling high-frequency ones.
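For context on what oversampling and undersampling mean here, below is a minimal sketch of the general idea using only PyTorch built-ins (`WeightedRandomSampler`), not the repo's own class; the 900/100 toy dataset is hypothetical:

```python
import torch
from collections import Counter
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical imbalanced toy dataset: 900 samples of class 0, 100 of class 1.
labels = torch.cat([torch.zeros(900, dtype=torch.long), torch.ones(100, dtype=torch.long)])
dataset = TensorDataset(torch.randn(1000, 8), labels)

# Weight every sample by the inverse frequency of its class,
# then draw len(dataset) indices with replacement.
class_counts = torch.bincount(labels)        # tensor([900, 100])
sample_weights = 1.0 / class_counts[labels]  # minority samples get 9x the weight
sampler = WeightedRandomSampler(sample_weights, num_samples=len(dataset), replacement=True)

loader = DataLoader(dataset, sampler=sampler, batch_size=64)
epoch_labels = torch.cat([y for _, y in loader])
print(Counter(epoch_labels.tolist()))        # roughly {0: 500, 1: 500}
```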


Could you explain your way of sampling in detail?

marsggbo opened this issue · comments

Thanks so much for your implementation. But I have several questions:

  1. In the picture below, it seems that the class with fewer samples is sampled repeatedly, while the class with more samples is sub-sampled. So what is the difference between your method and the traditional resampling method?

[image]

  2. In each epoch, is each image sampled only once? I ask because you mentioned that your method avoids creating a new balanced dataset.

I have the same question. Are you actually doing over-/under-sampling, or just re-weighting the samples during training? Is there a paper I can refer to?

Looks like this project is abandoned

commented

According to

`class ImbalancedDatasetSampler(torch.utils.data.sampler.Sampler):`

it selects samples according to weights derived from the imbalanced class distribution, and it draws them with replacement.
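If that description is accurate, the core mechanism can be sketched in a few lines. This is a hypothetical minimal re-implementation of the idea, not the repo's actual code:

```python
import torch
from torch.utils.data import Sampler

class InverseFrequencySampler(Sampler):
    """Sketch of the idea described above: weight each index by the inverse
    count of its class and draw len(dataset) indices *with replacement*,
    so rare classes repeat and common classes are subsampled per epoch."""

    def __init__(self, labels):
        self.labels = torch.as_tensor(labels, dtype=torch.long)
        class_counts = torch.bincount(self.labels)
        self.weights = (1.0 / class_counts[self.labels]).double()
        self.num_samples = len(self.labels)

    def __iter__(self):
        # multinomial with replacement=True is the "with replacement" part
        idx = torch.multinomial(self.weights, self.num_samples, replacement=True)
        return iter(idx.tolist())

    def __len__(self):
        return self.num_samples
```

Used as `DataLoader(dataset, sampler=InverseFrequencySampler(labels), batch_size=...)`, each epoch then yields an approximately uniform class mix without materializing a new balanced dataset.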

commented

The labels within each individual batch may be imbalanced, but the distribution of labels over one full epoch is balanced.
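That claim is easy to check empirically. Below is a quick sketch, again using PyTorch's built-in `WeightedRandomSampler` as a stand-in and a hypothetical 900/100 toy dataset: individual batches fluctuate around a 50/50 split rather than matching it exactly, while the epoch-level counts come out close to even.

```python
import torch
from collections import Counter
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

labels = torch.cat([torch.zeros(900, dtype=torch.long), torch.ones(100, dtype=torch.long)])
dataset = TensorDataset(torch.randn(1000, 4), labels)

weights = 1.0 / torch.bincount(labels)[labels]
sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)
loader = DataLoader(dataset, sampler=sampler, batch_size=32)

epoch_counts = Counter()
for i, (_, y) in enumerate(loader):
    batch_counts = Counter(y.tolist())
    if i < 3:
        # individual batches wander around 16/16 instead of hitting it exactly
        print(f"batch {i}: {dict(batch_counts)}")
    epoch_counts.update(batch_counts)

print(f"epoch: {dict(epoch_counts)}")  # close to {0: 500, 1: 500} over the full epoch
```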