ufoym / imbalanced-dataset-sampler

A (PyTorch) imbalanced dataset sampler for oversampling low-frequency classes and undersampling high-frequency ones.

better description of "epoch"

apple2373 opened this issue

Thanks for the good implementation! I was a bit confused about the description so I'd like to comment.

Then in each epoch, the loader will sample the entire dataset and weigh your samples inversely to your class appearing probability.

But this is no longer what we normally call an epoch, right? We do not iterate over all data points in an epoch, because some data points that belong to majority classes are never used. Technically, for each "epoch" as defined by PyTorch, the loader draws as many samples as there are data points in the original dataset, and each sample is picked with probability inversely proportional to its class frequency.
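
To make the point concrete, here is a minimal sketch in plain Python that simulates one such "epoch". It is not the library's code: the function name `simulate_epoch`, the toy 900/100 label split, and the assumption that samples are drawn with replacement are all illustrative choices of mine.

```python
import random
from collections import Counter


def simulate_epoch(labels, seed=0):
    """Draw len(labels) indices, weighting each index by 1 / (its class count).

    Hypothetical helper: assumes sampling with replacement, which the
    issue text does not state explicitly.
    """
    rng = random.Random(seed)
    class_counts = Counter(labels)
    # Each sample's weight is inversely proportional to its class frequency.
    weights = [1.0 / class_counts[y] for y in labels]
    n = len(labels)
    return rng.choices(range(n), weights=weights, k=n)


# Toy imbalanced dataset: 900 majority-class samples, 100 minority-class samples.
labels = [0] * 900 + [1] * 100
drawn = simulate_epoch(labels)

# Class balance among the drawn samples (roughly even in expectation).
drawn_class_counts = Counter(labels[i] for i in drawn)

# How many distinct majority-class points were actually visited.
unique_majority = len({i for i in drawn if labels[i] == 0})

print("drawn per class:", dict(drawn_class_counts))
print("distinct majority points seen:", unique_majority, "of 900")
```

Under these assumptions the number of draws equals the dataset size, yet a large share of the majority-class points is never visited in a given pass, while minority-class points tend to be drawn several times.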