pytorch / examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Home Page:https://pytorch.org/examples

question about drop_last=True on validation mode

DY112 opened this issue · comments

I don't understand why this code uses drop_last=True in validation mode.
Also, it computes the average top-1 and top-5 errors only over the samples that divide evenly by the batch size.
It then creates an auxiliary validation dataset and DataLoader to process the remaining samples and print their logs.

Can anyone tell me why this code uses this method?

Hi @DY112, there are many examples in this repo. Could you share a pointer to the code of the example you are talking about?

Oh, sorry.
My question was about the ImageNet training code.

val_sampler = torch.utils.data.distributed.DistributedSampler(val_dataset, shuffle=False, drop_last=True)

@DY112 Good question! By default, DistributedSampler pads the dataset with duplicated samples so that it divides evenly across processes, and counting those duplicates skews the validation metrics. To get correct metrics, we can either 1) run validation on a single GPU (slow, though), or 2) use DistributedSampler with drop_last=True for the evenly divisible part of the dataset, then use an auxiliary dataset with a regular DataLoader for the remaining samples.
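
To make the padding issue concrete, here is a minimal pure-Python sketch of the index arithmetic, assuming shuffle=False and a dataset at least as large as the number of processes. The helper name `shard_indices` and the sizes (10 samples, 4 processes) are hypothetical and chosen only for illustration; this mimics the sampler's documented behavior rather than reproducing the library's code.

```python
import math


def shard_indices(dataset_len, num_replicas, rank, drop_last):
    """Hypothetical helper: indices one rank would see with shuffle=False.

    Assumes dataset_len >= num_replicas.
    """
    indices = list(range(dataset_len))
    if drop_last:
        # Keep only the part that divides evenly across ranks.
        num_samples = dataset_len // num_replicas
        total = num_samples * num_replicas
        indices = indices[:total]
    else:
        # Pad with samples from the start so every rank gets the same count.
        num_samples = math.ceil(dataset_len / num_replicas)
        total = num_samples * num_replicas
        indices += indices[: total - dataset_len]
    # Each rank takes every num_replicas-th index, starting at its own rank.
    return indices[rank:total:num_replicas]


dataset_len, world_size = 10, 4  # illustrative sizes only

padded = [i for r in range(world_size)
          for i in shard_indices(dataset_len, world_size, r, drop_last=False)]
dropped = [i for r in range(world_size)
           for i in shard_indices(dataset_len, world_size, r, drop_last=True)]
# Samples that no rank evaluates when drop_last=True.
leftover = sorted(set(range(dataset_len)) - set(dropped))

print(len(padded), sorted(padded))   # 12 evaluations for 10 samples
print(len(dropped), leftover)        # 8 evaluations, samples 8 and 9 left over
```

Without drop_last, samples 0 and 1 are evaluated twice, so an average over all ranks weights them double. With drop_last=True every sample is seen at most once, and the leftover samples are exactly what the auxiliary dataset and regular DataLoader need to cover.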

Read more in #980

Thank you for your kind and detailed explanation, @hudeven!