Cifar experiment setting

Question

anthonytmh opened this issue 4 years ago · comments

anthonytmh commented 4 years ago

Thank you for sharing your interesting work. Would you mind clarifying some of the CIFAR experiment settings?

Are the CIFAR results in the paper initialised from an ImageNet-pretrained model? If so, how long did you train the ImageNet-pretrained model?
Do we need to fix the random seed when generating the CIFAR-LT dataset, since it uses np.random.shuffle()? If not, the dataset might not be exactly the same across runs.

Kaihua Tang · Answer 1 · Wed Sep 30 2020 12:02:16 GMT+0800 (China Standard Time)

Yes, it needs pretrained initialization, since I found that BBN also requires a pretrained model. It takes about 1 day with 8 P100 GPUs.
I adopted the CIFAR-LT generator from the previous SOTA, BBN, and they don't fix the random seed. But based on my experiments, as long as the imbalance ratio is fixed, picking different random samples won't cause significant changes to the results.
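
For readers who want a reproducible subset, here is a minimal sketch of seeded long-tailed sampling. It is not the BBN generator itself: the function names, the toy labels, and the exponential count profile are assumptions made for illustration, and a local `np.random.RandomState` stands in for the global `np.random.shuffle()` call mentioned in the question.

```python
import numpy as np

def long_tail_counts(num_classes, max_per_class, imbalance_ratio):
    """Per-class counts decaying exponentially from head to tail.

    imbalance_ratio is tail/head (e.g. 0.01); this profile is an assumption.
    """
    return [
        int(max_per_class * imbalance_ratio ** (c / (num_classes - 1)))
        for c in range(num_classes)
    ]

def sample_long_tail_indices(labels, counts, seed):
    """Pick counts[c] indices of each class c, reproducibly for a given seed."""
    rng = np.random.RandomState(seed)  # local generator; global state untouched
    labels = np.asarray(labels)
    chosen = []
    for c, n in enumerate(counts):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)               # in-place shuffle driven by the seed
        chosen.extend(idx[:n].tolist())
    return chosen

# Toy stand-in for dataset labels: 10 classes, 500 samples each.
labels = np.repeat(np.arange(10), 500)
counts = long_tail_counts(10, 500, 0.01)

a = sample_long_tail_indices(labels, counts, seed=0)
b = sample_long_tail_indices(labels, counts, seed=0)
c = sample_long_tail_indices(labels, counts, seed=1)
```

With the same seed the selected indices are identical (`a == b`); a different seed keeps the per-class counts but draws different samples, which is the source of run-to-run variation discussed in this thread.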

ZML · Answer 2 · Sat Oct 03 2020 11:12:32 GMT+0800 (China Standard Time)

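@KaihuaTang Hello, I have read the BBN code carefully, and it does not use a pretrained model (at least not for CIFAR), so using a pretrained model is somewhat unfair. Second, the random sampling seed cannot simply be fixed: training sets obtained from different samplings differ by at least 1.5 points in validation accuracy (the difference comes mainly from how representative the selected few-shot samples are and how hard they are to train on). Running the experiment a few times shows this. In my view, the sound protocol is to run multiple times and report the mean.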

Kaihua Tang · Answer 3 · Sat Oct 03 2020 11:22:45 GMT+0800 (China Standard Time)

By default, it loads a pretrained model.

Kaihua Tang · Answer 4 · Sat Oct 03 2020 11:24:48 GMT+0800 (China Standard Time)

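On running multiple times: this CIFAR experiment was originally added during the rebuttal, and there was no time for repeated runs then. I tested it several times afterwards; the results were basically stable and did not increase by much.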

ZML · Answer 5 · Sat Oct 03 2020 11:40:42 GMT+0800 (China Standard Time)

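Hello! You have probably not read the BBN code closely. Take a look at the corresponding yaml file and default.py. The earlier LDAM and effective-sample methods do not use a pretrained model either.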

ZML · Answer 6 · Sat Oct 03 2020 11:43:47 GMT+0800 (China Standard Time)

The choice of training samples will inevitably lead to differences in performance.

Kaihua Tang · Answer 7 · Sat Oct 03 2020 11:47:56 GMT+0800 (China Standard Time)

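I'll go and check. The experiment was added in a hurry during the rebuttal, so I copied the backbone and dataloader directly without reading the full code closely. If there is a problem, I will post updated results without pretraining. As for the training samples, I will also update the results with the mean over multiple runs. Thanks for pointing this out!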

ZML · Answer 8 · Sat Oct 03 2020 12:18:27 GMT+0800 (China Standard Time)

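Hello, I have another question, about the ImageNet-LT dataset. Your code is based on the Decoupling paper, so why does the baseline itself differ (44.4% in the original paper, 45.0% in yours)? What change produced the gain? Could this improvement be part of the reason your proposed method improves?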

Kaihua Tang · Answer 9 · Sat Oct 03 2020 13:05:53 GMT+0800 (China Standard Time)

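The only difference is that all my baseline classifiers use nn.Linear with bias=False, whereas the original baseline uses bias=True; see DotProductClassifier.py. With bias=True the baseline would be slightly lower and my relative improvement would look larger, but since this is long-tailed classification, bias=False is the appropriate baseline anyway.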
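
To make the bias=True versus bias=False difference concrete, here is a small NumPy sketch of a dot-product classifier. It is not the code in DotProductClassifier.py; `linear_logits` and the bias values are hypothetical. It only illustrates that a bias term is a per-class offset independent of the input. Reading that offset as something that can favour frequent classes on long-tailed data is an interpretation added here, not a claim made in this thread.

```python
import numpy as np

def linear_logits(features, weight, bias=None):
    """Logits of a dot-product classifier; bias=None mirrors bias=False."""
    logits = features @ weight.T
    if bias is not None:
        logits = logits + bias  # same offset added for every input
    return logits

rng = np.random.RandomState(0)
weight = rng.randn(3, 4)            # 3 classes, 4-dimensional features
bias = np.array([2.0, 0.0, -2.0])   # hypothetical values favouring class 0
zero_feature = np.zeros((1, 4))     # an input carrying no evidence at all

with_bias = linear_logits(zero_feature, weight, bias)
without_bias = linear_logits(zero_feature, weight)
```

For the all-zero input, `with_bias` equals the bias vector, so class 0 is preferred without any evidence, while `without_bias` is all zeros and no class is preferred.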

Kaihua Tang · Answer 10 · Sat Oct 03 2020 13:09:43 GMT+0800 (China Standard Time)

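I did not change any hyperparameters of the classification optimizer or scheduler. (Apart from some dead code left over from failed ideas that I did not fully clean up, I only changed the parts described in the paper.) So the improvement really does not come from any tuning or tricks.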

Kaihua Tang · Answer 11 · Sat Oct 03 2020 20:07:31 GMT+0800 (China Standard Time)

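After carefully comparing against BBN's config files, I removed the pretraining and aligned settings such as the scheduler and num_epoch with BBN. The results did drop slightly but still exceed BBN and other previous methods. The latest results table and the related code have been updated; this time I ran the experiment twice and took the mean. I have added an acknowledgement to you in the latest documentation and will use these fairer results in the final version of the paper.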
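
The run-several-times-and-average protocol can be sketched as below. `run_experiment` is a hypothetical placeholder that returns a made-up number rather than training anything; in practice it would regenerate the long-tailed subset with the given seed, train, and return the validation accuracy. The seed list is also an assumption.

```python
import random
import statistics

def run_experiment(seed):
    """Hypothetical stand-in for one full training run.

    Returns a fake accuracy (%) for illustration only; it is not a result.
    """
    rng = random.Random(seed)
    return 50.0 + rng.uniform(-1.0, 1.0)

def summarize(seeds):
    """Run once per seed and return the mean, sample std, and raw accuracies."""
    accs = [run_experiment(s) for s in seeds]
    return statistics.mean(accs), statistics.stdev(accs), accs

mean_acc, std_acc, accs = summarize([0, 1, 2, 3, 4])
print(f"{mean_acc:.2f} +/- {std_acc:.2f} over {len(accs)} runs")
```

Reporting the standard deviation alongside the mean makes it easier to judge whether a gap between two methods is larger than the sampling variation raised earlier in the thread.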

ZML · Answer 12 · Sun Oct 04 2020 16:14:52 GMT+0800 (China Standard Time)

Thank you. The idea behind the paper is quite good.

tzhxs · Answer 13 · Sun Oct 18 2020 15:44:22 GMT+0800 (China Standard Time)

Hello, did the ImageNet-LT experiments use a pretrained model?

Kaihua Tang · Answer 14 · Sun Oct 18 2020 16:14:53 GMT+0800 (China Standard Time)

ImageNet-LT did not use pretraining.

tzhxs · Answer 15 · Sun Oct 18 2020 16:16:00 GMT+0800 (China Standard Time)

OK, thank you very much.