sungnyun / openssl-simcore

(CVPR 2023) Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning

Dataset split in SSL and evaluation

caiocj1 opened this issue · comments

Hello, thanks for the great work!

Just wanted to check something:

  1. In the appendix it's said that "we incorporate the train and validation set of the original benchmarks [...] to enlarge the number of training samples".
  2. In the paper itself, Table 2 shows the size of the training set.

So when doing SSL and then evaluating only on Aircraft, for example (the 46.56% value), do you use all 10k samples in SSL, and does Table 2 report the accuracy on the training set after training the linear classifier? Thanks in advance!

Hi @caiocj1.
Thanks for your interest in our paper!

Only when a fine-grained dataset has three splits (train, val, test) do we combine train + val into a single train set.
We run SSL on the train set (for SimCore: train_X + Coreset) and then evaluate on the test set (or on the val set if the fine-grained dataset has only two splits: train and val).

For example, Aircraft has three splits, so we merge train and val, which gives train+val: 6,667 and test: 3,333.
We run SSL pretraining on the 6,667 samples and train a linear classifier on the same 6,667 samples.
The trained classifier is then evaluated on the 3,333 test samples.
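
As a rough sketch of the split rule described above (this is not code from the repository; the function name `build_splits` and the placeholder sample lists are assumptions for illustration, and only the merged sizes 6,667 and 3,333 come from this thread):

```python
def build_splits(splits):
    """Return (train_set, eval_set) from a dict of split name -> list of samples.

    Hypothetical helper: three-split datasets merge train + val for training
    and hold out test; two-split datasets train on train and evaluate on val.
    """
    if {"train", "val", "test"} <= splits.keys():
        # Three splits: SSL pretraining and the linear classifier both use train + val.
        return splits["train"] + splits["val"], splits["test"]
    # Two splits: val acts as the held-out evaluation set.
    return splits["train"], splits["val"]


# Placeholder indices standing in for Aircraft images. How the 6,667 samples
# divide between train and val here is an assumption made for the example.
aircraft = {
    "train": list(range(0, 3334)),
    "val": list(range(3334, 6667)),
    "test": list(range(6667, 10000)),
}

train_set, eval_set = build_splits(aircraft)
print(len(train_set), len(eval_set))  # 6667 3333
```

The point of the sketch is that no test sample ever enters `train_set`, so the reported accuracy is measured on held-out data rather than on the training set.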

Feel free to ask if you have more questions on our work!
Thanks.

Feel free to ask if you need anything else :)

If you have any further questions, please reopen the issue or open a new one.
Thanks again for your interest!