Puzzles about the number of train data
GumpW opened this issue · comments
Thanks for sharing the code, I have learned a lot from it. But I am puzzled about the train data.
for i in range(n_way*num_shots):
    batches_xi.append(np.zeros((batch_size, self.input_channels, self.size[0], self.size[1]), dtype='float32'))
    labels_yi.append(np.zeros((batch_size, n_way), dtype='float32'))
    oracles_yi.append(np.zeros((batch_size, n_way), dtype='float32'))
In the code above, when we increase the batch_size, does the total amount of training data also increase? The number of iterations is set before training, and the shape of batches_xi is (n_way*num_shots, batch_size, input_channels, size[0], size[1]).
Hi,
I will give two answers since it depends on what it is understood as the total amount of training data.
In a N-way k-shot setting, each episode is created by sampling from a larger dataset. For example, mini-imagenet has 64 training classes with 600 samples per class. In a 5-way 5-shot setting we will randomly sample 5 classes from the 64 training classes and 5 samples for each one of these classes.
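To make the episode construction concrete, here is a minimal sketch of N-way k-shot sampling. The function name and the toy dataset are illustrative stand-ins, not the repository's actual code; the real loader works on mini-imagenet images rather than strings.

```python
import random

def sample_episode(dataset, n_way=5, num_shots=5):
    """Sample one N-way k-shot episode.

    dataset: dict mapping class label -> list of samples
             (e.g. mini-imagenet's training split: 64 classes, 600 samples each).
    """
    # Randomly pick n_way of the available classes...
    classes = random.sample(sorted(dataset.keys()), n_way)
    # ...then num_shots samples from each chosen class.
    return {c: random.sample(dataset[c], num_shots) for c in classes}

# Toy stand-in for the 64-class / 600-samples-per-class training partition.
toy = {f"class_{i}": [f"img_{i}_{j}" for j in range(600)] for i in range(64)}
episode = sample_episode(toy)
print(len(episode))                                  # 5 classes
print(all(len(v) == 5 for v in episode.values()))    # 5 shots per class
```

Each call draws a fresh episode, so repeated sampling can reuse the same underlying images many times without ever growing the training partition itself.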
Therefore, if we consider the training partition of mini-imagenet (64 classes, 600 samples per class) as the training data, we are not increasing the training data when increasing the batch_size.
On the other hand, you may consider the number of episodes as the training data. The number of training episodes is num_iterations*batch_size, hence, given a fixed number of iterations, the number of episodes will increase when increasing the batch_size. If you want to increase the batch_size without increasing the number of episodes, you can reduce the number of iterations by the same factor.
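The trade-off above is just arithmetic; a quick sketch (the specific numbers are illustrative, not the repository's defaults):

```python
# Total training episodes seen = num_iterations * batch_size.
num_iterations = 1000
batch_size = 10
episodes = num_iterations * batch_size     # 10000 episodes

# Doubling batch_size doubles the episode count...
assert 1000 * 20 == 2 * episodes

# ...unless iterations are halved by the same factor, keeping episodes fixed.
assert 500 * 20 == episodes

print(episodes)  # 10000
```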
Thank you for your answer.
So in a few-shot learning task, "few-shot" means that the training data in each iteration (episode) is small, not that the training partition of the dataset is small. Am I right?
Exactly