princeton-nlp / LM-BFF

[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723


Training sample template conversion

v1nc3nt27 opened this issue · comments

Hello,

first of all, thanks for your work and for publishing the code. I'm curious about one thing:
If I understand your paper correctly, you use the prompts with demonstrations during training as well, i.e., I expected the training samples to also be augmented with demonstrations. However, I cannot find this in the code. In dataset.py you write "If it is not training, we pre-process the data; otherwise, we process the data online." Perhaps I misinterpret this, but then I'd wonder why the training samples would even get the context_indices added with the respective training-sample id left out.

Could you please point me to the location in the code where you build the templates for the training samples as you do for the dev and test samples? Thanks a lot!

Hi,

For demonstrations, we also "pre-process" the context indices for training (the candidate demonstrations for each input). We just don't decide beforehand which exact demonstrations to use, so that there is randomness during training.
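To make the distinction concrete, here is a minimal sketch of the idea being described, not the repository's actual code (function names and structure are assumptions): the candidate demonstration pool for each training example is pre-computed (excluding the example itself), while the concrete demonstration is only drawn at training time.

```python
import random

def build_context_indices(train_examples):
    """For each example i, the candidates are all other training indices
    (the example itself is left out, as mentioned in the question above)."""
    all_indices = list(range(len(train_examples)))
    return {i: [j for j in all_indices if j != i] for i in all_indices}

def sample_demonstrations(context_indices, i, num_demos=1, seed=None):
    """Pick demonstrations for example i on the fly, e.g. once per training step,
    so different steps can pair the same example with different demonstrations."""
    rng = random.Random(seed)
    return rng.sample(context_indices[i], k=num_demos)
```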

Hey,

thanks for your response. I understand, and this makes sense. However, I cannot find it in the code; it should be somewhere in the trainer, right? To me it looks like the training samples are used without demonstrations. Could you please point me to the place in the code where the training samples are extended with the demonstration template? That would help me a lot. Thanks!

EDIT: I have found it in the FewShotDataset class in dataset.py. Thanks!
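For anyone landing here with the same question, the following is a minimal sketch of the "process the data online" behavior (class and argument names are assumptions, not the actual FewShotDataset implementation): during training, demonstrations are drawn fresh inside __getitem__, so each epoch can pair an example with different demonstrations, while for dev/test the choice would be fixed up front.

```python
import random
from torch.utils.data import Dataset

class DemoSamplingDataset(Dataset):
    def __init__(self, examples, context_indices, build_prompt, training=True):
        self.examples = examples                # raw few-shot examples
        self.context_indices = context_indices  # candidate demo ids per example
        self.build_prompt = build_prompt        # fn(example, demos) -> model input
        self.training = training

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        if self.training:
            # Online: a new demonstration is sampled each time the item is fetched.
            demo_id = random.choice(self.context_indices[i])
        else:
            # Offline: always use the same pre-selected candidate.
            demo_id = self.context_indices[i][0]
        return self.build_prompt(self.examples[i], [self.examples[demo_id]])
```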