Possible to access other examples when prompting?
rbawden opened this issue
Hi there!
I would like to know whether it is possible, when prompting, to access the other examples in the dataset, for example through something like a super() object that could then be indexed to reach them.
The use case: I would like to construct a few-shot example directly in promptsource using a random example from elsewhere in the dataset. However, I would like the random example to use different attributes from the current example. This differs from the current way few-shot examples are created in eval-harness.
Illustration:
I am looking at the gsarti/flores-101 dataset, for which each example is multi-parallel, with one attribute per language, e.g.
{
  "id": "int32",
  "sentence_afr": "string",
  "sentence_amh": "string",
  "sentence_ara": "string",
  "sentence_eng": "string",
  ...
}
I would like to construct examples such that:
Arabic: {{ sentence_ara }} = English: ||| {{ sentence_eng }}
is the main template for the example, but the example used first (as 1-shot) is as follows:
French: {{ sentence_fra }} = English: ||| {{ sentence_amh }}
but where sentence_fra and sentence_amh in the second instance come from a different example.
Is this possible, or is it something for eval-harness?
Hi @rbawden,
Unfortunately, there is no explicit support for the case you are describing.
I think the easiest thing you can do is separately prompt the dataset and glue the instances together yourself. Something along the lines of:
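# Apply the prompt once to the pool that the shots are drawn from.
prompted_shot_dataset = dataset_to_select_shots_from.map(prompt)
for example in dataset_to_eval:
    indexes = ...  # list of indices of the shots to prepend
    # Indexing with a list returns a dict of columns, so this is a list of strings.
    shots = prompted_shot_dataset[indexes]["input"]
    # Join the shots and the current input with the same separator.
    few_shot_input = " ".join(shots + [example["input"]])
    target = example["target"]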
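For what it's worth, here is a slightly fuller sketch of the same "prompt separately, then glue" idea in plain Python, with no promptsource or datasets dependency. Everything in it is an assumption made for illustration: the toy records stand in for gsarti/flores-101 rows, and `LANG_NAMES`, `render` and `build_one_shot` are hypothetical helpers, not part of any library. It also assumes the language label printed in the prompt should match the attribute being used.

```python
import random

# Toy stand-in for multi-parallel records (placeholder strings, not real data).
records = [
    {"id": i, "sentence_ara": f"ara-{i}", "sentence_eng": f"eng-{i}",
     "sentence_fra": f"fra-{i}", "sentence_amh": f"amh-{i}"}
    for i in range(3)
]

# Hypothetical mapping from attribute suffix to the label shown in the prompt.
LANG_NAMES = {"ara": "Arabic", "eng": "English", "fra": "French", "amh": "Amharic"}


def render(record, src, tgt, with_target):
    """Render '<Src>: <sentence> = <Tgt>:' and optionally append the target sentence."""
    text = f"{LANG_NAMES[src]}: {record['sentence_' + src]} = {LANG_NAMES[tgt]}:"
    if with_target:
        text += " " + record["sentence_" + tgt]
    return text


def build_one_shot(records, index, shot_pair, main_pair, rng):
    """Prepend one shot, drawn from a different record, to the prompt for records[index]."""
    candidates = [i for i in range(len(records)) if i != index]
    shot = records[rng.choice(candidates)]
    # The shot is shown with its answer; the main example is left open.
    prompt = "\n".join([
        render(shot, *shot_pair, with_target=True),
        render(records[index], *main_pair, with_target=False),
    ])
    target = records[index]["sentence_" + main_pair[1]]
    return prompt, target


rng = random.Random(0)  # seeded so the chosen shot is reproducible
prompt, target = build_one_shot(records, 0, ("fra", "amh"), ("ara", "eng"), rng)
print(prompt)
print(target)
```

The shot's language pair is passed separately from the main example's pair, which is the part a single per-example template cannot express, since a template only ever sees the fields of the current example.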
I haven't touched eval-harness in a while, so I won't be able to advise on that side.
Ok, thanks @VictorSanh, I'll look into this!