bigscience-workshop / promptsource

Toolkit for creating, sharing and using natural language prompts.


How to get answer choices for the templates without fixed answer choices?

AkshitaB opened this issue:

What is the right way to get the correct answer choices when they depend on the example instance? I've tried using template.get_answer_choices_list(), but it returns [''].

Code snippet to reproduce:

import datasets
from promptsource.templates import DatasetTemplates

hs_prompts = DatasetTemplates('hellaswag')
tmpl1 = hs_prompts['complete_first_then']
# load the huggingface dataset:
dataset = datasets.load_from_disk("hellaswag_complete_first_then_score_eval")  # downloaded to disk in my case
tmpl1.get_answer_choices_list(dataset['train'][0])

The dataset instance here is:

...
'inputs_pretokenized': 'Complete the description with an appropriate ending:\nFirst, then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles. Then, then ...\n\n(a) , the man adds wax to the windshield and cuts it.\n\n(b) , a person board a ski lift, while two men supporting the head of the person wearing winter clothes snow as the we girls sled.\n\n(c) , the man puts on a christmas coat, knitted with netting.\n\n(d) , the man continues removing the snow on his car.\n',
 'is_correct': False,
 'targets': [3, 6, 8, 388, 617, 7, 11935, 12, 8, 27988, 11, 8620, 34, 5, 1],
 'targets_pretokenized': ', the man adds wax to the windshield and cuts it.',
 'weight': 1.0}

I would have expected to get the four answer choices from the example's input string, but perhaps I'm misunderstanding what input the function expects?

Note: I know I can get the answer choices from the dataset itself. I specifically want to know whether it's possible to get them from the template object.
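
To be concrete about what I'm poking at, here's a small sketch. I'm assuming the Template object exposes the underlying answer-choices Jinja expression via get_answer_choices_expr() / get_fixed_answer_choices_list() (names from my reading of promptsource.templates, so they may differ by version):

from promptsource.templates import DatasetTemplates

hs_prompts = DatasetTemplates('hellaswag')
tmpl1 = hs_prompts['complete_first_then']

# For templates like this one, the choices are not a fixed list but a Jinja
# expression rendered against each example's fields.
print(tmpl1.get_answer_choices_expr())        # the raw Jinja expression (or None)
print(tmpl1.get_fixed_answer_choices_list())  # None when choices depend on the example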

Hi @AkshitaB, thanks for your question. That's what I would expect too.

I was not able to run your code snippet because I don't have a dataset called hellaswag_complete_first_then_score_eval (that looks like something cached to disk for evaluation; can you say more about where it came from and what you're trying to do?).

But if I run

import datasets
from promptsource.templates import DatasetTemplates

hs_prompts = DatasetTemplates('hellaswag')
tmpl1 = hs_prompts['complete_first_then']
# load the huggingface dataset:
dataset = datasets.load_dataset("hellaswag")
print(tmpl1.get_answer_choices_list(dataset['train'][0]))

I get the output

Using custom data configuration default
Reusing dataset hellaswag (/Users/bach/.cache/huggingface/datasets/hellaswag/default/0.1.0/c8c5bc30147e6345a39bfabe8856829801d4db7beb0271e44021811974fac112)
100%|██████████| 3/3 [00:00<00:00, 449.17it/s]
[', the man adds wax to the windshield and cuts it.', ', a person board a ski lift, while two men supporting the head of the person wearing winter clothes snow as the we girls sled.', ', the man puts on a christmas coat, knitted with netting.', ', the man continues removing the snow on his car.']

Process finished with exit code 0
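
For completeness, the choices line up with the rendered prompt itself. Here's a quick sketch following the usage in the promptsource README (apply() renders the Jinja template into an input/target pair):

example = dataset['train'][0]
result = tmpl1.apply(example)   # [rendered input, rendered target]
print("INPUT:", result[0])
print("TARGET:", result[1])
print("CHOICES:", tmpl1.get_answer_choices_list(example))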

cc @craffel Is this maybe related to needing to do something different for eval? It seems like there's a mismatch between the format of the examples in the hellaswag dataset on the Hugging Face Hub and in hellaswag_complete_first_then_score_eval.
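
My guess at what's happening (just a sketch, not verified against that cached dataset): the template's answer choices come from a Jinja expression over the raw hellaswag fields, so it only yields choices when the example actually carries those fields. Roughly:

raw_example = dataset['train'][0]
print(sorted(raw_example.keys()))   # raw hellaswag fields such as 'ctx', 'endings', 'label'

# The record pasted above instead has seq2seq-style fields
# ('inputs_pretokenized', 'targets', 'targets_pretokenized', 'is_correct',
# 'weight'), none of which the Jinja expression references, so
# get_answer_choices_list has nothing to fill in and renders to [''].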

Closing for now. Feel free to reopen if the above doesn't address your issue.