The helm|piqa task is generative but has generation_size=-1.
yonatano opened this issue · comments
The helm|piqa task listed in tasks_table.jsonl (here: https://github.com/huggingface/lighteval/blob/a98210fd3a2d1e8bface1c32b72ebd5017173a4c/src/lighteval/tasks/tasks_table.jsonl#L797C1-L797C472) has "generation_size":-1 even though "metric":["exact_match"... These two settings are mutually exclusive: exact_match scores generated text, so the task needs a positive generation size.
For example, this command fails for me:
accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
--model_args "pretrained=gpt2" \
--tasks "helm|piqa|0|1" \
--override_batch_size 1 \
--output_dir="./evals/"
with error:
ValueError: `max_new_tokens` must be greater than 0, but is -1.
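In case other entries have the same mismatch, here is a rough sketch of a check that could be run over the task table. It only relies on the "metric" and "generation_size" fields quoted above; the set of metrics treated as generative is my assumption (I only know about exact_match from this entry), and find_inconsistent_tasks is a hypothetical helper, not part of lighteval.

```python
import json

# Assumption: metrics that score generated text and therefore need a
# positive generation_size. Only "exact_match" is known from this issue.
GENERATIVE_METRICS = {"exact_match"}


def find_inconsistent_tasks(jsonl_lines, generative_metrics=GENERATIVE_METRICS):
    """Return (line_number, generative_metrics_found, generation_size) for
    every entry that uses a generative metric without a positive size."""
    problems = []
    for line_no, raw in enumerate(jsonl_lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        task = json.loads(raw)
        found = set(task.get("metric", [])) & set(generative_metrics)
        size = task.get("generation_size")
        # A missing, zero, or negative size cannot be used as max_new_tokens.
        if found and (size is None or size <= 0):
            problems.append((line_no, sorted(found), size))
    return problems
```

Passing an open file handle for src/lighteval/tasks/tasks_table.jsonl as jsonl_lines should work, since the function only iterates over lines; the reported line numbers can then be compared with the #L797 link above.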
Thanks.
Hi! This sounds like an error on our side! If you have the time, could you take a look at the HELM codebase to see which generation size should be used?