huggingface / lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

The helm|piqa task is generative but has generation_size=-1.

yonatano opened this issue

The helm|piqa task listed in tasks_table.jsonl here: https://github.com/huggingface/lighteval/blob/a98210fd3a2d1e8bface1c32b72ebd5017173a4c/src/lighteval/tasks/tasks_table.jsonl#L797C1-L797C472

has "generation_size":-1 even though its "metric" list includes "exact_match". These two settings are mutually exclusive: exact_match is a generative metric, so the task needs a positive generation size.

For example, this command fails for me --

accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
    --model_args "pretrained=gpt2" \
    --tasks "helm|piqa|0|1" \
    --override_batch_size 1 \
    --output_dir="./evals/"

with error:

ValueError: `max_new_tokens` must be greater than 0, but is -1.
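
The ValueError appears to come from transformers' generation-time validation rather than from lighteval itself: the generation_size of -1 ends up being passed along as max_new_tokens, which must be a positive integer. A minimal, illustrative repro outside the harness (using the same gpt2 model as in the command above):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustration only, not lighteval code: a non-positive max_new_tokens
# triggers the same ValueError inside transformers.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The answer is", return_tensors="pt")
model.generate(**inputs, max_new_tokens=-1)
# ValueError: `max_new_tokens` must be greater than 0, but is -1.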

Thanks.

Hi! This sounds like an error on our side! If you have the time, could you take a look at the helm code base to see which generation size should be used?
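
For reference, the fix should just amount to replacing the -1 with whatever maximum completion length HELM specifies for PIQA. The sketch below only shows the shape of the change; the value 5 is a placeholder, not the confirmed number.

# Excerpt of the tasks_table.jsonl row, shown as Python dicts for illustration.
# The real generation_size must be taken from HELM's PIQA run spec.
current_row_excerpt  = {"generation_size": -1}  # conflicts with the generative metric
proposed_row_excerpt = {"generation_size": 5}   # 5 is a placeholder value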