Impact of new EOS token id on HumanEval dataset
amd-1221 opened this issue
If the EOS token id is changed from 2 to 50256, will accuracy on the HumanEval dataset also be affected? If so, what does that mean for the HumanEval accuracy reported in the paper?
For the HumanEval benchmark execution, the tokenizer is instantiated explicitly, so the EOS token id in the model configuration file has no effect on the results. See CodeGen/jaxformer/hf/sample.py, line 84 in c483074.
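To illustrate the point in general terms, here is a minimal, self-contained sketch of why an explicitly supplied stop token makes the config value irrelevant. It is not the code from sample.py: `ToyModelConfig`, `truncate_at`, `postprocess`, and the example token ids are all hypothetical names and values made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyModelConfig:
    # Hypothetical stand-in for a model config file; only the field under discussion.
    eos_token_id: int


def truncate_at(token_ids: List[int], stop_id: int) -> List[int]:
    """Return the tokens that come before the first occurrence of stop_id."""
    if stop_id in token_ids:
        return token_ids[: token_ids.index(stop_id)]
    return token_ids


def postprocess(
    token_ids: List[int],
    config: ToyModelConfig,
    explicit_stop_id: Optional[int] = None,
) -> List[int]:
    """Truncate a generated sequence.

    If the caller supplies a stop id explicitly, the config value is never read.
    """
    stop_id = config.eos_token_id if explicit_stop_id is None else explicit_stop_id
    return truncate_at(token_ids, stop_id)


# Made-up generated sequence that ends with token id 50256.
generated = [10, 11, 12, 50256, 50256]

old_cfg = ToyModelConfig(eos_token_id=2)
new_cfg = ToyModelConfig(eos_token_id=50256)

# With an explicit stop id, both configs give the same completion.
with_old = postprocess(generated, old_cfg, explicit_stop_id=50256)
with_new = postprocess(generated, new_cfg, explicit_stop_id=50256)
print(with_old == with_new)
```

In this sketch the two configs only produce different results when `explicit_stop_id` is omitted, which is the situation the explicit tokenizer setup avoids.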
Thank you for the consideration!