Possible error in Pythia-12B-deduped step 32000
smahdavi4 opened this issue
Hi,
I was running some analysis on the intermediate checkpoints of Pythia-12B-deduped, and the step 32000 checkpoint is a clear outlier: it performs about as well as the final checkpoints. The attached screenshot shows the accuracy on one of the tasks I am evaluating, plotted across training steps. The same behavior shows up across several different tasks I am testing. It looks as though this checkpoint may have been replaced with a different one. Could you please check this? Thanks.
Hi, thanks for raising this issue! I'm not sure what went wrong here, but I've re-uploaded the checkpoint and am testing its performance on LAMBADA now.
Thanks for re-uploading the checkpoint! The re-uploaded one performs reasonably now.
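
For anyone who wants to run a similar sanity check on their own per-checkpoint results, here is a minimal sketch. It is not the analysis used in this thread. It assumes you already have one accuracy value per training step, and it flags any step whose accuracy is far from the median of its neighboring steps. The function name `flag_outlier_steps`, the `window` and `tol` parameters, and the numbers in the example are all hypothetical and purely illustrative.

```python
from statistics import median


def flag_outlier_steps(acc_by_step, window=3, tol=0.05):
    """Return steps whose accuracy deviates from the median accuracy
    of up to `window` neighboring steps on each side by more than `tol`.

    acc_by_step: dict mapping training step -> accuracy (hypothetical input format).
    """
    steps = sorted(acc_by_step)
    flagged = []
    for i, step in enumerate(steps):
        # Collect the neighboring steps, excluding the step under test.
        nearby = steps[max(0, i - window): i + window + 1]
        neighbors = [acc_by_step[s] for s in nearby if s != step]
        if not neighbors:
            continue
        if abs(acc_by_step[step] - median(neighbors)) > tol:
            flagged.append(step)
    return flagged


# Made-up accuracies for illustration only; these are NOT measured results.
example = {28000: 0.40, 30000: 0.41, 32000: 0.60, 34000: 0.42, 36000: 0.43}
flagged = flag_outlier_steps(example)
print(flagged)
```

Using the median rather than the mean keeps a single bad checkpoint from distorting the baseline for the steps around it. The threshold `tol` would need tuning per task.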