bigscience-workshop / promptsource

Toolkit for creating, sharing and using natural language prompts.

Scripts For Evaluating T0 with Prompts

gabeorlanski opened this issue

Hello, are there publicly available scripts showing how the evaluation was done for the fixed-choice tasks in the T0 paper?

I tried implementing the evaluation myself, but I have been getting different results for the log-probability ranking and wanted to verify that my implementation is correct. If the scripts are not publicly available, are there any instructions or projects with a similar implementation?
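For readers comparing their own implementation, here is a minimal sketch of log-probability ranking over a fixed set of answer choices. It is not the T0 evaluation code: the function names (`log_softmax`, `sequence_log_prob`, `rank_choices`) and the toy logits are hypothetical, and whether scores should be length-normalized is an assumption to check against the official scripts, so it is exposed as a flag. The sketch assumes you already have, for each choice, the model's per-step logits over the vocabulary and the token ids of that choice.

```python
import math
from typing import List, Sequence, Tuple


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary distribution."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(
    step_logits: Sequence[Sequence[float]],
    target_ids: Sequence[int],
    length_normalize: bool = False,
) -> float:
    """Sum the log-probabilities of the target tokens, one per decoding step."""
    if len(step_logits) != len(target_ids):
        raise ValueError("need exactly one logit vector per target token")
    total = sum(log_softmax(l)[t] for l, t in zip(step_logits, target_ids))
    # Optional normalization (an assumption, not confirmed by the thread).
    return total / len(target_ids) if length_normalize else total


def rank_choices(
    choice_logits: Sequence[Sequence[Sequence[float]]],
    choice_token_ids: Sequence[Sequence[int]],
    length_normalize: bool = False,
) -> Tuple[int, List[float]]:
    """Score every answer choice and return (index of best choice, all scores)."""
    scores = [
        sequence_log_prob(logits, ids, length_normalize)
        for logits, ids in zip(choice_logits, choice_token_ids)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example with a 3-token vocabulary and two answer choices.
toy_logits = [
    [[2.0, 0.0, 0.0]],                   # choice 0: one decoding step
    [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]],  # choice 1: two decoding steps
]
toy_token_ids = [[0], [1, 2]]
best_index, scores = rank_choices(toy_logits, toy_token_ids)
```

Common sources of mismatch in this kind of scoring are including padding or end-of-sequence tokens in the sum, and normalizing by length when the reference implementation does not (or the reverse).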

Hi @gabeorlanski,
Our training and evaluation codebase is mainly based on the t5 codebase (https://github.com/google-research/text-to-text-transfer-transformer). For the BIG-bench numbers, we relied on the original BIG-bench repository (mainly converting the checkpoints to pt format).

@gabeorlanski, here are the official scripts to reproduce the main results: https://github.com/bigscience-workshop/t-zero (see the "Evaluation" section).