Paper checklist
leogao2 opened this issue
Leo Gao commented
A prioritized checklist of things we need to get done for the paper.
Must do:
- Train 6B on Pile and report Pile validation perplexity (@sdtblck)
- Datasheets (@StellaAthena)
- Set up webpage (see #69)
- Write up announcement blog post (@leogao2)
- Transfer Pile to The Eye (@leogao2)
- Implement in HF transformers (@leogao2)
- Finish writing up the paper (everyone)
Nice to have:
- Perform profanity analysis (@anishthite)
- Perform language analysis (@leogao2)
- Perform topic analysis (@cfoster0)
- Perform n-gram analysis (@researcher2)
- Report Pile validation perplexity for GPT-3 (and other pretrained models) (@zphang)
- Perform other analyses we think of
- Train 1.5B on Pile and report Pile validation perplexity (@sdtblck / @anishthite)
- Train 117M on Pile and report Pile validation perplexity (@sdtblck / @anishthite)
- Report evaluation scores for the {6B, 1.5B, 117M} models trained on Pile on as many evaluations as possible (as implemented in lm_eval_harness)
- Design a logo for Pile
Wishlist:
- Train 6B on CC and report Pile validation perplexity (@sdtblck)
- Train 1.5B on CC and report Pile validation perplexity (@sdtblck / @anishthite)
- Train 117M on CC and report Pile validation perplexity (@sdtblck / @anishthite)
- Report evaluation scores for the {6B, 1.5B, 117M} models trained on CC on as many evaluations as possible (as implemented in lm_eval_harness)
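
For reference, a minimal sketch of how the perplexity numbers in the items above could be computed, assuming per-token natural-log probabilities on the validation set are already available from the model. The function name `perplexity` and the input format are illustrative only and not taken from any project code. Normalizing per token is also an assumption, since the checklist does not say which normalization will be reported.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    Hypothetical helper: computes exp of the negative mean log-probability.
    Per-token normalization is an assumption, not the project's stated choice.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# A model that gives every token probability 1/4 has perplexity of about 4.
example = perplexity([math.log(0.25)] * 8)
```

Because the result depends on the tokenizer when normalizing per token, numbers from models with different vocabularies are not directly comparable under this sketch.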