Sharing the 1.3B-Pile@300B model
BlinkDL opened this issue
PENG Bo commented
The 1.3B-Pile@300B model is quite strong:
https://docs.google.com/spreadsheets/d/1CI8Q9RCblLRzUOPJ6ViqBmo284-8ojluQ-CmaEuhuv0/edit#gid=1295801165
LAMBADA 0.6088, PIQA 0.7160, HellaSwag 0.5209: all three are better than GPT-Neo 1.3B.
Could you share the model? Thank you.