EleutherAI
TechnologyClassroom opened this issue
How easy would it be to swap out the Salesforce CodeGen models for those from EleutherAI?
GPT-Neo, GPT-J, and GPT-NeoX were also trained on data that includes GitHub code.
GPT-J would be straightforward, since that's the same architecture as CodeGen. FasterTransformer has a guide here; you'd just need to convert the model:
https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gptj_guide.md#download-the-model
I'm not sure what the architecture of NeoX-20B is. If it's identical to GPT-J but bigger, it could also be converted to FT.
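One quick first look at the architecture question is to read the declared architecture from a locally downloaded checkpoint. This is only a rough sketch: it assumes a Hugging Face-style `config.json` with `model_type` and `architectures` fields, and `describe_checkpoint` and the example path are hypothetical names made up for illustration. Matching or differing labels don't settle whether a model can be converted to FT; they only tell you which model definition to go read.

```python
import json
from pathlib import Path


def describe_checkpoint(checkpoint_dir):
    """Return the architecture labels declared in a checkpoint's config.json.

    Assumes a Hugging Face-style layout (config.json at the top of the
    checkpoint directory). Missing fields come back as None / [].
    """
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open() as f:
        config = json.load(f)
    return {
        "model_type": config.get("model_type"),
        "architectures": config.get("architectures", []),
    }


# Example usage (hypothetical local path):
#   print(describe_checkpoint("./checkpoints/gpt-neox-20b"))
```

If the labels differ from GPT-J's, the next step would be comparing the actual layer definitions rather than relying on the names.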
I don't expect GPT-J to be very good as a code assistant, though -- the CodeGen paper evaluated it (Table 3):
https://arxiv.org/pdf/2203.13474.pdf
And it's about the same size as CodeGen 6B, so you'd probably be better off just using that.
I saw Salesforce and incorrectly assumed that their models were nonfree. No reason to use EleutherAI in this case.