moyix / fauxpilot

FauxPilot - an open-source GitHub Copilot server

EleutherAI

TechnologyClassroom opened this issue

How easy would it be to swap out the Salesforce CodeGen models for those from EleutherAI?

GPT-Neo, GPT-J, and GPT-NeoX were also trained on data that includes GitHub code.

GPT-J would be straightforward, since that's the same architecture as CodeGen. FasterTransformer has a guide here; you'd just need to convert the model:

https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gptj_guide.md#download-the-model

I'm not sure what the architecture of NeoX-20B is. If it's identical to GPT-J but bigger, it could also be converted to FT.

I don't expect GPT-J to be very good as a code assistant, though -- the CodeGen paper evaluated it (Table 3):

https://arxiv.org/pdf/2203.13474.pdf

And it's about the same size as CodeGen 6B, so you'd probably be better off just using that.
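For a rough sense of what "about the same size" means, here is a back-of-the-envelope sketch in Python. The helper name `approx_decoder_params` is made up for illustration, and the formula only counts the big weight matrices of a GPT-J-style decoder (no biases or layer norms). The layer counts, hidden size, and vocabulary sizes are the values I recall from the published model configs, so treat them as assumptions and check them against each model's `config.json` before relying on the numbers.

```python
def approx_decoder_params(n_layer: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a GPT-J-style decoder-only transformer.

    Counts only the large weight matrices and ignores biases and layer norms:
      - attention:  4 * d_model^2 per layer (Q, K, V, output projection)
      - MLP:        8 * d_model^2 per layer (two matrices, 4x hidden width)
      - embeddings: input table plus an untied output head
    """
    per_layer = 12 * d_model * d_model
    embeddings = 2 * vocab_size * d_model
    return n_layer * per_layer + embeddings


# Assumed hyperparameters (verify against the published configs).
gptj = approx_decoder_params(n_layer=28, d_model=4096, vocab_size=50400)
codegen_6b = approx_decoder_params(n_layer=33, d_model=4096, vocab_size=51200)

print(f"GPT-J 6B (estimate):   {gptj / 1e9:.2f}B parameters")
print(f"CodeGen 6B (estimate): {codegen_6b / 1e9:.2f}B parameters")
```

Under those assumptions the two models land within roughly a billion parameters of each other, so memory and latency on the FT backend should be in the same ballpark.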

I saw Salesforce and incorrectly assumed that their models were nonfree. No reason to use EleutherAI in this case.