TechnologyClassroom opened this issue · comments
GPT-J would be straightforward, since that's the same architecture as CodeGen. FasterTransformer has a guide here; you'd just need to convert the model:
I'm not sure what the architecture of NeoX-20B is. If it's identical to GPT-J but bigger, it could also be converted to FT.
I don't expect GPT-J to be very good as a code assistant, though -- the CodeGen paper evaluated it (Table 3):
And it's about the same size as CodeGen 6B, so you'd probably be better off just using that.
I saw SaleForce and assumed incorrectly that their models were nonfree. No reason to use EleutherAI in this case.