kingoflolz / mesh-transformer-jax

Model parallel transformers in JAX and Haiku

Finetuning Hardware Recommendations

greyweb opened this issue

Hi,
I am trying to finetune GPT-J 6B from the HF-converted weights. Could you recommend the compute hardware that is widely used or suggested for finetuning GPT-J 6B?