Lightning-AI / litgpt

Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

Home Page: https://lightning.ai

support for older models

qwenzo opened this issue

Hello,

I was wondering if it is straightforward to bring older models such as GPT-2 to lit-gpt.
If so, what files/configs do I need to change?

Thank you!

Good point, and it should be. I use GPT-2 a lot myself privately, and it'd be nice to have it in LitGPT as well.

I think the architecture is similar to GPTNeo, so you can probably copy and adapt the GPTNeo config (see the sketches after this list). The general todo list I use for adding new configs is:

  • Implement model download
  • Implement HF checkpoint conversion
  • Make sure generate.py produces reasonable outputs
  • Update model_download docs
  • Test pretraining
  • Test finetuning
    • Full finetuning
    • LoRA
    • Adapter + Adapter v2
  • Add tests
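For illustration, here is a minimal sketch of what a GPT-2 (124M "small") entry might look like, assuming the dict-style entries used in litgpt's `config.py`. The hyperparameters are the published GPT-2 small values; the field names (`hf_config`, `parallel_residual`, etc.) and the Hub org are assumptions to verify against the actual `Config` dataclass. One likely gap to watch for: GPT-2 uses learned absolute position embeddings rather than rotary embeddings, so position handling may need its own code path.

```python
# Hypothetical GPT-2 "small" entry in the style of litgpt's config.py dicts.
# Hyperparameters are the published GPT-2 values; field names are assumptions.
gpt2_small = dict(
    name="gpt2",
    hf_config=dict(org="openai-community", name="gpt2"),  # assumed Hub location
    block_size=1024,          # GPT-2 context length
    vocab_size=50257,         # GPT-2 BPE vocabulary
    n_layer=12,
    n_head=12,
    n_embd=768,
    bias=True,                # GPT-2 uses biases in its linear layers
    parallel_residual=False,  # sequential residual blocks, unlike GPT-NeoX
)
```

The larger variants only change the depth/width knobs: gpt2-medium is 24 layers / 16 heads / 1024 embedding dim, gpt2-large is 36 / 20 / 1280, and gpt2-xl is 48 / 25 / 1600.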
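For the HF checkpoint conversion step, one GPT-2-specific detail is worth calling out: HF's GPT-2 implementation uses `Conv1D` modules, which store their weights transposed relative to `torch.nn.Linear`, so every attention/MLP weight matrix needs a transpose during conversion. A rough sketch for a single block, with placeholder target names (the real litgpt checkpoint layout will differ):

```python
import torch

def convert_block(hf_state: dict[str, torch.Tensor], i: int) -> dict[str, torch.Tensor]:
    """Convert one HF GPT-2 transformer block to nn.Linear-style weights.

    The target key names ("block.{i}.attn.qkv", ...) are placeholders for
    illustration, not litgpt's actual checkpoint layout.
    """
    hf = f"transformer.h.{i}"
    out = {}
    for src, dst in [
        (f"{hf}.attn.c_attn", f"block.{i}.attn.qkv"),   # fused QKV projection
        (f"{hf}.attn.c_proj", f"block.{i}.attn.proj"),
        (f"{hf}.mlp.c_fc",    f"block.{i}.mlp.fc"),
        (f"{hf}.mlp.c_proj",  f"block.{i}.mlp.proj"),
    ]:
        # HF Conv1D weights are (in_features, out_features); nn.Linear expects
        # (out_features, in_features), hence the transpose.
        out[f"{dst}.weight"] = hf_state[f"{src}.weight"].t()
        out[f"{dst}.bias"] = hf_state[f"{src}.bias"]
    # LayerNorm parameters carry over unchanged.
    for src, dst in [
        (f"{hf}.ln_1", f"block.{i}.norm_1"),
        (f"{hf}.ln_2", f"block.{i}.norm_2"),
    ]:
        out[f"{dst}.weight"] = hf_state[f"{src}.weight"]
        out[f"{dst}.bias"] = hf_state[f"{src}.bias"]
    return out
```

The embeddings (`wte`, `wpe`) and the final `ln_f` copy over directly, and GPT-2 ties its output head to `wte`, so no separate head weight needs converting.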