Add Support for PEFT fine-tuning
PierpaoloSorbellini opened this issue
Description
Support for PEFT in chatllama models and training.
TODO
- Add PEFT to enable parameter-efficient fine-tuning in the actor, reward, and critic models (see the sketch after this list).
- Check RLHF training stability with adapters enabled.
- Compare the quality of adapter-tuned models against fully fine-tuned models, and measure the reduction in training requirements (memory, time).
- Update README.
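A minimal sketch of what the first item could look like, assuming the actor/reward/critic wrappers expose a Hugging Face causal LM underneath (the model name, rank, and target modules below are illustrative, not the final chatllama wiring):

```python
# Sketch: wrap a Hugging Face causal LM with a LoRA adapter via the `peft` library.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # low-rank dimension (illustrative)
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```

The reward and critic models would need the same treatment, so that only adapter weights are optimized during RLHF.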
Ideally 4-bit PEFT like https://github.com/johnsmith0031/alpaca_lora_4bit, as this enables training a 33B (30B) model in under 24 GB of VRAM on consumer cards such as a single RTX 3090.
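For reference, a QLoRA-style 4-bit setup is already possible through `transformers` + `bitsandbytes` + `peft` (alpaca_lora_4bit uses GPTQ-quantized weights instead, so this is only a comparable approach, and the model name is again illustrative):

```python
# Sketch: 4-bit NF4 quantized base model with a LoRA adapter on top.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # freeze base weights, cast norms

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
```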