Add Support for PEFT fine-tuning
PierpaoloSorbellini opened this issue
Description
Support for PEFT in chatllama models and training.
TODO
- Add PEFT to enable parameter-efficient fine-tuning in the actor, reward, and critic models (see the sketch after this list).
- Check RLHF training stability with adapters enabled.
- Compare the quality of adapter-tuned models against fully fine-tuned models, and measure the reduction in training requirements (memory, time).
- Update README.
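A minimal sketch of what the first item could look like, assuming the actor/reward/critic wrappers expose a Hugging Face causal LM underneath (the model name, rank, and target modules below are illustrative, not the final chatllama wiring):

```python
# Sketch: wrap a Hugging Face causal LM with a LoRA adapter via the `peft` library.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # low-rank dimension (illustrative)
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```

The reward and critic models would need the same treatment, so that only adapter weights are optimized during RLHF.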
Ideally 4-bit PEFT like https://github.com/johnsmith0031/alpaca_lora_4bit, as this enables training a 33B (30B) model in under 24 GB of VRAM on consumer cards such as a single RTX 3090.
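For reference, a QLoRA-style 4-bit setup is already possible through `transformers` + `bitsandbytes` + `peft` (alpaca_lora_4bit uses GPTQ-quantized weights instead, so this is only a comparable approach, and the model name is again illustrative):

```python
# Sketch: 4-bit NF4 quantized base model with a LoRA adapter on top.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # freeze base weights, cast norms

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
```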