jackaduma / Alpaca-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implements RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.

jackaduma/Alpaca-LoRA-RLHF-PyTorch Issues