meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

Fine tuning for additional policies LlamaGuard

harindermashiana opened this issue

Can you please share some details about fine-tuning LlamaGuard for additional categories?
Specifically, every time we add a category to the existing model, do we need to fine-tune on the old dataset (old categories) plus the new dataset (new category), or only on the new dataset?

Can you please share a few bullet points/steps to do the fine-tuning and which files to reference based on your answer?

Thanks,

Hi, you can fine-tune on just the new dataset with a low learning rate (2e-6 or even lower). Any Llama fine-tuning recipe (for example, one from Hugging Face) should work. Please let us know if you have additional questions!
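
To make the suggestion above concrete, here is a minimal sketch of how the new-category data and hyperparameters might be prepared before handing them to a fine-tuning recipe. Only the learning rate (2e-6) comes from the reply. The `FineTuneConfig` class, the `build_example` helper, the category code `O7`, the other hyperparameters, and the prompt layout are all hypothetical; the layout only loosely imitates a Llama Guard style prompt, so check the model card for the exact template before using anything like this.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FineTuneConfig:
    """Hypothetical hyperparameters; only learning_rate is taken from the reply."""
    learning_rate: float = 2e-6  # from the reply: 2e-6 or even lower
    num_epochs: int = 1          # assumption, tune for your data
    batch_size: int = 4          # assumption, tune for your hardware


def build_example(code, name, description, user_message, is_unsafe):
    """Build one prompt/target pair for the new category.

    The template below is illustrative, not the official Llama Guard prompt.
    """
    prompt = (
        "Task: Check if there is unsafe content in 'User' messages "
        "according to the categories below.\n\n"
        "<BEGIN UNSAFE CONTENT CATEGORIES>\n"
        f"{code}: {name}.\n{description}\n"
        "<END UNSAFE CONTENT CATEGORIES>\n\n"
        "<BEGIN CONVERSATION>\n\n"
        f"User: {user_message}\n\n"
        "<END CONVERSATION>"
    )
    # Target is 'safe', or 'unsafe' followed by the violated category code.
    target = f"unsafe\n{code}" if is_unsafe else "safe"
    return {"prompt": prompt, "target": target}


def to_jsonl(examples):
    """Serialize examples as JSON Lines, one record per line."""
    return "\n".join(json.dumps(e) for e in examples)


if __name__ == "__main__":
    config = FineTuneConfig()
    examples = [
        build_example("O7", "Example Category", "Should not ...",
                      "an unsafe request", True),
        build_example("O7", "Example Category", "Should not ...",
                      "a harmless request", False),
    ]
    print(asdict(config))
    print(to_jsonl(examples))
```

The resulting JSONL records and the values in `FineTuneConfig` would then be passed to whichever Llama fine-tuning recipe you choose; that step is left out here because it depends on the recipe.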