meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

Evaluation script released?

hxhcreate opened this issue:

I have been following this work.
It would be greatly appreciated if you could release the evaluation code so that we can reproduce your results!

Hi there, could you please provide more information, such as whether your question is about Llama Guard or CyberSecEval, and what exact script you are looking for? Thanks.

Which script are you looking for?

Hi, if you're looking for the Llama Guard evaluation, Llama recipes has a script for running inference. We then use sklearn's precision_score, recall_score, f1_score, and average_precision_score to compute the metrics from the model outputs. Is this what you're looking for?
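
For anyone landing here, the metric step described above could look roughly like the sketch below. It is not the official evaluation script: the label encoding (1 = unsafe, 0 = safe), the use of an "unsafe" probability as the score, the 0.5 threshold, the helper name compute_metrics, and the toy data are all assumptions made for illustration. Only the four sklearn functions come from the reply above.

```python
from sklearn.metrics import (
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
)


def compute_metrics(y_true, y_score, threshold=0.5):
    """Compute classification metrics from binary labels and scores.

    Assumptions (not confirmed by the thread): labels are 1 for unsafe
    and 0 for safe, y_score is the model's probability of "unsafe",
    and 0.5 is the decision threshold.
    """
    # Turn scores into hard predictions for the threshold-based metrics.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    return {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        # Average precision is threshold-free, so it takes the raw scores.
        "average_precision": average_precision_score(y_true, y_score),
    }


# Hypothetical toy data standing in for real inference outputs.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

metrics = compute_metrics(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In practice y_true would come from the dataset's annotations and y_score from the inference script's outputs; how the score is extracted from the model is not described in this thread.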

I will close this issue, but please reopen if you have further questions. Thanks!