promptfoo / promptfoo

Test your prompts, agents, and RAGs. Redteaming, pentesting, vulnerability scanning for LLMs. Improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Home Page: https://www.promptfoo.dev/

Feature Request: Aggregate and/or Calculated Metrics

thomascleberg opened this issue

It would be fantastic to be able to aggregate and/or apply logic over Metrics in a custom way.

For example, if you had a requirement that context be both relevant and faithful, it would be nice to be able to implement context-relevant, context-faithful, (context-relevant > n AND context-faithful > m), context-relevant * context-faithful, et cetera.
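To make the request concrete, here is a rough sketch of how such calculated metrics could be computed from per-test scores. It is plain Python and not promptfoo's API: the helper names (`both_above`, `product`, `with_derived`), the derived metric names, and the sample scores and thresholds are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

Scores = Dict[str, float]


def both_above(scores: Scores, n: float, m: float) -> float:
    """Return 1.0 if context-relevant > n AND context-faithful > m, else 0.0."""
    passed = scores["context-relevant"] > n and scores["context-faithful"] > m
    return 1.0 if passed else 0.0


def product(scores: Scores) -> float:
    """Return context-relevant * context-faithful."""
    return scores["context-relevant"] * scores["context-faithful"]


def with_derived(scores: Scores, derived: Dict[str, Callable[[Scores], float]]) -> Scores:
    """Copy the base scores and add one entry per derived metric."""
    result = dict(scores)
    for name, fn in derived.items():
        # Each derived metric sees only the base scores, not other derived ones.
        result[name] = fn(scores)
    return result


# Hypothetical base scores for a single test case.
scores = {"context-relevant": 0.8, "context-faithful": 0.5}

# Hypothetical derived metrics, with n = 0.7 and m = 0.6 as example thresholds.
derived = {
    "relevant-and-faithful": lambda s: both_above(s, n=0.7, m=0.6),
    "relevant-times-faithful": product,
}

combined = with_derived(scores, derived)
```

With these sample values the threshold metric fails, because context-faithful does not exceed m, while the product metric still reports a graded value. That is the kind of distinction the two forms of aggregation would expose.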

Thanks for the suggestion @thomascleberg. Definitely interested in implementing this - will follow up once I open a PR