Test your prompts, models, and RAG pipelines. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local and private models, with CI/CD integration.
Home Page: https://www.promptfoo.dev/