prometheus-eval's repositories
prometheus-eval
Evaluate your LLM's responses with Prometheus and GPT-4 💯
prometheus
[ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically designed for fine-grained evaluation on a customized score rubric, Prometheus is a good alternative to human evaluation and GPT-4 evaluation.
prometheus-vision
[ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically designed for fine-grained evaluation on a customized score rubric, Prometheus-Vision is a good alternative to human evaluation and GPT-4V evaluation.
scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
leaderboard
BiGGen-Bench Leaderboard
prometheus-eval.github.io
Documentation and blog posts for Prometheus