Would DSPy help the benchmark?
davideuler opened this issue
david l euler commented
I came across DSPy earlier this month. It is super interesting. I wonder if DSPy would help with evaluating the LLM benchmark?
https://github.com/stanfordnlp/dspy
Maybe we could refactor it using DSPy.
Nicholas Carlini commented
This seems like a very nice project, but it is much more general-purpose than what I want for this. This thing is designed just for the purpose of this one evaluation, which makes it quite a bit easier to build a test with than what they've made.