[Feature Request] Metrics that require knowledge of input.
ciaranby opened this issue · comments
ciaranby commented
For `generate_until` output type tasks, I would like to be able to register metrics that are computed using the input prompt to the LLM as well as the reference and prediction.
I have a niche use case for this, but I think it would be useful more broadly. For example, LLM-as-Judge style metrics would also need the input prompt.
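To make the request concrete, here is a rough sketch of the kind of interface I have in mind. Everything in it is hypothetical and self-contained: the registry `INPUT_AWARE_METRICS`, the decorator `register_input_aware_metric`, the example metric, and the `score` helper are names made up for illustration, not existing API. The only point is that the metric function receives the prompt alongside the reference and prediction.

```python
from typing import Callable, Dict, List

# Hypothetical registry, for illustration only; not an existing API.
MetricFn = Callable[[str, str, str], float]
INPUT_AWARE_METRICS: Dict[str, MetricFn] = {}


def register_input_aware_metric(name: str) -> Callable[[MetricFn], MetricFn]:
    """Register a metric that takes (prompt, reference, prediction)."""
    def decorator(fn: MetricFn) -> MetricFn:
        INPUT_AWARE_METRICS[name] = fn
        return fn
    return decorator


@register_input_aware_metric("non_copy_exact_match")
def non_copy_exact_match(prompt: str, reference: str, prediction: str) -> float:
    """Exact match that scores 0 when the model just echoes the prompt back."""
    pred = prediction.strip()
    if pred == prompt.strip():
        return 0.0  # echoing the input never counts as correct
    return 1.0 if pred == reference.strip() else 0.0


def score(name: str, prompts: List[str], references: List[str],
          predictions: List[str]) -> float:
    """Average a registered metric over aligned prompt/reference/prediction lists."""
    metric = INPUT_AWARE_METRICS[name]
    scores = [metric(p, r, y) for p, r, y in zip(prompts, references, predictions)]
    return sum(scores) / len(scores) if scores else 0.0
```

An LLM-as-Judge metric would have the same shape: the function body would build a judge prompt from all three arguments instead of comparing strings.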
Also happy to submit a PR if you can point me in the right direction.