Improve representation of evaluationResults so we can build automated leaderboards

Question

dgarijo opened this issue 7 months ago

Evaluation results are currently represented as free text. We should build on model cards and propose a more structured representation so that evaluation outputs can be compared across models.
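
To make the idea concrete, here is a minimal sketch of what a structured result could look like and how a leaderboard could be derived from it. Everything in it is an assumption for discussion, not an agreed schema: the `EvaluationResult` class, its field names (`model`, `dataset`, `metric`, `value`, `higher_is_better`), the `build_leaderboard` helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationResult:
    """Hypothetical structured record for one evaluation result."""
    model: str                      # identifier of the evaluated model
    dataset: str                    # dataset the model was evaluated on
    metric: str                     # metric name, e.g. "accuracy"
    value: float                    # numeric metric value
    higher_is_better: bool = True   # sort direction for this metric


def build_leaderboard(results, dataset, metric):
    """Return the results for one dataset/metric pair, best first."""
    rows = [r for r in results if r.dataset == dataset and r.metric == metric]
    if not rows:
        return []
    # Assumes all rows for the same metric agree on the sort direction.
    descending = rows[0].higher_is_better
    return sorted(rows, key=lambda r: r.value, reverse=descending)


# Made-up sample data, for illustration only.
results = [
    EvaluationResult("model-a", "dataset-x", "accuracy", 0.91),
    EvaluationResult("model-b", "dataset-x", "accuracy", 0.94),
    EvaluationResult("model-a", "dataset-x", "loss", 0.30, higher_is_better=False),
    EvaluationResult("model-b", "dataset-x", "loss", 0.45, higher_is_better=False),
]

leaderboard = build_leaderboard(results, "dataset-x", "accuracy")
for rank, row in enumerate(leaderboard, start=1):
    print(rank, row.model, row.value)
```

The point of the sketch is only that once metric name, value, and dataset are separate fields rather than prose, ranking models becomes a simple filter and sort. The actual vocabulary would still need to be decided.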