paperswithcode / sotabench-api

Easily benchmark Machine Learning models on selected tasks and datasets

TODO from discussion

rstojnic opened this issue

TODO sotabench lib:

  • remove the benchmark() function from benchmark.py
  • move dependencies to requirements
  • evaluation.json should only be written when a specific environment variable is set; otherwise pretty-print the results (see the sketch after this list)
  • for each benchmark:
    • benchmark()
    • default transform
    • the dataset
    • default parameters
  • documentation:
    • dataset examples
    • default transform example
    • the input fed to the model, and the expected output
    • link to examples of benchmarked models
  • a library of transforms (maybe)
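
As a rough illustration of the per-benchmark pieces above (a benchmark() entry point, a default transform, the dataset, default parameters) and of the environment-variable check for writing evaluation.json, here is a minimal, self-contained Python sketch. The variable name SOTABENCH_STORE_RESULTS, the toy dataset, the default_transform, and the metric are placeholders, not the actual sotabench API.

```python
import json
import os
from pprint import pprint

# Hypothetical environment variable signalling "running on the sotabench server";
# the real flag used by sotabench may differ.
RESULTS_ENV_VAR = "SOTABENCH_STORE_RESULTS"


# Hypothetical default transform and dataset, stand-ins for the per-benchmark defaults.
def default_transform(example):
    return example["input"]


DATASET = [
    {"input": 1.0, "target": 2.0},
    {"input": 2.0, "target": 4.0},
]


def benchmark(model, dataset=DATASET, transform=default_transform, metric_name="MSE"):
    """Run `model` over `dataset` and report results."""
    errors = []
    for example in dataset:
        prediction = model(transform(example))
        errors.append((prediction - example["target"]) ** 2)
    results = {metric_name: sum(errors) / len(errors)}

    if os.environ.get(RESULTS_ENV_VAR):
        # On the server: persist a machine-readable record of the evaluation.
        with open("evaluation.json", "w") as f:
            json.dump(results, f, indent=2)
    else:
        # Locally: just show the metrics to the user.
        pprint(results)
    return results


if __name__ == "__main__":
    benchmark(lambda x: 2 * x)
```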

And additional requests:

  • the BenchmarkResult return value should also contain: 1) the dataset used, 2) the transform used, 3) the input parameters used when invoking the function, and 4) anything else relevant, so that it is a self-contained record of the results (see the sketch below)
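
One possible shape for such a self-contained record, as a hedged sketch: the field names and example values below are illustrative, not the actual sotabench-api BenchmarkResult definition.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


# Illustrative sketch only; the real BenchmarkResult in sotabench-api may differ.
@dataclass
class BenchmarkResult:
    results: Dict[str, float]      # the computed metrics, e.g. {"Top 1 Accuracy": 0.764}
    dataset: str                   # which dataset the model was evaluated on
    transform: str                 # description of the input transform that was applied
    parameters: Dict[str, Any] = field(default_factory=dict)  # arguments passed to benchmark()
    extra: Dict[str, Any] = field(default_factory=dict)       # anything else worth recording


# Example usage with made-up values.
record = BenchmarkResult(
    results={"Top 1 Accuracy": 0.764},
    dataset="ImageNet",
    transform="Resize(256) -> CenterCrop(224) -> ToTensor -> Normalize",
    parameters={"batch_size": 128, "num_workers": 4},
)
print(record)
```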