allenai / WildBench

Benchmarking LLMs with Challenging Tasks from Real Users

Home Page: https://huggingface.co/spaces/allenai/WildBench
