Code and dataset for the paper "MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset" (https://arxiv.org/pdf/2406.02106).