TheDuckAI / arb

Advanced Reasoning Benchmark Dataset for LLMs

Home Page: https://advanced-reasoning-benchmark.netlify.app

Advanced Reasoning Benchmark


A DuckAI project in collaboration with the Georgia Institute of Technology, ETH Zürich, Nomos AI, the Stanford University Center for Legal Informatics, and Mila - Quebec AI Institute.

Abstract

ARB is a novel benchmark dataset of advanced reasoning problems designed to evaluate LLMs on text comprehension and expert domain reasoning. It presents a more challenging test than prior benchmarks, with questions that probe deeper knowledge of mathematics, physics, biology, chemistry, and law.

API Usage

Endpoint URL: https://advanced-reasoning-benchmark.netlify.app/api/
The documentation for the complete REST API of the ARB dataset is here.
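
As a rough illustration of calling the endpoint above, here is a minimal TypeScript sketch. Only the base URL comes from this README. The helper names `buildArbUrl` and `fetchArb` are hypothetical, the route and query parameters are left to the caller because this README does not list them, and a JSON response body is an assumption. Take real route names from the REST API documentation.

```typescript
// Base endpoint as given in this README.
const ARB_API_BASE = "https://advanced-reasoning-benchmark.netlify.app/api/";

// Hypothetical helper: build a full URL for a route relative to the base endpoint.
// Route names and query parameters are NOT defined here; look them up in the API docs.
function buildArbUrl(route: string, query: Record<string, string> = {}): string {
  // Strip leading slashes so the route resolves under /api/ instead of the site root.
  const url = new URL(route.replace(/^\/+/, ""), ARB_API_BASE);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Hypothetical helper: GET a route and parse the body.
// Assumption: the API answers with JSON; adjust if the docs say otherwise.
async function fetchArb(
  route: string,
  query: Record<string, string> = {}
): Promise<unknown> {
  const response = await fetch(buildArbUrl(route, query));
  if (!response.ok) {
    throw new Error(`ARB request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}
```

A call would look like `await fetchArb("<route-from-the-docs>")`. The result is typed as `unknown` on purpose, so callers validate the shape of the response against the documented schema instead of relying on a guessed type.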

Copyright © 2023 DuckAI

License: MIT License


Languages

TypeScript 74.7%, JavaScript 24.4%, CSS 0.9%, Shell 0.1%