Advanced Reasoning Benchmark

A DuckAI project in collaboration with the Georgia Institute of Technology, ETH Zürich, Nomos AI, Stanford University Center for Legal Informatics, and the Mila - Quebec AI Institute

Abstract

ARB is a novel benchmark dataset composed of advanced reasoning problems designed to evaluate LLMs on text comprehension and expert domain reasoning. It presents a more challenging test than prior benchmarks, featuring questions that test deeper knowledge of mathematics, physics, biology, chemistry, and law.

API Usage

Endpoint URL: https://advanced-reasoning-benchmark.netlify.app/api/
The documentation for the complete REST API of the ARB dataset is here.
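
As a minimal sketch of calling the endpoint above from TypeScript, the helper below builds a request URL relative to the base endpoint and fetches the response as JSON. Only the base URL comes from this README. The function names `buildArbUrl` and `fetchArbJson`, the route `example-route`, and the `limit` query parameter are hypothetical placeholders, and the sketch assumes the API returns JSON; consult the REST API documentation for the actual routes and parameters.

```typescript
// Base endpoint taken from this README.
const ARB_API_BASE = "https://advanced-reasoning-benchmark.netlify.app/api/";

type QueryParams = Record<string, string | number>;

// Builds a full request URL under the base endpoint.
// `route` and `query` are hypothetical: the real routes and
// parameters are defined by the REST API documentation.
function buildArbUrl(route: string, query: QueryParams = {}): string {
  // Strip leading slashes so the route resolves under /api/
  // instead of replacing the path from the site root.
  const url = new URL(route.replace(/^\/+/, ""), ARB_API_BASE);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Fetches a route and parses the body as JSON (assumed response format).
// Requires a runtime with a global `fetch` (browsers, Node 18+).
async function fetchArbJson(route: string, query: QueryParams = {}): Promise<unknown> {
  const response = await fetch(buildArbUrl(route, query));
  if (!response.ok) {
    throw new Error(`ARB API request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}
```

With a real route substituted for the placeholder, a call would look like `const data = await fetchArbJson("example-route", { limit: 5 });`. The result is typed as `unknown` on purpose, since the response schema is not described here and should be validated before use.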

About

Advanced Reasoning Benchmark Dataset for LLMs

https://advanced-reasoning-benchmark.netlify.app

Topics: benchmark, dataset, llm

MIT License

Languages

TypeScript 74.7%, JavaScript 24.4%, CSS 0.9%, Shell 0.1%