Repositories of the Generative Artificial Intelligence Research Lab (GAIR)
Entropy-ABF
Official implementation for "Extending LLMs' Context Window with 100 Samples"
OlympicArena
Official repository for the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"
MetaCritique
Evaluate the Quality of Critique
ReasonEval
Evaluating Mathematical Reasoning Beyond Accuracy
SimulateBench
GPT as Human