coallaoh / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).

Home Page: https://crfm.stanford.edu/helm

Welcome! This repository contains all the assets for Holistic Evaluation of Language Models, which includes the following features:

  • Collection of datasets in a standard format (e.g., NaturalQuestions)
  • Collection of models accessible via a unified API (e.g., GPT-3, MT-NLG, OPT, BLOOM)
  • Collection of metrics beyond accuracy (efficiency, bias, toxicity, etc.)
  • Collection of perturbations for evaluating robustness and fairness (e.g., typos, dialect)
  • Modular framework for constructing prompts from datasets
  • Proxy server for managing accounts and providing a unified interface to access models
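
To make a few of these features concrete, the following is a minimal Python sketch of how a dataset instance in a common format, a few-shot prompt builder, a typo perturbation, and a simple metric might fit together. It is an illustration only and does not use HELM's actual API: the names `Instance`, `build_prompt`, `typo_perturbation`, and `exact_match` are hypothetical, and the adjacent-letter swap is just one assumed way to simulate typos.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One dataset example in a common format (hypothetical, not HELM's class)."""
    question: str
    answer: str


def build_prompt(train: list[Instance], test: Instance) -> str:
    """Join in-context examples and the test question into a few-shot prompt."""
    blocks = [f"Question: {ex.question}\nAnswer: {ex.answer}" for ex in train]
    # The test instance is left unanswered so the model completes it.
    blocks.append(f"Question: {test.question}\nAnswer:")
    return "\n\n".join(blocks)


def typo_perturbation(text: str, prob: float, seed: int = 0) -> str:
    """Swap adjacent letter pairs with probability `prob` to mimic typos."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < prob:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


if __name__ == "__main__":
    train = [Instance("What is the capital of France?", "Paris")]
    test = Instance("What is the capital of Italy?", "Rome")
    clean_prompt = build_prompt(train, test)
    # Perturb only the test question to probe robustness to typos.
    noisy_test = Instance(typo_perturbation(test.question, prob=0.1), test.answer)
    noisy_prompt = build_prompt(train, noisy_test)
    print(clean_prompt)
    print("---")
    print(noisy_prompt)
```

In a robustness evaluation of this kind, the same model would be queried with both the clean and the perturbed prompt, and a metric such as `exact_match` would be compared across the two conditions.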

About

License: Apache License 2.0


Languages

  • Python 94.3%
  • JavaScript 4.6%
  • HTML 0.7%
  • Shell 0.2%
  • CSS 0.2%