tangzhy's starred repositories
human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
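That paper introduces the pass@k metric that the human-eval benchmark reports. As a minimal illustration (not the repository's own implementation, which uses a numerically stable product form), the unbiased estimator 1 − C(n−c, k)/C(n, k) can be sketched as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generated samples of which c
    are correct, passes the tests. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Every size-k draw must contain a correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 samples of which 5 pass, pass@1 is 0.5, matching the intuitive per-sample success rate.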
torchscale
Foundation Architecture for (M)LLMs
helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
evaluating-verifiability-in-generative-search-engines
Companion repo for "Evaluating Verifiability in Generative Search Engines".
Pre-Trained-Language-Models-for-Interactive-Decision-Making
Pre-Trained Language Models for Interactive Decision-Making [NeurIPS 2022]
bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
ReasoningNLP
A paper list on reasoning in NLP
alpaca-lora
Instruct-tune LLaMA on consumer hardware
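The "consumer hardware" part works because LoRA freezes the base weights and trains only a low-rank update. A toy pure-Python sketch of the forward pass (illustrative names; the repo itself builds on PyTorch and the PEFT library):

```python
def lora_forward(W, A, B, x, alpha=16.0, r=2):
    """y = W x + (alpha / r) * B (A x).

    W: frozen base weight (d_out x d_in); A (r x d_in) and B (d_out x r)
    are the small trainable low-rank factors, so only O(r * d) parameters
    need gradients instead of O(d^2)."""
    def matvec(M, v):
        return [sum(m * v_j for m, v_j in zip(row, v)) for row in M]

    base = matvec(W, x)            # frozen path
    delta = matvec(B, matvec(A, x))  # low-rank trainable path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]
```

With A and B initialized so that their product is zero, the adapted model starts out identical to the base model, which is the standard LoRA initialization.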
flash-attention
Fast and memory-efficient exact attention
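The memory savings come from tiling attention with an online softmax, so the full score matrix is never materialized. A minimal pure-Python sketch of that streaming-softmax recurrence (the core trick; the repository's real implementation is fused CUDA):

```python
import math

def online_softmax(scores):
    """One-pass softmax over a stream of scores, tracking only a
    running maximum m and a rescaled running sum s. This is the
    recurrence that lets FlashAttention process attention in tiles
    without storing the whole score matrix."""
    m = float("-inf")
    s = 0.0
    for x in scores:
        m_new = max(m, x)
        # Rescale the accumulated sum to the new maximum, then add
        # the current term; keeps everything numerically stable.
        s = s * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return [math.exp(x - m) / s for x in scores]
```

The result matches an ordinary softmax, but each incoming tile of scores only needs the two running statistics, not the previous tiles.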