MarcellusZhao

Hao Zhao's starred repositories

just-eval

A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.

Language: Python | License: MIT | 7400

GPT-Fathom

GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well as OpenAI's earlier models on 20+ curated benchmarks under aligned settings.

Language: Python | License: MIT | 35100

open_llama

OpenLLaMA, a permissively licensed open-source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset

License: Apache-2.0 | 736000

NEFTune

Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning

Language: Python | License: MIT | 37500

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 13315500

LVM

Language: Python | License: Apache-2.0 | 175000

BIG-bench

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Language: Python | License: Apache-2.0 | 283900

llm-foundry

LLM training code for Databricks foundation models

Language: Python | License: Apache-2.0 | 400000

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | 207300

open-instruct

Language: Python | License: Apache-2.0 | 121800

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook | License: Apache-2.0 | 959300

why-weight-decay

Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]

Language: Python | License: NOASSERTION | 4800

AdaLoRA

AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).

Language: Python | License: MIT | 26100

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | 846000

sam-low-rank-features

Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023]

Language: Jupyter Notebook | 2400

instagraph

Converts text input or a URL into a knowledge graph and displays it

Language: Python | License: MIT | 345600

Flash-Attention-Softmax-N

CUDA and Triton implementations of Flash Attention with SoftmaxN.

Language: Python | License: GPL-3.0 | 6600

DomainBed

DomainBed is a suite to test domain generalization algorithms

Language: Python | License: MIT | 138800

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | 773200

open-interpreter

A natural language interface for computers

Language: Python | License: AGPL-3.0 | 5253900

spawrious

Language: Python | License: CC0-1.0 | 2500

adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Language: Jupyter Notebook | License: Apache-2.0 | 254500

awesome_lists

Awesome Lists for Tenure-Track Assistant Professors and PhD students. (Survival guide for assistant professors and PhD students)

Language: Python | License: MIT | 142400

awesome-obsidian

🕶️ Awesome stuff for Obsidian

Language: CSS | License: CC0-1.0 | 671400

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | 3662100

loss-landscapes

Approximating neural network loss landscapes in low-dimensional parameter subspaces for PyTorch

Language: Python | License: MIT | 29600

AlpacaDataCleaned

Alpaca dataset from Stanford, cleaned and curated

Language: Python | License: Apache-2.0 | 150300

awesome-source-free-test-time-adaptation

A curated list of papers on Test-time Adaptation, Test-time Training, and Source-free Domain Adaptation

45800

gpt-llm-trainer

Language: Jupyter Notebook | License: MIT | 392700

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | 664800