Zewen Chi's repositories
bitsandbytes-aarch64
aarch64 support for bitsandbytes
accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed precision
adapter-transformers
Huggingface Transformers + Adapters = ❤️
alpaca-lora
Instruct-tune LLaMA on consumer hardware
awesome-chatgpt-prompts
This repo includes ChatGPT prompt curation to use ChatGPT better.
cloudflare-scrape
A Python module to bypass Cloudflare's anti-bot page.
crosslingual_winograd
"It's All in the Heads" (Findings of ACL 2021), official implementation and data
DeepSpeedExamples
Example models using DeepSpeed
DialoGPT
Large-scale pretraining for dialogue
improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
llama.cpp
Port of Facebook's LLaMA model in C/C++
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Neural-Collapse
[NeurIPS 2021] A Geometric Analysis of Neural Collapse with Unconstrained Features
opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
PrefixTuning
Prefix-Tuning: Optimizing Continuous Prompts for Generation
RL4LMs
A modular RL library to fine-tune language models to human preferences
RWKV-LM
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
trlx
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)