# Open LLMs
These LLMs are all licensed for commercial use (e.g., Apache 2.0). Contributions and corrections welcome!
| Language Model | Checkpoints | Paper/Blog | Size | Licence |
| --- | --- | --- | --- | --- |
| T5 | T5 & Flan-T5, Flan-T5-xxl (HF) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 60M - 11B | Apache 2.0 |
| UL2 | UL2 & Flan-UL2, Flan-UL2 (HF) | UL2 20B: An Open Source Unified Language Learner | 20B | Apache 2.0 |
| Cerebras-GPT | Cerebras-GPT | Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models, Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster | 111M - 13B | Apache 2.0 |
| Pythia | pythia 70M - 12B | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 70M - 12B | Apache 2.0 |
| Dolly | dolly-v2-12b | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | 3B, 7B, 12B | MIT |
| RWKV | RWKV, ChatRWKV | The RWKV Language Model (and my LM tricks) | 100M - 14B | Apache 2.0 |
| GPT-J-6B | GPT-J-6B, GPT4All-J | GPT-J-6B: 6B JAX-Based Transformer | 6B | Apache 2.0 |
| StableLM | StableLM | Stability AI Launches the First of its StableLM Suite of Language Models | 3B - 65B | CC BY-SA-4.0 |
| Replit Code | replit-code-v1-3b | Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit | 2.7B | CC BY-SA-4.0 |
| StarCoder | starcoder | StarCoder: A State-of-the-Art LLM for Code, StarCoder: May the source be with you! | 15B | BigCode OpenRAIL-M v1 |
| MPT-7B | MPT-7B base, instruct, etc. | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | 7B | Apache 2.0 for base and storywriter |
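Since the Licence column is what determines commercial usability, the table lends itself to simple programmatic filtering. A minimal sketch (the model names, sizes, and licence strings are transcribed from the table above; the set of licences treated as "permissive" here is an assumption made for the example, not a legal judgment):

```python
# Illustrative sketch: filter the models in the table above by licence.
# The data is transcribed from the table; which licences count as
# "permissive" is an assumption for this example only.

MODELS = [
    {"name": "T5", "size": "60M - 11B", "licence": "Apache 2.0"},
    {"name": "UL2", "size": "20B", "licence": "Apache 2.0"},
    {"name": "Cerebras-GPT", "size": "111M - 13B", "licence": "Apache 2.0"},
    {"name": "Pythia", "size": "70M - 12B", "licence": "Apache 2.0"},
    {"name": "Dolly", "size": "3B, 7B, 12B", "licence": "MIT"},
    {"name": "RWKV", "size": "100M - 14B", "licence": "Apache 2.0"},
    {"name": "GPT-J-6B", "size": "6B", "licence": "Apache 2.0"},
    {"name": "StableLM", "size": "3B - 65B", "licence": "CC BY-SA-4.0"},
    {"name": "Replit Code", "size": "2.7B", "licence": "CC BY-SA-4.0"},
    {"name": "StarCoder", "size": "15B", "licence": "BigCode OpenRAIL-M v1"},
    {"name": "MPT-7B", "size": "7B", "licence": "Apache 2.0"},
]

# Assumption: treat only these licences as permissive for the example.
PERMISSIVE = {"Apache 2.0", "MIT"}

def permissive_models(models):
    """Return the names of models whose licence is in PERMISSIVE."""
    return [m["name"] for m in models if m["licence"] in PERMISSIVE]

print(permissive_models(MODELS))
```

This keeps the catalog greppable without any dependencies; swapping in a different `PERMISSIVE` set re-filters the list.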
Want to contribute? Just add a row to the table above with the following:

- Name of model
- Checkpoints:
- Paper/blog:
- Size:
- Licence:
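For example, a new entry following this template might look like the row below (all values are hypothetical placeholders):

```markdown
| MyModel | mymodel-7b (HF) | MyModel: An Example Announcement Post | 7B | Apache 2.0 |
```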
## Improvements

- Add context size?
- Add (links to) eval benchmarks?