| Date | Keywords | Institute | Paper | Publication |
| :--- | :--- | :--- | :--- | :--- |
| 2017-06 | Transformers | Google | Attention Is All You Need | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/1d4a603eb014a7a8efdda9daf4c35e065b33c304db7822222227ee64fc68a15c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246323034653330373338373066616533643035626362633266366138653236336439623732653737362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2018-06 | GPT 1.0 | OpenAI | Improving Language Understanding by Generative Pre-Training | ![Dynamic JSON Badge](https://camo.githubusercontent.com/df6540728e7f375751740dcb6f07b57233cd5e74e185cebcc40079b5e23f66db/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246636431383830306130666530623636386131636331396632656339356235303033643061353033352533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2018-10 | BERT | Google | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | NAACL ![Dynamic JSON Badge](https://camo.githubusercontent.com/bb84f4eee897eda1e756f14f6582b4208874f3db3cdbb0c77bf1c95480582a46/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246646632623065323664303539396365336537306466386139646130326535313539346530653939322533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2019-02 | GPT 2.0 | OpenAI | Language Models are Unsupervised Multitask Learners | ![Dynamic JSON Badge](https://camo.githubusercontent.com/0f631351065cd5910e3fbe214d9ca613e9ff2c95da5aa262e08a67ebdda15e5b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246393430356363306436313639393838333731623237353565353733636332383635306431346466652533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2019-09 | Megatron-LM | NVIDIA | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | ![Dynamic JSON Badge](https://camo.githubusercontent.com/b83bf8a4af59907dd4dd808c0ec37da27f5ba5b1819cb3c7491302af61a90812/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246383332336335393165313139656230396232386232396664366337626337366264383839646637612533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2019-10 | T5 | Google | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | JMLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/7a26d6a253916464d47285b53732a1ba3c63e5b22385b0ae3d8336b6591eb10c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246336366623331393638396630366266303463326532383339393336316634313463613332633462332533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2019-10 | ZeRO | Microsoft | ZeRO: Memory Optimizations Toward Training Trillion Parameter Models | SC ![Dynamic JSON Badge](https://camo.githubusercontent.com/3b9a4abb72260ae6fe0485ba8bbf7fe4b3b2e1e6904a45db2284b17e91bec80d/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246303063393537373131623132343638636233383432346361636364663532393162623335343033332533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2020-01 | Scaling Law | OpenAI | Scaling Laws for Neural Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/c891996e88d24e50c5af8610755ed78c723f699b841fe445365b3a59cac605c7/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246653663353631643032353030623235393661323330623334316138656238623932316361356266322533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2020-05 | GPT 3.0 | OpenAI | Language Models are Few-Shot Learners | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/b079e3b003811102bf50b89769d8cd56012dc0fb0f56a6e3dd8aa08d6260354a/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246366238356236333537396139313666373035613865313061343962643864383439643931623166632533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-01 | Switch Transformers | Google | Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | JMLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/083085ce1f3f0c65e86b41ec8680ffd4701913487b04fb01ebe181af7164dbde/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246666461636632613733326635356265666463343130656139323730393163616433623739316631332533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-08 | Codex | OpenAI | Evaluating Large Language Models Trained on Code | ![Dynamic JSON Badge](https://camo.githubusercontent.com/6af9872f097eb504e763e6051b27bbe07559104eb4294dc05079b4b3f6a480d9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246616362646266343966396263336631353162393364396361396130363030396634663665623236392533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-08 | Foundation Models | Stanford | On the Opportunities and Risks of Foundation Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/9bce96fa4d3e65b0bf34ec7b3446f3833cca84acb475eb10e04fe6ca539e5b46/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246346636386530376336633331373334383030353366643532333931383531643666383064363531622533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-09 | FLAN | Google | Finetuned Language Models are Zero-Shot Learners | ICLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/4efa57ef0410f8e18480c615f542090933902c2f9a2b7902599d99ebed4eba61/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246666630623236383164376230356531366334366466623731643938306363326636303539303763642533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-10 | T0 | HuggingFace et al. | Multitask Prompted Training Enables Zero-Shot Task Generalization | ICLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/24267f83c4e7e05d32b2a7bec72d17214538b91dbac34731922d635a5c320b36/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246313764643335353566643163636631313431636639383433343766613162336664366230303963612533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-12 | GLaM | Google | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | ICML ![Dynamic JSON Badge](https://camo.githubusercontent.com/458088e4a5d685853663334851af53281df1366ba8f0c8e20135bf788c89e14c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246383064303131366437376265656465643063323363663438393436643964313064346661656531342533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-12 | WebGPT | OpenAI | WebGPT: Browser-assisted question-answering with human feedback | ![Dynamic JSON Badge](https://camo.githubusercontent.com/6a631fc24b1d0a3283d9f1fab1641a7dfc897cb485ece0de13a76bcd3297eced/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246326633656665343430383361663931636566353632633161333435316565653266383630316432322533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-12 | Retro | DeepMind | Improving language models by retrieving from trillions of tokens | ICML ![Dynamic JSON Badge](https://camo.githubusercontent.com/7a3f0a4d0ced3c184ab1e2c76370845ebec231b5caa6864173e3cb2ddba42f17/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246303032633235366433306436626534623233643336356138646538616530653637653463393634312533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2021-12 | Gopher | DeepMind | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | ![Dynamic JSON Badge](https://camo.githubusercontent.com/dd47283ca9474da837700c5d1788b82741bb9c6179865e2b2d53a341c747292c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246363866313431373234383134383339643535366139383936343631393462653838363431623134332533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-01 | CoT | Google | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/4d3680a0c31eb420689e54ed1315370abda74ed05b4e9447f48a67c505e1e5f1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246316236653831306365306166643064643039336637383964326232373432643034376533313664352533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-01 | LaMDA | Google | LaMDA: Language Models for Dialog Applications | ![Dynamic JSON Badge](https://camo.githubusercontent.com/8dc58a723405f4eb06547d8d7ff19a8e5186eb10af966a74e7229080e46314ef/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246623338343864333266373239346563373038363237383937383333633430393765623464383737382533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-01 | Minerva | Google | Solving Quantitative Reasoning Problems with Language Models | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/b43e2146a7b7e27738a6af74ab1b5abe421a52199b5e4690660aee924c00547c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246616230653364336534643432333639646535393333613362346332333737383062343163306437372533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-01 | Megatron-Turing NLG | Microsoft & NVIDIA | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | ![Dynamic JSON Badge](https://camo.githubusercontent.com/dfd67f9c157efd04e0d177eeffdeaddfb9963be41cbfaec55d129f9122e49e66/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246376362633261373834333431316131373638616237363239333037303761663061336333336131392533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-03 | InstructGPT | OpenAI | Training language models to follow instructions with human feedback | ![Dynamic JSON Badge](https://camo.githubusercontent.com/fb9391d8a0bee20b0bd357b9d452df784c5b08ec658ac78262d7498fe9787247/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246643736366266666333353731323765306463383664643639353631643561656235323064366634632533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-04 | PaLM | Google | PaLM: Scaling Language Modeling with Pathways | ![Dynamic JSON Badge](https://camo.githubusercontent.com/b1d3a9946086f53eeeed8313b76ce725dac9d9d42ddb851fc513fe6189fcbad2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246303934666639373164366138623866663837303934366339623363653561613137333631376266622533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-04 | Chinchilla | DeepMind | An empirical analysis of compute-optimal large language model training | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/d6d5d5a0da992272d7a60b66190f620cb329a1a91d1f932e3b841eeb8e10ed1e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246626230363536303331636231376164663662616335666430666538643533646439633239313530382533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-05 | OPT | Meta | OPT: Open Pre-trained Transformer Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/ba9aeab8e31bbb7b3b7b119fb462cf068a09e1b7a122e59fa6e7083b152bba67/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246313361306438626233386637333939393063386364363561343430363163363533346631373232312533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-05 | UL2 | Google | Unifying Language Learning Paradigms | ICLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/c78ecbc479a5d7cc95f36020e3ba999cbe5c89f7fe738b50950eeae8ef59844d/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246663430616561653365353232616461316636613966333236383431623031656635633836353762362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-06 | Emergent Abilities | Google | Emergent Abilities of Large Language Models | TMLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/8ea08731bf40e11e08df273acfe0554d53409584a46fdea3cf1726eb59c2e39a/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246646163336131373262353034663465333363303239363535653962656662333338366535663633612533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-06 | BIG-bench | Google | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/103cfbc5322597c0c4d1657317708d1a04ff14dc5b4f853e9fe14e03d9819049/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246333435303363306236613631353132346561663832636230653461316461623238363665383938302533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-06 | METALM | Microsoft | Language Models are General-Purpose Interfaces | ![Dynamic JSON Badge](https://camo.githubusercontent.com/b13eb57814707d9ca33f3a126ac528183e876ef255d7e022a5721d99270f65e7/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246613866643963313632353031313734316637343430316666396264633163353834653235633836642533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-09 | Sparrow | DeepMind | Improving alignment of dialogue agents via targeted human judgements | ![Dynamic JSON Badge](https://camo.githubusercontent.com/0dcedbcf0cc1cb99e7abddc7e7a61d7f762ab28c5652bfefd23f848b8076f0e9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246373465616531323632306264316331333933653236386264646362366631323961353032353136362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-10 | Flan-T5/PaLM | Google | Scaling Instruction-Finetuned Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/4eaa3e7fead7e30408fd79e74a33043bb1ae29eaf17a9814f3062d62ed7828d7/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246353438346432323862666335306566626163366538363637376263326563326565346564653161362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-10 | GLM-130B | Tsinghua | GLM-130B: An Open Bilingual Pre-trained Model | ICLR ![Dynamic JSON Badge](https://camo.githubusercontent.com/039cb8a9bbd62eefad45e661b75c34c69ca0f895cc9be38013b1267842b15de4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246316432366339343734303631373331343561343636356464376162323535653033343934656132382533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-11 | HELM | Stanford | Holistic Evaluation of Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/513a5f276b4b6970bbd801d3cc7107a146ee9216fb7b393adbe65bbe044a688f/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246353033326330393436656539366666313161323932373632663233653633373761366366323733312533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-11 | BLOOM | BigScience | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | ![Dynamic JSON Badge](https://camo.githubusercontent.com/47865a32361d4b7ec66579ea392b3847c0bf838368f0e5b3d22190d4493f32ca/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246393634626433396235343666306636363235666633623965663130383366373937383037656632652533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-11 | Galactica | Meta | Galactica: A Large Language Model for Science | ![Dynamic JSON Badge](https://camo.githubusercontent.com/dfa51e218947d29e5a6d75aadece68b8aa7fe41cde8b5bb22a95ce9e0e7900cb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246376436343561336664323736393138333734666439343833666436373563323865343635303664312533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2022-12 | OPT-IML | Meta | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | ![Dynamic JSON Badge](https://camo.githubusercontent.com/c0bee120491ca754604eb2195603741c883769f1c0f2bfbc3d65ec34a27623d1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246653936356539336537366139653663346534383633643134356235633030376235343064353735642533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-01 | Flan 2022 Collection | Google | The Flan Collection: Designing Data and Methods for Effective Instruction Tuning | ICML ![Dynamic JSON Badge](https://camo.githubusercontent.com/71d1863a1bc6995add2070ae49116c83b3a9044dd21f4a0298b06462620c39b0/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246663262303031376464643737666133383736306131383134356536333535333130356131613233362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-02 | LLaMA | Meta | LLaMA: Open and Efficient Foundation Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/b6c1158de062e514500a6e100b4be97ec078b0f476d4a9726243cb3483b6a6c1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246353765383439643064653133656435663931643038363933363239363732316434666637356137352533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-02 | Kosmos-1 | Microsoft | Language Is Not All You Need: Aligning Perception with Language Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/35a33ba57734a3331191fb082c8d658fc496b078b620a3e22f725245a38520c1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246666266656634373233643863383436376437626435323365316430623730336363653065306639632533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-03 | PaLM-E | Google | PaLM-E: An Embodied Multimodal Language Model | ICML ![Dynamic JSON Badge](https://camo.githubusercontent.com/40bd65bfab0b050d319e0f222b4a607f3da95c49035c380b5b778bd80822b721/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246333866653866333234643231363265363361393637613961633636343839373466633463363666332533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-03 | GPT 4 | OpenAI | GPT-4 Technical Report | ![Dynamic JSON Badge](https://camo.githubusercontent.com/2efa4aadd71318cdbb4d705f38c0b88fbbacba991f960326e6df071ae596044a/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246386361363266646634633237366561333035326463393664636664386565393663613432356134382533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-04 | Pythia | EleutherAI et al. | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | ICML ![Dynamic JSON Badge](https://camo.githubusercontent.com/7db85ac9586ad7cd2a213f842f0e64a46b90a9076aa86ba247d563367fc8e0bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246626535356538656334323133383638646230386632633331363861653636363030316265613462382533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-05 | Dromedary | CMU et al. | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/8282d7d4402830eb9c13024f8108f8b4ec99c5760dc0312b5272412be6e219d5/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246653031353135633631333862633532356637616563333066633835663261646630323864343135362533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-05 | PaLM 2 | Google | PaLM 2 Technical Report | ![Dynamic JSON Badge](https://camo.githubusercontent.com/c5b2a37ec01687375fdacd1561d58a7448a38a6aad38d7b74d6490b70b53ca19/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246656363656533353036393137303839373233373062376131326332613738616433626464643135392533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-05 | RWKV | Bo Peng | RWKV: Reinventing RNNs for the Transformer Era | EMNLP ![Dynamic JSON Badge](https://camo.githubusercontent.com/45454dc7626d784029a7ffe010cfb5b8808414cb298ab3addafdca3062085748/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246303236623333393661363365643537373233323937303862373538306436333362623836626563392533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-05 | DPO | Stanford | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/8b3aacb13d48d9ed052570a959cb38f2a78fc9a1fa9525de9e15f85ba7e6cd0f/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246306431633736643435616661303132646564376162373431313934626166313432313137633439352533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-05 | ToT | Google & Princeton | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | NeurIPS ![Dynamic JSON Badge](https://camo.githubusercontent.com/62de436aee8498dd29cdff92615e6f65d11b69abfc7c90fe95be4817edc537a9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246326633383232656233383062356537353361366435373966333164666333656334633461303832302533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-07 | LLaMA 2 | Meta | Llama 2: Open Foundation and Fine-Tuned Chat Models | ![Dynamic JSON Badge](https://camo.githubusercontent.com/d1f880cca72a8038a43fd02430e34b5c2798f50434346076145df1a0880c7649/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246313034623062623164613536326435336362646138376165633739656636613238323764313931612533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-10 | Mistral 7B | Mistral | Mistral 7B | ![Dynamic JSON Badge](https://camo.githubusercontent.com/8888d503456c488bf68093e42808421713f76b49ad79de27f3e580a2fd054120/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246646236333363366231633238366330333836663030373864386132653632323465303361363232372533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |
| 2023-12 | Mamba | CMU & Princeton | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | ![Dynamic JSON Badge](https://camo.githubusercontent.com/cdf1294bc1c1be5fc95f334fca18a7840a1d35365dfb9b374b3acceb8d68cd76/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d68747470732533412532462532466170692e73656d616e7469637363686f6c61722e6f7267253246677261706825324676312532467061706572253246343332626566386533343031346437323663363734626334353830303861633839353239376235312533466669656c64732533446369746174696f6e436f756e742671756572793d2532342e6369746174696f6e436f756e74266c6162656c3d6369746174696f6e) |