1 | 2019/01 | BioBERT | BERT | MLM, NSP | 110M | GitHub |
2 | 2019/02 | BERT-MIMIC | BERT | MLM, NSP | 110M, 340M | N/A |
3 | 2019/04 | BioELMo | ELMo | Bi-LM | 93.6M | GitHub |
4 | 2019/04 | Clinical BERT (Emily) | BERT | MLM, NSP | 110M | GitHub |
5 | 2019/04 | ClinicalBERT (Kexin) | BERT | MLM, NSP | 110M | GitHub |
6 | 2019/06 | BlueBERT | BERT | MLM, NSP | 110M, 340M | GitHub |
7 | 2019/06 | G-BERT | GNN + BERT | Self-Prediction, Dual-Prediction | 3M | GitHub |
8 | 2019/07 | BEHRT | BERT | MLM, NSP | N/A | GitHub |
9 | 2019/08 | BioFLAIR | FLAIR | Bi-LM | N/A | GitHub |
10 | 2019/09 | EhrBERT | BERT | MLM, NSP | 110M | GitHub |
11 | 2019/12 | Clinical XLNet | XLNet | Generalized Autoregressive Pretraining | 110M | GitHub |
12 | 2020/04 | GreenBioBERT | BERT | CBOW Word2Vec, Word Vector Space Alignment | 110M | GitHub |
13 | 2020/05 | BERT-XML | BERT | MLM, NSP | N/A | N/A |
14 | 2020/05 | Bio-ELECTRA | ELECTRA | Replaced Token Prediction | 14M | GitHub |
15 | 2020/05 | Med-BERT | BERT | MLM, Prolonged LOS Prediction | 110M | GitHub |
16 | 2020/05 | ouBioBERT | BERT | MLM, NSP | 110M | GitHub |
17 | 2020/07 | PubMedBERT | BERT | MLM, NSP, Whole-Word Masking | 110M | HuggingFace |
18 | 2020/08 | MCBERT | BERT | MLM, NSP | 110M, 340M | GitHub |
19 | 2020/09 | BioALBERT | ALBERT | MLM, SOP | 12M, 18M | GitHub |
20 | 2020/09 | BRLTM | BERT | MLM | N/A | GitHub |
21 | 2020/10 | BioMegatron | Megatron | MLM, NSP | 345M, 800M, 1.2B | GitHub |
22 | 2020/10 | CharacterBERT | BERT + Character-CNN | MLM, NSP | 105M | GitHub |
23 | 2020/10 | ClinicalTransformer | BERT - ALBERT - RoBERTa - ELECTRA | MLM, NSP - MLM, SOP - MLM - Replaced Token Prediction | 110M - 12M - 125M - 110M | GitHub |
24 | 2020/10 | SapBERT | BERT | Multi-Similarity Loss | 110M | GitHub |
25 | 2020/10 | UmlsBERT | BERT | MLM | 110M | GitHub |
26 | 2020/11 | bert-for-radiology | BERT | MLM, NSP | 110M | GitHub |
27 | 2020/11 | Bio-LM | RoBERTa | MLM | 125M, 355M | GitHub |
28 | 2020/11 | CODER | PubMedBERT - mBERT | Contrastive Learning | 110M - 110M | GitHub |
29 | 2020/11 | exBERT | BERT | MLM, NSP | N/A | GitHub |
30 | 2020/12 | BioMedBERT | BERT | MLM, NSP | 340M | GitHub |
31 | 2020/12 | LBERT | BERT | MLM, NSP | 110M | GitHub |
32 | 2021/04 | CovidBERT | BioBERT | MLM, NSP | 110M | N/A |
33 | 2021/04 | ELECTRAMed | ELECTRA | Replaced Token Prediction | N/A | GitHub |
34 | 2021/04 | KeBioLM | PubMedBERT | MLM, Entity Detection, Entity Linking | 110M | GitHub |
35 | 2021/04 | SINA-BERT | BERT | MLM | 110M | N/A |
36 | 2021/05 | ProteinBERT | BERT | Corrupted Token, Annotation Prediction | 16M | GitHub |
37 | 2021/05 | SciFive | T5 | Span Corruption Prediction | 220M, 770M | GitHub |
38 | 2021/06 | BioELECTRA | ELECTRA | Replaced Token Prediction | 110M | GitHub |
39 | 2021/06 | EntityBERT | BERT | Entity-centric MLM | 110M | N/A |
40 | 2021/07 | MedGPT | GPT-2 + GLU + RotaryEmbed | LM | N/A | N/A |
41 | 2021/08 | SMedBERT | SMedBERT | Masked Neighbor Modeling, Masked Mention Modeling, SOP, MLM | N/A | GitHub |
42 | 2021/09 | Bio-cli | RoBERTa | MLM, Subword Masking or Whole Word Masking | 125M | GitHub |
43 | 2021/11 | UTH-BERT | BERT | MLM, NSP | 110M | GitHub |
44 | 2021/12 | ChestXRayBERT | BERT | MLM, NSP | 110M | N/A |
45 | 2021/12 | MedRoBERTa.nl | RoBERTa | MLM | 123M | GitHub |
46 | 2021/12 | PubMedELECTRA | ELECTRA | Replaced Token Prediction | 110M, 335M | HuggingFace |
47 | 2022/01 | Clinical-BigBird | BigBird | MLM | 166M | GitHub |
48 | 2022/01 | Clinical-Longformer | Longformer | MLM | 149M | GitHub |
49 | 2022/03 | BioLinkBERT | BERT | MLM, Document Relation Prediction | 110M, 340M | GitHub |
50 | 2022/04 | BioBART | BART | Text Infilling, Sentence Permutation | 140M, 400M | GitHub |
51 | 2022/05 | bsc-bio-ehr-es | RoBERTa | MLM | 125M | GitHub |
52 | 2022/05 | PathologyBERT | BERT | MLM, NSP | 110M | HuggingFace |
53 | 2022/06 | RadBERT | RoBERTa | MLM | 110M | GitHub |
54 | 2022/06 | ViHealthBERT | BERT | MLM, NSP, Capitalized Prediction | 110M | GitHub |
55 | 2022/07 | Clinical Flair | Flair | Character-level Bi-LM | N/A | GitHub |
56 | 2022/08 | KM-BERT | BERT | MLM, NSP | 99M | GitHub |
57 | 2022/09 | BioGPT | GPT | Autoregressive Language Model | 347M, 1.5B | GitHub |
58 | 2022/10 | Bioberturk | BERT | MLM, NSP | N/A | GitHub |
59 | 2022/10 | DRAGON | GreaseLM | MLM, KG Link Prediction | 360M | GitHub |
60 | 2022/10 | UCSF-BERT | BERT | MLM, NSP | 135M | N/A |
61 | 2022/10 | ViPubmedT5 | ViT5 | Spans-masking learning | 220M | GitHub |
62 | 2022/12 | ALIBERT | BERT | MLM | 110M | N/A |
63 | 2022/12 | BioMedLM | GPT-2 | Autoregressive Language Model | 2.7B | GitHub |
64 | 2022/12 | BioReader | T5 & RETRO | MLM | 229.5M | GitHub |
65 | 2022/12 | clinicalT5 | T5 | Span-mask Denoising Objective | 220M, 770M | N/A |
66 | 2022/12 | GatorTron | BERT | MLM | 8.9B | GitHub |
67 | 2022/12 | Med-PaLM | Flan-PaLM | Instruction Prompt Tuning | 540B | Official Site |
68 | 2023/01 | clinical-T5 | T5 | Fill-in-the-blank-style denoising objective | 220M, 770M | PhysioNet |
69 | 2023/01 | CPT-BigBird | BigBird | MLM | 166M | N/A |
70 | 2023/01 | CPT-Longformer | Longformer | MLM | 149M | N/A |
71 | 2023/02 | Bioformer | Bioformer | MLM, NSP | 43M | GitHub |
72 | 2023/02 | Lightweight | DistilBERT | MLM, Knowledge Distillation | 65M, 25M, 18M, 15M | GitHub |
73 | 2023/03 | RAMM | PubMedBERT | MLM, Contrastive Learning, Image-Text Matching | N/A | GitHub |
74 | 2023/04 | DrBERT | RoBERTa | MLM | 110M | GitHub |
75 | 2023/04 | MOTOR | BLIP | MLM, Contrastive Learning, Image-Text Matching | N/A | GitHub |
76 | 2023/05 | BiomedGPT | BART backbone + BERT-encoder + GPT-decoder | MLM | 33M, 93M, 182M | GitHub |
77 | 2023/05 | TurkRadBERT | BERT | MLM, NSP | 110M | N/A |
78 | 2023/06 | CamemBERT-bio | BERT | Whole Word MLM | 111M | HuggingFace |
79 | 2023/06 | ClinicalGPT | T5 | Supervised Fine Tuning, Rank-based Training | N/A | N/A |
80 | 2023/06 | EriBERTa | RoBERTa | MLM | 125M | N/A |
81 | 2023/06 | PharmBERT | BERT | MLM | 110M | GitHub |
82 | 2023/07 | BioNART | BERT | Non-AutoRegressive Model | 110M | GitHub |
83 | 2023/07 | BIOptimus | BERT | MLM | 110M | GitHub |
84 | 2023/07 | KEBLM | BERT | MLM, Contrastive Learning, Ranking Objective | N/A | N/A |
85 | 2023/09 | CPLLM | Llama2 | Autoregressive Language Model, Supervised Fine Tuning | 13B | GitHub |
86 | 2023/11 | MedCPT | BERT | Contrastive Learning, Ranking Objective | 110M | GitHub |