jerryji1993 / DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Home Page: https://doi.org/10.1093/bioinformatics/btab083

Release pretraining data?

leannmlindsey opened this issue

I was wondering whether you have released the exact datasets that you used for pretraining DNABERT1 and DNABERT2.

I would be interested in doing some ablation studies using this dataset.

Thank you,
LeAnn