Release pretraining data?
leannmlindsey opened this issue
I was wondering whether you have released the exact dataset that you used for pretraining DNABERT1 and DNABERT2.
I would be interested in doing some ablation studies using this dataset.
Thank you,
LeAnn