jerryji1993 / DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Home Page: https://doi.org/10.1093/bioinformatics/btab083


Customizing training data for fine-tuning the model

sumin5784 opened this issue · comments

Hello,

I'm trying to fine-tune DNABERT-5/6 on my own dataset.
I have a couple of questions about this.

  1. Do all input sequences need to be the same length, or can they vary in length?
  2. Do the labels have to be 0/1? In other words, can the classification task have more than two classes?

Any feedback would be appreciated.
Thank you!
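
For anyone preparing similar data, here is a minimal sketch of how a fine-tuning file could be laid out, assuming the tab-separated `sequence<TAB>label` format used by the sample data in this repo, where each sequence is written as space-separated overlapping k-mers. The file name, header row, example sequences, and labels below are made up for illustration, not taken from this thread:

```python
def seq2kmer(seq, k):
    """Split a raw DNA sequence into overlapping k-mers joined by spaces."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

# Hypothetical examples: the raw sequences have different lengths, and the
# labels are integer class ids (0, 1, 2, ...). Whether more than two classes
# work depends on the classifier head being configured with a matching
# number of labels.
examples = [
    ("ATCGTACGATCGATCG", 0),
    ("GGCATCGATT", 1),
    ("TTAGCCGTAGCTAGCTA", 2),
]

k = 6  # 6 for DNABERT-6; use 5 for DNABERT-5
with open("train.tsv", "w") as f:
    f.write("sequence\tlabel\n")  # header row, mirroring the repo's sample data
    for seq, label in examples:
        f.write(f"{seq2kmer(seq, k)}\t{label}\n")
```

The `seq2kmer` helper here is a small re-implementation for illustration; the produced k-mer strings are what the tokenizer consumes during fine-tuning.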

Excuse me, did you ever figure out the answers to these two questions?