jerryji1993 / DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Home Page: https://doi.org/10.1093/bioinformatics/btab083


Changing max_seq_length does not update max_length in config.json

jackievaleri opened this issue

I was wondering if this behavior is intended. For instance, when I run run_finetune.py with the following command:

python run_finetune.py \
    --model_type dna \
    --tokenizer_name=dna$KMER \
    --model_name_or_path $MODEL_PATH \
    --task_name dnaprom \
    --do_train \
    --data_dir $DATA_PATH \
    --per_gpu_eval_batch_size=32 \
    --per_gpu_train_batch_size=32 \
    --learning_rate 2e-4 \
    --output_dir $OUTPUT_PATH \
    --logging_steps 100 \
    --save_steps 4000 \
    --warmup_percent 0.1 \
    --overwrite_output \
    --weight_decay 0.01 \
    --n_process 8 \
    --max_seq_length 59 \
    --hidden_dropout_prob 0.1 \
    --num_train_epochs 5.0

the config.json file still has "max_length": 20, even though --max_seq_length is set to 59. Should I be editing config.json manually before fine-tuning?
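If manual editing is the intended workflow, here is a minimal sketch of how I would patch the field before fine-tuning. It only uses the standard json module; the checkpoint directory name is a placeholder for my $MODEL_PATH, and whether "max_length" actually needs to match --max_seq_length is exactly what I am asking.

```python
import json
from pathlib import Path

# Placeholder for the local checkpoint directory ($MODEL_PATH).
model_path = Path("./dnabert6")
config_file = model_path / "config.json"

# Inspect the max_length value that run_finetune.py leaves untouched.
config = json.loads(config_file.read_text())
print("max_length in config.json:", config.get("max_length"))

# If it should match --max_seq_length, patch it in place before fine-tuning.
config["max_length"] = 59  # same value passed via --max_seq_length
config_file.write_text(json.dumps(config, indent=2))
```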

Thanks so much for your help!