sail-sg / lorahub

[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

train_model.py model loading fix

JornyWan opened this issue · comments

In the code of train_model.py:

model = AutoModelForSeq2SeqLM.from_pretrained(
    model_args.model_name_or_path,
    from_tf=bool(".ckpt" in model_args.model_name_or_path),
    config=config,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
    use_auth_token=True if model_args.use_auth_token else None,
)
If anyone cannot initialize a flan-t5 model from AutoModelForSeq2SeqLM with this, you need to add the following params:

    unk_token="<unk>",
    bos_token="<s>",
    eos_token="</s>",

Thanks!
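
For context, a minimal sketch of this workaround, assuming the standard T5 special tokens and "google/flan-t5-base" as a stand-in checkpoint. Note that in transformers these special-token kwargs are tokenizer parameters, so the sketch passes them to AutoTokenizer rather than to the model:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# "google/flan-t5-base" is an assumed example standing in for
# model_args.model_name_or_path; this is an untested sketch, not repo code.
model_name = "google/flan-t5-base"

# Special-token kwargs are tokenizer-level settings in transformers; the
# values below follow the standard T5 conventions.
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    unk_token="<unk>",
    bos_token="<s>",
    eos_token="</s>",
)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)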

@JornyWan Thanks for your feedback and pull request! May I know your transformers version? I have never encountered this problem myself, so I am not sure whether it still occurs with the latest transformers library.

@SivilTaram thanks for the quick response, my transformers version is 4.31.0
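
For anyone reproducing this, a quick way to confirm the installed version from Python:

import transformers

# prints the installed library version; 4.31.0 is the version reported above
print(transformers.__version__)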

actually it would be like:

[screenshot omitted]

@SivilTaram if it is a version problem, you could just put the fix on a separate branch for users of that transformers version
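
An alternative to a separate branch would be gating the extra kwargs on the installed transformers version. A rough sketch, assuming (unverified) that the extra tokens are only needed from 4.31.0 onward and reusing the hypothetical checkpoint name from above:

import transformers
from packaging import version
from transformers import AutoTokenizer

model_name = "google/flan-t5-base"  # assumed example checkpoint

# Pass explicit special tokens only on versions where default flan-t5
# loading is reported to fail; 4.31.0 comes from this thread and is not
# a verified boundary.
extra_kwargs = {}
if version.parse(transformers.__version__) >= version.parse("4.31.0"):
    extra_kwargs = dict(unk_token="<unk>", bos_token="<s>", eos_token="</s>")

tokenizer = AutoTokenizer.from_pretrained(model_name, **extra_kwargs)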