Converting a composer seq2seq t5 model throws an exception
timsteuer opened this issue · comments
Environment
- llm-foundry: latest
To reproduce
Steps to reproduce the behavior:
- train an `hf_t5` model
- download the Composer checkpoint
- try to convert it back to Hugging Face via `scripts/inference/convert_composer_to_hf.py`
- the script crashes when trying to load the saved model as `AutoModelForCausalLM`
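The crash can be reproduced without any checkpoint: T5 is an encoder-decoder architecture and has no registered causal-LM class, so the `AutoModelForCausalLM` machinery rejects a `T5Config` outright. A minimal sketch (the tiny hyperparameters below are arbitrary, and `from_config` is used here instead of loading a real checkpoint):

```python
from transformers import AutoModelForCausalLM, T5Config

# Arbitrary tiny hyperparameters; any T5Config takes the same code path.
config = T5Config(d_model=32, d_ff=64, num_layers=2, num_heads=2, vocab_size=100)

err = None
try:
    # T5Config is not in the causal-LM mapping, so this raises ValueError.
    AutoModelForCausalLM.from_config(config)
except ValueError as exc:
    err = exc

print(err)
```

The conversion script hits the equivalent failure when it calls `from_pretrained` on the exported snapshot.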
Expected behavior
The model is saved as a Hugging Face snapshot without any issue.
Additional context
Locally, I fixed this by simply loading with `AutoModel` instead of `AutoModelForCausalLM`. I guess this is fine.
Ah yes, that script only supports causal LMs right now. A note on your solution: I'm not certain, but `AutoModel` here may give you a `T5Model` rather than the `T5ForConditionalGeneration` you may want. Probably worth double-checking that.
That was an interesting hint. I just double-checked, and the model was indeed marked as a `T5Model` and not as a `T5ForConditionalGeneration`. So I changed that in the conversion script so that it yields the right config. However, loading the final model via `AutoModel` still results in a `T5Model`, even though the config now explicitly states the correct model type. On the other hand, if I load via `AutoModelForSeq2SeqLM`, the `lm_head` is loaded. So I guess that is an HF-specific thing and not related to the conversion script per se.
Yeah, `AutoModel` generally gives you the backbone model, while the `AutoModelForXYZ` classes will give you the model with the adaptation/head for XYZ.
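This backbone-vs-head distinction can be seen directly by instantiating both auto classes from the same config (a tiny, arbitrary `T5Config` here, so no checkpoint download is needed):

```python
from transformers import AutoModel, AutoModelForSeq2SeqLM, T5Config

# Tiny, arbitrary hyperparameters so instantiation is fast.
config = T5Config(d_model=32, d_ff=64, num_layers=2, num_heads=2, vocab_size=100)

backbone = AutoModel.from_config(config)            # backbone only
seq2seq = AutoModelForSeq2SeqLM.from_config(config)  # backbone + lm_head

print(type(backbone).__name__)           # T5Model
print(type(seq2seq).__name__)            # T5ForConditionalGeneration
print(hasattr(backbone, "lm_head"))      # False
print(hasattr(seq2seq, "lm_head"))       # True
```

The same resolution happens with `from_pretrained`: the auto class, not the `model_type` in the config alone, decides whether the `lm_head` is attached, which matches the behavior observed above.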