mosaicml / llm-foundry

LLM training code for Databricks foundation models

Home Page: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

Converting a composer seq2seq t5 model throws an exception

timsteuer opened this issue · comments

Environment

  • llm-foundry: latest

To reproduce

Steps to reproduce the behavior:

  1. Train a hf_t5 model.
  2. Download the Composer checkpoint.
  3. Try to convert it back to Hugging Face via scripts/inference/convert_composer_to_hf.py.
  4. The script crashes when trying to load the saved model as AutoModelForCausalLM (see the sketch below).
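
For context, a minimal sketch of the failing load (the checkpoint path is just a placeholder for the folder the conversion script writes):

```python
from transformers import AutoModelForCausalLM

# Placeholder path to the folder produced by convert_composer_to_hf.py
model = AutoModelForCausalLM.from_pretrained("converted-t5-checkpoint")
# Fails with something like:
# ValueError: Unrecognized configuration class <class '...T5Config'>
# for this kind of AutoModel: AutoModelForCausalLM.
```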

Expected behavior

The model is saved as a Hugging Face snapshot without any issues.

Additional context

Locally, I fixed this by simply loading with AutoModel instead of AutoModelForCausalLM.
I guess this is fine.
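
A minimal sketch of what that workaround boils down to (not the script's actual code; the path is a placeholder):

```python
from transformers import AutoModel

# Instead of AutoModelForCausalLM, which has no mapping for T5Config,
# load the converted folder with the generic AutoModel class.
model = AutoModel.from_pretrained("converted-t5-checkpoint")  # placeholder path
```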

Ah yes, that script only supports causal LMs right now. One note on your solution: I'm not certain, but AutoModel here may give you a T5Model rather than the T5ForConditionalGeneration you probably want. Probably worth double-checking that.
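
For example, a quick way to double-check (the path is just a placeholder):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("converted-t5-checkpoint")  # placeholder path
print(type(model).__name__)  # "T5Model" would mean no lm_head was instantiated
```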

That was an interesting hint.

Just double-checked, and the model was indeed a T5Model and not a T5ForConditionalGeneration.

So I changed that in the conversion script so that it writes the right config. However, loading the final model via AutoModel still results in a T5Model, even though the config now explicitly states the correct model type.
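
To illustrate the behavior (the path and the architectures value are placeholders reflecting such a local edit, not the script's output):

```python
import json
from transformers import AutoModel

# The converted config may now list the seq2seq architecture ...
with open("converted-t5-checkpoint/config.json") as f:
    print(json.load(f).get("architectures"))  # e.g. ["T5ForConditionalGeneration"]

# ... but AutoModel resolves the class from the config *type* (T5Config -> T5Model),
# not from the "architectures" field, so it still builds the bare backbone.
print(type(AutoModel.from_pretrained("converted-t5-checkpoint")).__name__)  # "T5Model"
```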

On the other hand, if I load via AutoModelForSeq2SeqLM, the lm_head is loaded as well. So I guess that is an HF-specific thing and not related to the conversion script per se.

Yeah, AutoModel generally gives you the backbone model, while AutoModelForXYZ gives you the model with the adaptation/head for XYZ.
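
A small illustration of that split with a public T5 checkpoint (using t5-small just as an example, not the converted model):

```python
from transformers import AutoModel, AutoModelForSeq2SeqLM

backbone = AutoModel.from_pretrained("t5-small")             # encoder/decoder backbone only
seq2seq = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # adds the LM head for generation

print(type(backbone).__name__, hasattr(backbone, "lm_head"))  # T5Model False
print(type(seq2seq).__name__, hasattr(seq2seq, "lm_head"))    # T5ForConditionalGeneration True
```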