mosaicml / llm-foundry

LLM training code for Databricks foundation models

Home Page: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm



MPT models on the Hub not working with `transformers` main

younesbelkada opened this issue

Hi there!

Currently, loading MPT models from the Hub with `transformers` main fails because the remote code tries to import private helpers (such as `_expand_mask`) that were recently removed: huggingface/transformers#27086

The simple loading script below should work, but currently fails on `transformers` main:

from accelerate import init_empty_weights
from transformers import AutoModelForCausalLM, AutoConfig

model_id = "mosaicml/mpt-7b"

# Fetch the remote config; trust_remote_code is required for MPT's custom modeling code
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)

# Instantiate on the meta device (no weights allocated); this still imports
# the remote modeling code, which is where the ImportError is raised
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
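
Until the updated Hub code lands, a temporary workaround (assumption on my part: the private helpers were removed in the 4.35 attention-mask refactor linked above) is to pin an older `transformers` release:

pip install 'transformers<4.35'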

Thanks for letting us know, Younes; we'll look into this ASAP.

@younesbelkada this should be resolved in the foundry code now, and I'm uploading the updated code to the Hugging Face Hub as we speak.
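
For context, the usual fix for this class of breakage is to vendor the helper rather than import it from `transformers` internals. A minimal sketch of that approach, mirroring the removed helper's behavior (illustrative of the technique, not necessarily the exact foundry change):

from typing import Optional

import torch


def _expand_mask(mask: torch.Tensor, dtype: torch.dtype, tgt_len: Optional[int] = None) -> torch.Tensor:
    """Expand a [bsz, src_len] padding mask (1 = attend, 0 = masked) to
    [bsz, 1, tgt_len, src_len], with 0.0 at attended positions and the
    dtype's minimum value at masked ones."""
    bsz, src_len = mask.size()
    tgt_len = tgt_len if tgt_len is not None else src_len

    # Broadcast to the 4D attention-bias shape and flip: 1.0 now marks masked positions
    expanded_mask = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
    inverted_mask = 1.0 - expanded_mask
    return inverted_mask.masked_fill(inverted_mask.to(torch.bool), torch.finfo(dtype).min)

Keeping a local copy like this decouples the remote modeling code from private `transformers` internals, which carry no stability guarantee across releases.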

Ok, this should be resolved completely now. Let me know if you see otherwise! Thanks again for the report :)

Works like a charm now! Thanks for the quick fix @dakinggg