vivo-ai-lab / BlueLM

BlueLM (蓝心大模型): Open large language models developed by vivo AI Lab

Home Page: https://developers.vivo.com/product/ai/bluelm

The `torch_dtype` setting in the model's `config.json` appears to have no effect

diuzi opened this issue

Loading the chat model on a single A10 GPU runs out of VRAM. The code:

    from transformers import AutoModelForCausalLM

    # No torch_dtype given, so the weights are loaded as float32
    llm_model = AutoModelForCausalLM.from_pretrained(
        llm_path,
        device_map="auto",
        trust_remote_code=True,
        # use_flash_attention_2=True,
    ).eval()
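For scale, assuming the 7B chat model: 7B parameters × 4 bytes (fp32) ≈ 28 GB for the weights alone, which exceeds the A10's 24 GB; in bf16 the same weights are ≈ 14 GB and fit.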

`config.json` does set `torch_dtype: "bfloat16"`, yet the model appears to be loaded in fp32. Specifying the dtype explicitly loads it successfully:

    import torch
    from transformers import AutoModelForCausalLM

    # Passing torch_dtype explicitly casts the weights to bfloat16,
    # halving the memory footprint relative to float32
    llm_model = AutoModelForCausalLM.from_pretrained(
        llm_path,
        device_map="auto",
        trust_remote_code=True,
        # use_flash_attention_2=True,
        torch_dtype=torch.bfloat16,
    ).eval()

See the official documentation:
> torch_dtype (str, optional) — The dtype of the weights. This attribute can be used to initialize the model to a non-default dtype (which is normally float32) and thus allow for optimal storage allocation. For example, if the saved model is float16, ideally we want to load it back using the minimal amount of memory needed to load float16 weights. Since the config object is stored in plain text, this attribute contains just the floating type string without the `torch.` prefix. For example, for `torch.float16`, `torch_dtype` is the `"float16"` string.
> This attribute is currently not being used during model loading time, but this may change in the future versions. But we can already start preparing for the future by saving the dtype with `save_pretrained`.

https://huggingface.co/docs/transformers/v4.35.2/en/main_classes/configuration#transformers.PretrainedConfig
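As an aside, `from_pretrained` also accepts `torch_dtype="auto"`, which derives the dtype from the checkpoint rather than defaulting to float32 (supported in transformers 4.35.x, the version the quoted docs come from). A minimal sketch, reusing `llm_path` from the snippets above:

    from transformers import AutoModelForCausalLM

    # "auto" picks the dtype stored with the checkpoint (bfloat16 for
    # this model) instead of the float32 default
    llm_model = AutoModelForCausalLM.from_pretrained(
        llm_path,
        device_map="auto",
        trust_remote_code=True,
        torch_dtype="auto",
    ).eval()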