xenova / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!

Home Page: https://huggingface.co/docs/transformers.js

Reproducing model conversions

thekevinscott opened this issue · comments

Question

I'm trying to reproduce the conversion of phi-1_5_dev to better understand the process. I'm running into a few bugs and issues along the way that I thought it would be helpful to document.

The model card for @Xenova/phi-1_5_dev states:

https://huggingface.co/susnato/phi-1_5_dev with ONNX weights to be compatible with Transformers.js.

I'm doing the following:

git clone https://github.com/xenova/transformers.js.git && cd transformers.js/scripts
git clone https://huggingface.co/susnato/phi-1_5_dev
python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation"

Here I hit my first issue: it looks like the transformers release on PyPI does not support Phi:

    raise KeyError(key)
KeyError: 'phi'

So I install it from GitHub:

pip install git+https://github.com/huggingface/transformers.git

That produces:

RuntimeError: Failed to import optimum.exporters.onnx.__main__ because of the following error (look up to see its traceback):
cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/Users/thekevinscott/code/codegen/research/model-conversion/throwaway/transformers.js/scripts/.venv/lib/python3.10/site-packages/transformers/pytorch_utils.py)

I believe optimum is also out of date, so I install it from GitHub as well:

pip install git+https://github.com/huggingface/optimum.git

With those two dependencies updated, this command now works:

python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation"

Though there are a few warnings I'm assuming I can ignore:

Ignore MatMul due to non constant B: /[/model/layers.22/self_attn/MatMul]
Ignore MatMul due to non constant B: /[/model/layers.22/self_attn/MatMul_1]
Ignore MatMul due to non constant B: /[/model/layers.23/self_attn/MatMul]
Ignore MatMul due to non constant B: /[/model/layers.23/self_attn/MatMul_1]

However, out of the box Transformers.js can't find the right ONNX file:

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "transformers.js/scripts/models/phi-1_5_dev/onnx/decoder_model_merged_quantized.onnx".

I see in the @Xenova repo history that the files were manually renamed; I'll try that too:

mv model.onnx decoder_model_merged.onnx
mv model_quantized.onnx decoder_model_merged_quantized.onnx
mv model.onnx_data decoder_model_merged.onnx_data

I then try to run the model with:

const model = await loadModel('transformers.js/scripts/models/phi-1_5_dev', {});

const result = await model('Write me a list of numbers:\n', {});
console.log('result', result);

The model loads, but upon generating I see:

WARNING: Too many inputs were provided (51 > 3). The following inputs will be ignored: "past_key_values.0.key, past_key_values.0.value, past_key_values.1.key, past_key_values.1.value, past_key_values.2.key, past_key_values.2.value, past_key_values.3.key, past_key_values.3.value, past_key_values.4.key, past_key_values.4.value, past_key_values.5.key, past_key_values.5.value, past_key_values.6.key, past_key_values.6.value, past_key_values.7.key, past_key_values.7.value, past_key_values.8.key, past_key_values.8.value, past_key_values.9.key, past_key_values.9.value, past_key_values.10.key, past_key_values.10.value, past_key_values.11.key, past_key_values.11.value, past_key_values.12.key, past_key_values.12.value, past_key_values.13.key, past_key_values.13.value, past_key_values.14.key, past_key_values.14.value, past_key_values.15.key, past_key_values.15.value, past_key_values.16.key, past_key_values.16.value, past_key_values.17.key, past_key_values.17.value, past_key_values.18.key, past_key_values.18.value, past_key_values.19.key, past_key_values.19.value, past_key_values.20.key, past_key_values.20.value, past_key_values.21.key, past_key_values.21.value, past_key_values.22.key, past_key_values.22.value, past_key_values.23.key, past_key_values.23.value".
2024-04-15 11:00:50.956 node[91488:12372370] 2024-04-15 11:00:50.956090 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]
An error occurred during model execution: "Error: Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]".
Inputs given to model: [Object: null prototype] {
  input_ids: Tensor {
    dims: [ 1, 1 ],
    type: 'int64',
    data: BigInt64Array(1) [ 13n ],
    size: 1
  },
  attention_mask: Tensor {
    dims: [ 1, 9 ],
    type: 'int64',
    data: BigInt64Array(9) [
      1n, 1n, 1n, 1n, 1n,
      1n, 1n, 1n, 1n
    ],
    size: 9
  },
  position_ids: Tensor {
    dims: [ 1, 1 ],
    type: 'int64',
    data: BigInt64Array(1) [ 8n ],
    size: 1
  }
}
node_modules/.pnpm/onnxruntime-node@1.14.0/node_modules/onnxruntime-node/dist/backend.js:45
                    resolve(__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").run(feeds, fetches, options));
                                                                                                           ^

Error: Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]
    at node_modules/.pnpm/onnxruntime-node@1.14.0/node_modules/onnxruntime-node/dist/backend.js:45:108
    at process.processTicksAndRejections (node:internal/process/task_queues:77:11)

Node.js v20.12.1

❌ [dev] exited with exit code 1.
❌ 1 script failed.

I'm not entirely sure how to proceed from here. Any suggestions? It seems to be something specific to the .onnx file, since if I replace it with the .onnx file from the @Xenova repo everything works perfectly.
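
To isolate whether the problem is in the .onnx file itself rather than on the JS side, one option is to load the exported file from Python with optimum and try generating. This is a rough sketch only: the subfolder/file_name arguments and the paths are assumptions about my local layout.

# Cross-check the exported ONNX file from Python, independent of Transformers.js.
# Assumptions: the converted output lives under models/phi-1_5_dev, includes the
# tokenizer files, and subfolder/file_name point at the quantized merged decoder.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_dir = "transformers.js/scripts/models/phi-1_5_dev"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = ORTModelForCausalLM.from_pretrained(
    model_dir,
    subfolder="onnx",
    file_name="decoder_model_merged_quantized.onnx",
)

inputs = tokenizer("Write me a list of numbers:\n", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))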

It looks as though my converted model is missing inputNames.

The (working) model (@Xenova/phi-1_5_dev) has:

inputNames: [
      'input_ids',
      'attention_mask',
      'position_ids',
      'past_key_values.0.key',
      'past_key_values.0.value',
      'past_key_values.1.key',
      'past_key_values.1.value',
      'past_key_values.2.key',
      'past_key_values.2.value',
      'past_key_values.3.key',
      'past_key_values.3.value',
      'past_key_values.4.key',
      'past_key_values.4.value',
      'past_key_values.5.key',
      'past_key_values.5.value',
      'past_key_values.6.key',
      'past_key_values.6.value',
      'past_key_values.7.key',
      'past_key_values.7.value',
      'past_key_values.8.key',
      'past_key_values.8.value',
      'past_key_values.9.key',
      'past_key_values.9.value',
      'past_key_values.10.key',
      'past_key_values.10.value',
      'past_key_values.11.key',
      'past_key_values.11.value',
      'past_key_values.12.key',
      'past_key_values.12.value',
      'past_key_values.13.key',
      'past_key_values.13.value',
      'past_key_values.14.key',
      'past_key_values.14.value',
      'past_key_values.15.key',
      'past_key_values.15.value',
      'past_key_values.16.key',
      'past_key_values.16.value',
      'past_key_values.17.key',
      'past_key_values.17.value',
      'past_key_values.18.key',
      'past_key_values.18.value',
      'past_key_values.19.key',
      'past_key_values.19.value',
      'past_key_values.20.key',
      'past_key_values.20.value',
      'past_key_values.21.key',
      'past_key_values.21.value',
      'past_key_values.22.key',
      'past_key_values.22.value',
      'past_key_values.23.key',
      'past_key_values.23.value'
    ],

Whereas my conversion of susnato/phi-1_5_dev is missing the past_key_values fields:

inputNames: [ 'input_ids', 'attention_mask', 'position_ids' ],

Is there some step in the conversion I'm missing that includes these inputNames?
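
For anyone wanting to compare the two files directly, the graph inputs can also be listed from Python. A minimal sketch using the onnx package; the paths are my local ones:

# Print the graph inputs of each exported file to check whether the
# past_key_values.* inputs survived the export. Paths are assumptions.
import onnx

for path in [
    "models/phi-1_5_dev/onnx/decoder_model_merged_quantized.onnx",  # my export
    "phi-1_5_dev/onnx/decoder_model_merged_quantized.onnx",         # known-good Xenova file
]:
    # load_external_data=False avoids needing the large .onnx_data file
    model = onnx.load(path, load_external_data=False)
    names = [i.name for i in model.graph.input]
    print(f"{path}: {len(names)} inputs")
    print(names)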

@thekevinscott - I am not that experienced in this field, but just doing some playing around I am running into the same issue with almost all of Xenova's models.

Hopefully this might help you here.

@xenova - hoping you see this. Various models you have deployed are running into this issue. (I am grateful for your work, by the way!)

Hi there 👋 The correct task is text-generation-with-past (note the -with-past suffix). So the command would be:

python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation-with-past"
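
(For context: convert.py delegates the export to optimum's main_export, so the task string is passed straight through. A rough sketch of the equivalent direct call, with the output directory chosen arbitrarily:)

# Roughly the direct optimum call behind convert.py (sketch only;
# the output directory here is arbitrary).
from optimum.exporters.onnx import main_export

main_export(
    model_name_or_path="susnato/phi-1_5_dev",
    task="text-generation-with-past",  # note the -with-past suffix
    output="models/phi-1_5_dev",
)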

@MarketingPip can you provide a list of these models?

Thanks for the discussion and the responses. I've been trying to implement this updated command, but dependencies seem to have shifted since I last posted. I'm trying to move these commands into a Dockerfile but am now running into new errors.

I have to step away from this but will pick it up later today; maybe it's helpful to share my progress so far.

The main challenges I'm seeing are:

  • transformers listed in requirements.txt is out of date and has to be installed from GitHub
  • optimum listed in requirements.txt is out of date and has to be installed from GitHub

Here is the Dockerfile I have so far, with the error each fix addresses inlined as comments:
FROM python:3.9

RUN apt-get update \
  && apt-get install -y \
  less \
  vim \
  git \
  git-lfs \
  # enable h5py wheels
  libhdf5-dev

RUN git lfs install

WORKDIR /code
RUN git clone https://github.com/xenova/transformers.js.git
WORKDIR /code/transformers.js/scripts
RUN git clone https://huggingface.co/susnato/phi-1_5_dev
RUN python3 -m pip install -r requirements.txt


# /usr/local/lib/python3.9/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
#   torch.utils._pytree._register_pytree_node(
# Traceback (most recent call last):
#   File "/code/transformers.js/scripts/convert.py", line 545, in <module>
#     main()
#   File "/code/transformers.js/scripts/convert.py", line 340, in main
#     config = AutoConfig.from_pretrained(model_id, **from_pretrained_kwargs)
#   File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
#     config_class = CONFIG_MAPPING[config_dict["model_type"]]
#   File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 734, in __getitem__
#     raise KeyError(key)
# KeyError: 'phi'
RUN python3 -m pip install git+https://github.com/huggingface/transformers.git@df53c6e5d9245315c741ba6cce1e026d4ca104c5



# Traceback (most recent call last):
#   File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1530, in _get_module
#     return importlib.import_module("." + module_name, self.__name__)
#   File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
#     return _bootstrap._gcd_import(name[level:], package, level)
#   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
#   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
#   File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
#   File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
#   File "<frozen importlib._bootstrap_external>", line 850, in exec_module
#   File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
#   File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/__main__.py", line 32, in <module>
#     from .convert import export_models, validate_models_outputs
#   File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/convert.py", line 48, in <module>
#     from transformers.pytorch_utils import is_torch_less_than_1_11
# ImportError: cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/usr/local/lib/python3.9/site-packages/transformers/pytorch_utils.py)
# 
# The above exception was the direct cause of the following exception:
# 
# Traceback (most recent call last):
#   File "/code/transformers.js/scripts/convert.py", line 16, in <module>
#     from optimum.exporters.onnx import main_export, export_models
#   File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
#   File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1520, in __getattr__
#     module = self._get_module(self._class_to_module[name])
#   File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1532, in _get_module
#     raise RuntimeError(
# RuntimeError: Failed to import optimum.exporters.onnx.__main__ because of the following error (look up to see its traceback):
# cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/usr/local/lib/python3.9/site-packages/transformers/pytorch_utils.py)
RUN python3 -m pip install git+https://github.com/huggingface/optimum.git@b3ecb6c405b7fd5425d79483fd7dc88c0609be8e



RUN python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation-with-past"

This last step fails with:

Traceback (most recent call last):
  File "/code/transformers.js/scripts/convert.py", line 545, in <module>
    main()
  File "/code/transformers.js/scripts/convert.py", line 448, in main
    main_export(**export_kwargs)
  File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/__main__.py", line 280, in main_export
    model = TasksManager.get_model_from_task(
  File "/usr/local/lib/python3.9/site-packages/optimum/exporters/tasks.py", line 1951, in get_model_from_task
    model = model_class.from_pretrained(model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3644, in from_pretrained
    model, loading_info = load_tf2_checkpoint_in_pytorch_model(
  File "/usr/local/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 524, in load_tf2_checkpoint_in_pytorch_model
    tf_model_class = getattr(transformers, tf_model_class_name)
  File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1503, in __getattr__
    raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module transformers has no attribute TFPhiForCausalLM
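
The traceback shows transformers going down the TF2 checkpoint loading path (load_tf2_checkpoint_in_pytorch_model), and transformers has no TFPhiForCausalLM since Phi has no TensorFlow implementation. One quick check is to list which weight files the source repo actually ships; a small sketch using huggingface_hub:

# If only tf_model.h5 is present (no .bin/.safetensors), transformers will
# try, and fail, to load Phi through its nonexistent TF class.
from huggingface_hub import list_repo_files

files = list_repo_files("susnato/phi-1_5_dev")
print([f for f in files if f.endswith((".bin", ".safetensors", ".h5"))])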

I'll pick up the investigation thread later this week. Thanks for all the help and input so far!

@xenova - I have run into this using

TinyLlama-1.1B-Chat-v1.0
Qwen1.5-0.5B-Chat

and those are just a few.


@thekevinscott - I am assuming you are using a version of Python lower than 3.8? If so, may I advise upgrading, or using an upgraded environment, for running Transformers (Python)?

It seems you are running into model issues on top of general Transformers errors that should be solved by a Torch and Python upgrade (as far as I know).

Cheers

Edit: I see now that you are using 3.9. Try upgrading/purging both Torch and Transformers.

I've landed on a working implementation here:

https://github.com/thekevinscott/reproducing-phi-1-5-conversion

This appears to convert Phi 1.5 successfully from the original repository.

To summarize the issues I ran into along the way:

  • the transformers pinned in requirements.txt was too old to know about Phi (KeyError: 'phi') and had to be installed from GitHub
  • the optimum pinned in requirements.txt was likewise too old (the is_torch_less_than_1_11 import error) and had to be installed from GitHub
  • the task must be text-generation-with-past, not text-generation, or the export drops the past_key_values inputs
  • the exported files need renaming (model.onnx → decoder_model_merged.onnx, etc.) to match what Transformers.js looks for

It would be amazing if the Hugging Face model cards contained some of this information on the steps necessary to reproduce model conversions; that way, more people could help contribute new models for this awesome library.

Cheers to both of you for your help. I'll leave this issue open since it sounds like @MarketingPip also has some issues, but feel free to close since my original query is now solved.