microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Home Page: https://onnxruntime.ai

Error No Op registered for SimplifiedLayerNormalization with domain_version of 14

anpin opened this issue

Describe the issue

I'm not able to run the CUDA versions of Phi-3; the CPU/mobile versions work as expected. Both the int4 and fp16 CUDA models fail to load with the same error:

> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx
2024-05-11 10:27:29.056205252 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
  File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
    exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
    sess = onnxrt.InferenceSession(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14

> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx
2024-05-11 10:27:46.458111088 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
  File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
    exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
    sess = onnxrt.InferenceSession(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
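Both loads fail on the same node, and the op name is the key detail: SimplifiedLayerNormalization is not part of the standard ONNX opset. ONNX Runtime supplies it as a contrib kernel registered under the default ("") domain, which is why a build compiled without contrib ops rejects the graph as invalid. As a diagnostic (a sketch, not from the issue), the ops that fall outside the standard ONNX schema registry can be listed with the onnx package:

```python
# List the (domain, op_type) pairs in the failing graph that have no schema in
# the standard ONNX registry; these can only be served by ORT contrib kernels.
import onnx
import onnx.defs

model = onnx.load(
    "Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/"
    "phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx",
    load_external_data=False,  # graph structure only; the weights are not needed
)

for domain, op_type in sorted({(n.domain, n.op_type) for n in model.graph.node}):
    try:
        onnx.defs.get_schema(op_type, domain=domain)  # raises if unknown to ONNX
    except Exception:
        print(f"not a standard ONNX op: domain={domain!r}, op_type={op_type!r}")
```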

To reproduce

huggingface-cli download microsoft/Phi-3-mini-128k-instruct-onnx --include "cuda/*" --local-dir Phi-3-mini-128k-instruct-onnx
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32
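The phi3-qa.py script isn't actually required to hit the error: it occurs while the inference session is being constructed, which is also what onnxruntime_test does internally. A minimal repro sketch against either CUDA model:

```python
# Creating the session is enough to trigger INVALID_GRAPH; no inputs are needed.
import onnxruntime as ort

sess = ort.InferenceSession(
    "Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/"
    "phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```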

Urgency

No response

Platform

Linux

OS Version

NixOS 24.05.20240508.8892ecd (Uakari) x86_64

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.17.3

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

12.4

Possibly related to #7573.

I think I might be missing the onnxruntime-gpu package.
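One way to check that hypothesis is to ask the installed build what it actually ships (these are public onnxruntime APIs):

```python
# A CPU-only onnxruntime package reports only CPUExecutionProvider here;
# a CUDA-enabled build should list CUDAExecutionProvider and report "GPU".
import onnxruntime as ort

print(ort.__version__)
print(ort.get_available_providers())
print(ort.get_device())
```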

Since I'm coming from a dotnet background and the aforementioned package is not yet available on NixOS, I decided to explore running Phi-3 in a dotnet project with Microsoft.ML.OnnxRuntimeGenAI.Cuda, but it throws the same error:

> dotnet run
-------------
Hello, Phi-3!
-------------
Unhandled exception. Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: Load model from /home/a/projects/phi3/Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
   at Microsoft.ML.OnnxRuntimeGenAI.Model..ctor(String modelPath)
   at Program.main(String[] argv) in /home/a/projects/phi3/Program.fs:line 20

Now I'm wondering whether I opened the issue in the right repo, or whether it should be moved over to onnxruntime-genai.

On NixOS, onnxruntime with CUDA is packaged with both onnxruntime_USE_CUDA and onnxruntime_DISABLE_CONTRIB_OPS enabled, and the latter could be the reason: SimplifiedLayerNormalization is one of the contrib ops that flag strips from the build. Trying to override that.

Indeed, removing onnxruntime_DISABLE_CONTRIB_OPS from the NixOS derivation resolved the issue.
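For reference, a sketch of what that override can look like as a nixpkgs overlay. The attribute names and existing flags are assumptions about the shape of the current onnxruntime derivation, not verbatim from nixpkgs:

```nix
# Hypothetical overlay, assuming the onnxruntime derivation accepts extra cmakeFlags.
# On the CMake command line a later -D definition of the same variable wins, so
# appending =OFF re-enables the contrib kernels (SimplifiedLayerNormalization
# among them) even if the derivation already passes =ON.
final: prev: {
  onnxruntime = prev.onnxruntime.overrideAttrs (old: {
    cmakeFlags = (old.cmakeFlags or [ ]) ++ [ "-Donnxruntime_DISABLE_CONTRIB_OPS=OFF" ];
  });
}
```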