Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
anpin opened this issue
Describe the issue
Unable to run the Phi-3 CUDA model. The CPU/mobile versions work as expected.
> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx
2024-05-11 10:27:29.056205252 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
sys.exit(main())
^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
sess = onnxrt.InferenceSession(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx
2024-05-11 10:27:46.458111088 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
sys.exit(main())
^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
sess = onnxrt.InferenceSession(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
To reproduce
huggingface-cli download microsoft/Phi-3-mini-128k-instruct-onnx --include "cuda/*" --local-dir Phi-3-mini-128k-instruct-onnx
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32
Urgency
No response
Platform
Linux
OS Version
NixOS 24.05.20240508.8892ecd (Uakari) x86_64
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.3
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
12.4
possibly related to #7573
I think I might be missing the onnxruntime-gpu package.
Since I'm coming from a dotnet background and the aforementioned package is not yet available for NixOS, I decided to explore running Phi-3 in a dotnet project with Microsoft.ML.OnnxRuntimeGenAI.Cuda, but it throws the same error:
> dotnet run
-------------
Hello, Phi-3!
-------------
Unhandled exception. Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: Load model from /home/a/projects/phi3/Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14
at Microsoft.ML.OnnxRuntimeGenAI.Model..ctor(String modelPath)
at Program.main(String[] argv) in /home/a/projects/phi3/Program.fs:line 20
Now I'm wondering if I opened the issue in the right repo or if it should be moved over to onnxruntime-genai.
On NixOS, onnxruntime with CUDA is packaged with both `onnxruntime_USE_CUDA` and `onnxruntime_DISABLE_CONTRIB_OPS`, which could be the reason. Trying to override that.
Indeed, removing `onnxruntime_DISABLE_CONTRIB_OPS` from the NixOS derivation resolved the issue.
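For anyone hitting this on NixOS, the override looks roughly like this (a sketch, assuming the nixpkgs `onnxruntime` derivation passes the flag via `cmakeFlags`; the exact attribute may differ between nixpkgs revisions):

```nix
# Overlay sketch: rebuild onnxruntime with contrib ops enabled by
# filtering onnxruntime_DISABLE_CONTRIB_OPS out of the cmake flags.
final: prev: {
  onnxruntime = prev.onnxruntime.overrideAttrs (old: {
    cmakeFlags = builtins.filter
      (flag: !(prev.lib.hasInfix "onnxruntime_DISABLE_CONTRIB_OPS" flag))
      (old.cmakeFlags or [ ]);
  });
}
```

Rebuilding against this overlay produces a runtime that registers the contrib ops, so the Phi-3 CUDA models load.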