"GEMM: Dimension mismatch" during value validation phase
josephrocca opened this issue
Issue Type
Others
OS
Linux
onnx2tf version number
1.19.7
onnx version number
1.15.0
onnxruntime version number
1.16.3
onnxsim (onnx_simplifier) version number
0.4.33
tensorflow version number
2.15.0.post1
Download URL for ONNX
Model URL: https://huggingface.co/rocca/scratch/resolve/main/_pi.original.onnx?download=true
Parameter Replacement JSON
N/A
Description
Problem:
There are some warnings/errors in the logs during the validation phase:
2024-01-19 07:04:54.721680127 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
WARNING: The accuracy error measurement process was skipped because the standard onnxruntime contains OPs that cannot be inferred.
WARNING: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
I'm not sure whether these are errors only in the validation code or in the model conversion itself. I'm going to investigate further and will post anything I learn to this thread.
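For context on what the error message is checking: ONNX Gemm with transB=1 (the typical export for a PyTorch Linear layer; this is my assumption about how this model was exported) computes Y = A @ W.T + b, so the input's inner dimension K must equal W's second dimension. A minimal numpy sketch of that constraint (my own illustration, not onnx2tf or onnxruntime code):

```python
import numpy as np

# Gemm with transB=1: Y = A @ W.T + b, so A's K must equal W's second dim.
W = np.zeros((384, 128), dtype=np.float32)  # '0.weight' shape from the log
b = np.zeros((384,), dtype=np.float32)      # '0.bias'
A = np.zeros((4, 128), dtype=np.float32)    # K = 128 matches W: OK
Y = A @ W.T + b
print(Y.shape)  # (4, 384)
```

The reported "W: {384,128} K: 1" means the node was fed an input whose inner dimension was 1 instead of the required 128.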
Command:
onnx2tf -i _pi.original.onnx -osd -cotof
Full Output:
Model optimizing started ============================================================
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant │ 10 │ 10 │
│ Gemm │ 3 │ 3 │
│ LayerNormalization │ 2 │ 2 │
│ Mul │ 2 │ 2 │
│ Softplus │ 2 │ 2 │
│ Tanh │ 2 │ 2 │
│ Model Size │ 893.2KiB │ 893.2KiB │
└────────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant │ 10 │ 10 │
│ Gemm │ 3 │ 3 │
│ LayerNormalization │ 2 │ 2 │
│ Mul │ 2 │ 2 │
│ Softplus │ 2 │ 2 │
│ Tanh │ 2 │ 2 │
│ Model Size │ 893.2KiB │ 893.2KiB │
└────────────────────┴────────────────┴──────────────────┘
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Constant │ 10 │ 10 │
│ Gemm │ 3 │ 3 │
│ LayerNormalization │ 2 │ 2 │
│ Mul │ 2 │ 2 │
│ Softplus │ 2 │ 2 │
│ Tanh │ 2 │ 2 │
│ Model Size │ 893.2KiB │ 893.2KiB │
└────────────────────┴────────────────┴──────────────────┘
Model optimizing complete!
Automatic generation of each OP name started ========================================
Automatic generation of each OP name complete!
Model loaded ========================================================================
Model conversion started ============================================================
INFO: input_op_name: input shape: ['n', 'm'] dtype: float32
2024-01-19 07:04:53.015331763 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
WARNING: The optimization process for shape estimation is skipped because it contains OPs that cannot be inferred by the standard onnxruntime.
WARNING: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
INFO: 2 / 12
INFO: onnx_op_type: Gemm onnx_op_name: wa/0/Gemm
INFO: input_name.1: input shape: ['n', 'm'] dtype: float32
INFO: input_name.2: 0.weight shape: [384, 128] dtype: float32
INFO: input_name.3: 0.bias shape: [384] dtype: float32
INFO: output_name.1: wa/0/Gemm_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: matmul
INFO: input.1.x: name: Placeholder:0 shape: (None, None) dtype: <dtype: 'float32'>
INFO: input.2.y: shape: (128, 384) dtype: <dtype: 'float32'>
INFO: input.3.z: shape: (384,) dtype: <dtype: 'float32'>
INFO: input.4.alpha: val: 1.0
INFO: input.5.beta: val: 1.0
INFO: output.1.output: name: tf.__operators__.add/AddV2:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 3 / 12
INFO: onnx_op_type: LayerNormalization onnx_op_name: wa/0/ln/LayerNormalization
INFO: input_name.1: wa/0/Gemm_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: 0.ln.weight shape: [384] dtype: float32
INFO: input_name.3: 0.ln.bias shape: [384] dtype: float32
INFO: output_name.1: wa/0/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: LayerNormalization
INFO: input.1.input: name: tf.__operators__.add/AddV2:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.scale: shape: (384,) dtype: float32
INFO: input.3.bias: shape: (384,) dtype: float32
INFO: input.4.axis: val: 1
INFO: input.5.epsilon: val: 9.999999747378752e-06
INFO: input.6.stash_type: val: True
INFO: output.1.output: name: layer_normalization/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 4 / 12
INFO: onnx_op_type: Softplus onnx_op_name: wa/0/act/Softplus
INFO: input_name.1: wa/0/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/0/act/Softplus_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: softplus
INFO: input.1.x: name: layer_normalization/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.softplus/Softplus:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 5 / 12
INFO: onnx_op_type: Tanh onnx_op_name: wa/0/act/Tanh
INFO: input_name.1: wa/0/act/Softplus_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/0/act/Tanh_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: tanh
INFO: input.1.x: name: tf.math.softplus/Softplus:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.tanh/Tanh:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 6 / 12
INFO: onnx_op_type: Mul onnx_op_name: wa/0/act/Mul
INFO: input_name.1: wa/0/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: wa/0/act/Tanh_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/0/act/Mul_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: multiply
INFO: input.1.x: name: layer_normalization/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.y: name: tf.math.tanh/Tanh:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.multiply_2/Mul:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 7 / 12
INFO: onnx_op_type: Gemm onnx_op_name: wa/1/Gemm
INFO: input_name.1: wa/0/act/Mul_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: 1.weight shape: [384, 384] dtype: float32
INFO: input_name.3: 1.bias shape: [384] dtype: float32
INFO: output_name.1: wa/1/Gemm_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: matmul
INFO: input.1.x: name: Placeholder:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.y: shape: (384, 384) dtype: <dtype: 'float32'>
INFO: input.3.z: shape: (384,) dtype: <dtype: 'float32'>
INFO: input.4.alpha: val: 1.0
INFO: input.5.beta: val: 1.0
INFO: output.1.output: name: tf.__operators__.add_1/AddV2:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 8 / 12
INFO: onnx_op_type: LayerNormalization onnx_op_name: wa/1/ln/LayerNormalization
INFO: input_name.1: wa/1/Gemm_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: 1.ln.weight shape: [384] dtype: float32
INFO: input_name.3: 1.ln.bias shape: [384] dtype: float32
INFO: output_name.1: wa/1/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: LayerNormalization
INFO: input.1.input: name: tf.__operators__.add_1/AddV2:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.scale: shape: (384,) dtype: float32
INFO: input.3.bias: shape: (384,) dtype: float32
INFO: input.4.axis: val: 1
INFO: input.5.epsilon: val: 9.999999747378752e-06
INFO: input.6.stash_type: val: True
INFO: output.1.output: name: layer_normalization_1/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 9 / 12
INFO: onnx_op_type: Softplus onnx_op_name: wa/1/act/Softplus
INFO: input_name.1: wa/1/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/1/act/Softplus_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: softplus
INFO: input.1.x: name: layer_normalization_1/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.softplus_1/Softplus:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 10 / 12
INFO: onnx_op_type: Tanh onnx_op_name: wa/1/act/Tanh
INFO: input_name.1: wa/1/act/Softplus_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/1/act/Tanh_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: tanh
INFO: input.1.x: name: tf.math.softplus_1/Softplus:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.tanh_1/Tanh:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 11 / 12
INFO: onnx_op_type: Mul onnx_op_name: wa/1/act/Mul
INFO: input_name.1: wa/1/ln/LayerNormalization_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: wa/1/act/Tanh_output_0 shape: ['n', 384] dtype: float32
INFO: output_name.1: wa/1/act/Mul_output_0 shape: ['n', 384] dtype: float32
INFO: tf_op_type: multiply
INFO: input.1.x: name: layer_normalization_1/batchnorm/add_1:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.y: name: tf.math.tanh_1/Tanh:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.multiply_5/Mul:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: 12 / 12
INFO: onnx_op_type: Gemm onnx_op_name: wa/2/Gemm
INFO: input_name.1: wa/1/act/Mul_output_0 shape: ['n', 384] dtype: float32
INFO: input_name.2: 2.weight shape: [76, 384] dtype: float32
INFO: input_name.3: 2.bias shape: [76] dtype: float32
INFO: output_name.1: output shape: ['n', 76] dtype: float32
INFO: tf_op_type: matmul
INFO: input.1.x: name: Placeholder:0 shape: (None, 384) dtype: <dtype: 'float32'>
INFO: input.2.y: shape: (384, 76) dtype: <dtype: 'float32'>
INFO: input.3.z: shape: (76,) dtype: <dtype: 'float32'>
INFO: input.4.alpha: val: 1.0
INFO: input.5.beta: val: 1.0
INFO: output.1.output: name: tf.__operators__.add_2/AddV2:0 shape: (None, 76) dtype: <dtype: 'float32'>
saved_model output started ==========================================================
saved_model output complete!
Summary on the non-converted ops:
---------------------------------
* Accepted dialects: tfl, builtin, func
* Non-Converted Ops: 16, Total Ops 58, % non-converted = 27.59 %
* 16 ARITH ops
- arith.constant: 16 occurrences (f32: 12, i32: 4)
(f32: 6)
(f32: 2)
(f32: 3)
(f32: 2)
(f32: 4)
(f32: 8)
(i32: 1)
(f32: 3)
(f32: 2)
(i32: 1)
(f32: 2)
(i32: 1)
(f32: 2)
(f32: 2)
Float32 tflite output complete!
Summary on the non-converted ops:
---------------------------------
* Accepted dialects: tfl, builtin, func
* Non-Converted Ops: 16, Total Ops 70, % non-converted = 22.86 %
* 16 ARITH ops
- arith.constant: 16 occurrences (f16: 12, i32: 4)
(f32: 6)
(f32: 12)
(f32: 2)
(f32: 3)
(f32: 2)
(f32: 4)
(f32: 8)
(i32: 1)
(f32: 3)
(f32: 2)
(i32: 1)
(f32: 2)
(i32: 1)
(f32: 2)
(f32: 2)
Float16 tflite output complete!
ONNX and TF output value validation started =========================================
INFO: validation_conditions: np.allclose(onnx_outputs, tf_outputs, rtol=0.0, atol=0.0001, equal_nan=True)
2024-01-19 07:04:54.721680127 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
WARNING: The accuracy error measurement process was skipped because the standard onnxruntime contains OPs that cannot be inferred.
WARNING: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gemm node. Name:'wa/0/Gemm' Status Message: GEMM: Dimension mismatch, W: {384,128} K: 1 N:384
This onnxruntime error is not something onnx2tf will work around. With a valid fixed input shape, inference succeeds:
sit4onnx -if _pi.original.onnx -oep cpu -fs 384 128
INFO: file: _pi.original.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: input shape: [384, 128] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 9.15980339050293 ms
INFO: avg elapsed time per pred: 0.915980339050293 ms
INFO: output_name.1: output shape: [384, 76] dtype: float32
onnx2tf -i _pi.original.onnx -cotof -ois input:384,128
Ah, apologies, I think I understand: the input shape ['n', 'm'] implies any n and any m, but that is not actually true (the model only supports specific values of n and m), and onnx2tf cannot (simply) know this, hence the error. So manually specifying a valid input shape to onnx2tf is required. Thank you!
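The failure mode above can be reproduced without the model: if a tool substitutes a placeholder value such as 1 for the symbolic dims 'n' and 'm' during test inference (my assumption about where K=1 came from), the resulting (1, 1) input cannot multiply against the (384, 128) weight of wa/0/Gemm. A numpy sketch:

```python
import numpy as np

# Hypothetical illustration: unknown dims 'n' and 'm' collapsed to 1,
# giving a (1, 1) dummy input whose K=1 cannot match the weight's K=128.
dummy = np.ones((1, 1), dtype=np.float32)
W = np.ones((384, 128), dtype=np.float32)  # weight of wa/0/Gemm
try:
    dummy @ W.T  # Gemm (transB=1) requires K == 128
except ValueError as err:
    print("dimension mismatch:", err)
```

Pinning the shape with -ois input:384,128 (or any n,m the model actually supports) avoids feeding such a dummy input.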
That is correct.
(Thank you so much for your open source work - you have created incredibly valuable tools.)