PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Error when onnx2tf: Floating point exception (core dumped)

MuffinTopSJY opened this issue

Issue Type

Others

OS

Linux

onnx2tf version number

1.19.5

onnx version number

1.15.0

onnxruntime version number

1.16.3

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.15.0

Download URL for ONNX

https://github.com/fabio-sim/LightGlue-ONNX/releases/download/v1.0.0/superpoint_lightglue_end2end_fused_cpu.onnx

Parameter Replacement JSON

-

Description

I converted the LightGlue model from PyTorch to ONNX, and inference succeeds in both frameworks.

import onnxruntime

# image0, image1: np.ndarray of shape (1, 1, 512, 512), values in [0, 1], dtype=float32
session = onnxruntime.InferenceSession("superpoint_lightglue_end2end_fused_cpu.onnx")
onnx_output = session.run(['kpts0', 'kpts1', 'matches0', 'mscores0'],
                          {'image0': image0, 'image1': image1})
print("[ONNX] Model Outputs:", [o.name for o in session.get_outputs()])
print("[ONNX] Model Predictions:", onnx_output)

The outputs of the ONNX model are listed below:

[ONNX] Model Outputs: ['kpts0', 'kpts1', 'matches0', 'mscores0']
[ONNX] Model Predictions: 
[array([[[214,   8],
        [326,   8],
        [424,   8],
        ...,
        [319, 501],
        [329, 503],
        [425, 503]]], dtype=int64), 
array([[[ 66,   8],
        [199,   8],
        [301,   8],
        ...,
        [474, 497],
        [500, 497],
        [278, 503]]], dtype=int64), 
array([[ 117,    5],
       [ 122,    8],
       [ 129,    2],
       ...,
       [1319, 1110],
       [1320, 1114],
       [1322, 1112]], dtype=int64), 
array([0.66974837, 0.9046258 , 0.96944857, 0.90069985, 0.94022113,
       0.9269928 , 0.9782594 , 0.18117692, 0.9418314 , 0.9777683 ,
       ... , 
       0.6197381 , 0.32578984, 0.90328395, 0.2957324 , 0.59147465,
       0.9046468 , 0.8206571 , 0.78984714, 0.8549151 , 0.75364023],
      dtype=float32)]

But when I try to convert the ONNX model to TFLite, this error occurs:

Floating point exception (core dumped).

I get no more information than this single line about the error.
'superpoint_lightglue_end2end_fused_cpu.onnx' is available online (see the download URL above). To rule out any influence from the ONNX version or similar, I converted the PyTorch model to ONNX again following your requirements, but I still got the same error.
I don't know why it occurs or how to deal with it, since no such error occurs when running the ONNX model itself.

Sincerely thank you for your time.

By the way, I used the code below to run the onnx2tf conversion.

import onnx2tf

onnx2tf.convert(
    input_onnx_file_path="superpoint_lightglue_end2end_fused_cpu.onnx",
    output_folder_path="model.tf",
    copy_onnx_input_output_names_to_tflite=True,
    non_verbose=True,  # suppresses the conversion log output
)
  1. The input shape should be fixed.
    e.g.

    onnx2tf \
    -i superpoint_lightglue_end2end_fused_cpu.onnx \
    -ois image0:1,1,240,320 image1:1,1,480,640
  2. Stop using NonZero and replace it with another OP; with NonZero in the graph, onnx2tf can't determine the correct channel location when transposing NCHW to NHWC. The number of elements in the inputs/outputs of the OPs that follow NonZero cannot be determined, which makes the conversion technically difficult. (A minimal sketch of this kind of rewrite is shown below.)

My particular implementation without NonZero.
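
A minimal sketch of the kind of rewrite meant in point 2, assuming a (B, H, W) keypoint score map; the function and its interface are illustrative, not the actual implementation linked above. Selecting a fixed top-k exports as TopK with a static output shape, instead of threshold + NonZero with a dynamic one.

import torch

def select_keypoints_static(scores: torch.Tensor, k: int = 300):
    # scores: (B, H, W) keypoint score map.
    # Thresholding + torch.nonzero() exports as NonZero with a dynamic output
    # shape; a fixed top-k exports as TopK, so every downstream shape is static.
    b, h, w = scores.shape
    flat = scores.reshape(b, -1)                        # (B, H*W)
    top_scores, top_idx = torch.topk(flat, k, dim=-1)   # (B, k), k known at export time
    xs = top_idx % w
    ys = torch.div(top_idx, w, rounding_mode="floor")
    kpts = torch.stack((xs, ys), dim=-1)                # (B, k, 2) as (x, y)
    return kpts, top_scores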

Many thanks! I'll try it again.

Hi, sorry to bother you again.

Following https://github.com/PINTO0309/LightGlue-ONNX, I converted LightGlue to an ONNX model with python export.py --img_size 512 512 --lightglue_path weights/sjy_fused_static.onnx --end2end, and the resulting ONNX model worked in my test.

When converting with onnx2tf -i sjy_fused_static.onnx or onnx2tf -i sjy_fused_static.onnx -ois image0:1,1,512,512 image1:1,1,512,512 (in superpoint.py I set top_num = 300), I got this error:

INFO: 1409 / 3391
INFO: onnx_op_type: Expand onnx_op_name: /lightglue/posenc/Expand
INFO:  input_name.1: /lightglue/posenc/Unsqueeze_3_output_0 shape: [2, 1, 1, 300, 32, 1] dtype: float32
INFO:  input_name.2: /lightglue/posenc/Where_output_0 shape: [6] dtype: int64
INFO:  output_name.1: /lightglue/posenc/Expand_output_0 shape: [2, 1, 1, 300, 32, 2] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 310, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 383, in inverted_operation_enable_disable_wrapper_func
    result = func(*args, **kwargs)
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 53, in get_replacement_parameter_wrapper_func
    func(*args, **kwargs)
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/onnx2tf/ops/Expand.py", line 118, in make_node
    expanded_tensor = input_tensor * ones
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/keras/src/layers/core/tf_op_layer.py", line 119, in handle
    return TFOpLambda(op)(*args, **kwargs)
  File "/home/feiluo/.conda/envs/onnx2tf/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.math.multiply_46" (type TFOpLambda).

Dimensions must be equal, but are 32 and 2 for '{{node tf.math.multiply_46/Mul}} = Mul[T=DT_FLOAT](Placeholder, tf.math.multiply_46/Mul/y)' with input shapes: [1,2,1,300,32,1], [1,1,1,1,2,1].

Call arguments received by layer "tf.math.multiply_46" (type TFOpLambda):
  • x=tf.Tensor(shape=(1, 2, 1, 300, 32, 1), dtype=float32)
  • y=tf.Tensor(shape=(1, 1, 1, 1, 2, 1), dtype=float32)
  • name=None
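
For what it's worth, the shape conflict in that ValueError can be reproduced with plain NumPy broadcasting (a minimal sketch using the shapes from the log above; the assumption that the extra leading 1 and moved axes come from onnx2tf's NCHW-to-NHWC handling is my reading of the log, not confirmed):

import numpy as np

# What the ONNX Expand does: broadcast the last axis from 1 to 2 -- this works.
onnx_in = np.zeros((2, 1, 1, 300, 32, 1), dtype=np.float32)
print((onnx_in * np.ones((2, 1, 1, 300, 32, 2), dtype=np.float32)).shape)  # (2, 1, 1, 300, 32, 2)

# What the converted graph ends up doing: the tensor has become
# (1, 2, 1, 300, 32, 1) while the target shape is applied as (1, 1, 1, 1, 2, 1),
# so broadcasting pairs 32 with 2 and fails.
tf_in = np.zeros((1, 2, 1, 300, 32, 1), dtype=np.float32)
try:
    tf_in * np.ones((1, 1, 1, 1, 2, 1), dtype=np.float32)
except ValueError as err:
    print(err)  # operands could not be broadcast together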

The structure of sjy_fused_static.onnx around onnx_op_name: /lightglue/posenc/Expand looks roughly like this:

[structure diagram of the graph around /lightglue/posenc/Expand]

Besides, I have doubts about the results below: the output shapes on the TF side don't match the ONNX outputs. I don't know whether it matters; for example:

INFO: 1401 / 3391
INFO: onnx_op_type: Concat onnx_op_name: /lightglue/posenc/Concat
INFO:  input_name.1: /lightglue/posenc/Unsqueeze_output_0 shape: [1, 1, 300, 32] dtype: float32
INFO:  input_name.2: /lightglue/posenc/Unsqueeze_1_output_0 shape: [1, 1, 300, 32] dtype: float32
INFO:  output_name.1: /lightglue/posenc/Concat_output_0 shape: [2, 1, 300, 32] dtype: float32
INFO: tf_op_type: concat
INFO:  input.1.input0: name: tf.reshape_36/Reshape:0 shape: (1, 1, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.input1: name: tf.reshape_37/Reshape:0 shape: (1, 1, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.3.axis: val: 0 
INFO:  output.1.output: name: tf.concat_29/concat:0 shape: (1, 2, 300, 32) dtype: <dtype: 'float32'> 

INFO: 1402 / 3391
INFO: onnx_op_type: Concat onnx_op_name: /lightglue/posenc_1/Concat
INFO:  input_name.1: /lightglue/posenc_1/Unsqueeze_output_0 shape: [1, 1, 300, 32] dtype: float32
INFO:  input_name.2: /lightglue/posenc_1/Unsqueeze_1_output_0 shape: [1, 1, 300, 32] dtype: float32
INFO:  output_name.1: /lightglue/posenc_1/Concat_output_0 shape: [2, 1, 300, 32] dtype: float32
INFO: tf_op_type: concat
INFO:  input.1.input0: name: tf.reshape_38/Reshape:0 shape: (1, 1, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.input1: name: tf.reshape_39/Reshape:0 shape: (1, 1, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.3.axis: val: 0 
INFO:  output.1.output: name: tf.concat_35/concat:0 shape: (1, 2, 300, 32) dtype: <dtype: 'float32'> 

INFO: 1405 / 3391
INFO: onnx_op_type: Unsqueeze onnx_op_name: /lightglue/posenc/Unsqueeze_3
INFO:  input_name.1: /lightglue/posenc/Concat_output_0 shape: [2, 1, 300, 32] dtype: float32
INFO:  input_name.2: 4856 shape: [2] dtype: int64
INFO:  output_name.1: /lightglue/posenc/Unsqueeze_3_output_0 shape: [2, 1, 1, 300, 32, 1] dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: tf.concat_29/concat:0 shape: (1, 2, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: val: [1, 2, 1, 300, 32, 1] 
INFO:  output.1.output: name: tf.reshape_40/Reshape:0 shape: (1, 2, 1, 300, 32, 1) dtype: <dtype: 'float32'> 

INFO: 1406 / 3391
INFO: onnx_op_type: Unsqueeze onnx_op_name: /lightglue/posenc_1/Unsqueeze_3
INFO:  input_name.1: /lightglue/posenc_1/Concat_output_0 shape: [2, 1, 300, 32] dtype: float32
INFO:  input_name.2: 4856 shape: (2,) dtype: int64
INFO:  output_name.1: /lightglue/posenc_1/Unsqueeze_3_output_0 shape: [2, 1, 1, 300, 32, 1] dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: tf.concat_35/concat:0 shape: (1, 2, 300, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: val: [1, 2, 1, 300, 32, 1] 
INFO:  output.1.output: name: tf.reshape_41/Reshape:0 shape: (1, 2, 1, 300, 32, 1) dtype: <dtype: 'float32'> 

I tried to figure it out myself but got stuck. T_T
Sincerely, thank you for your time again.

No solution to add, but confirmation that I am seeing the same issue with top_num = 256:

ValueError: Exception encountered when calling layer "tf.math.multiply_44" (type TFOpLambda).

Dimensions must be equal, but are 32 and 2 for '{{node tf.math.multiply_44/Mul}} = Mul[T=DT_FLOAT](Placeholder, tf.math.multiply_44/Mul/y)' with input shapes: [1,2,1,256,32,1], [1,1,1,1,2,1].

Call arguments received by layer "tf.math.multiply_44" (type TFOpLambda):
  • x=tf.Tensor(shape=(1, 2, 1, 256, 32, 1), dtype=float32)
  • y=tf.Tensor(shape=(1, 1, 1, 1, 2, 1), dtype=float32)
  • name=None

Really appreciate this thread. I've been trying different approaches to get LightGlue to TFLite for a while now, and this is as close as I've gotten.

Hi, sorry to bother you. Have you solved this problem?

If there is no activity within the next two days, this issue will be closed automatically.

Confirmation was delayed because I was spending time optimizing other models.

There are too many dimensions, and automatic conversion by onnx2tf has its limits. The following parameter-replacement procedure should be used to correct the dimensions by hand, but the structure is too complex for me to devote sufficient time to it.

https://github.com/PINTO0309/onnx2tf?tab=readme-ov-file#parameter-replacement
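
For reference, a hedged sketch of what a parameter replacement file for the posenc Concat axes above might look like, written as Python that emits the JSON. The op names are taken from the logs in this thread, but the param_target/values entries are illustrative guesses and not a verified fix; the README section linked above documents the actual format and options.

import json

# Illustrative only: pin the posenc Concat ops back to axis 0 so the
# downstream Unsqueeze/Expand see the ONNX-side layout. Not a verified fix.
replacement = {
    "format_version": 1,
    "operations": [
        {
            "op_name": "/lightglue/posenc/Concat",
            "param_target": "attributes",
            "param_name": "axis",
            "values": 0,
        },
        {
            "op_name": "/lightglue/posenc_1/Concat",
            "param_target": "attributes",
            "param_name": "axis",
            "values": 0,
        },
    ],
}

with open("replace_lightglue.json", "w") as f:
    json.dump(replacement, f, indent=2)

# Assumed CLI usage (see the README link above):
#   onnx2tf -i sjy_fused_static.onnx -prf replace_lightglue.json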

If there is no activity within the next two days, this issue will be closed automatically.

Duplicate of #569