onnx2tf conversion error for gpt2 transformers model
flores-o opened this issue
Issue Type
Others
OS
Linux
onnx2tf version number
1.19.11
onnx version number
1.15.0
onnxruntime version number
1.16.3
onnxsim (onnx_simplifier) version number
0.4.33
tensorflow version number
2.16.1
Download URL for ONNX
https://huggingface.co/openai-community/gpt2/resolve/main/onnx/decoder_model.onnx?download=true
Parameter Replacement JSON
{
"format_version": 1,
"operations": [
{
"op_name": "Shape",
"param_target": "inputs",
"param_name": "input_ids",
"values": [1, 128]
},
{
"op_name": "wa/transformer/Shape",
"param_target": "inputs",
"param_name": "input_ids",
"values": [1, 128]
}
]
}
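As a side note, a quick way to catch a malformed replacement file before invoking the converter is to parse it and check each operation entry for the fields onnx2tf expects. This is a minimal sketch based on the file above, not an official schema validation:

```python
import json

# The parameter replacement file from above, inlined for a quick sanity check.
raw = """
{
  "format_version": 1,
  "operations": [
    {"op_name": "Shape", "param_target": "inputs",
     "param_name": "input_ids", "values": [1, 128]},
    {"op_name": "wa/transformer/Shape", "param_target": "inputs",
     "param_name": "input_ids", "values": [1, 128]}
  ]
}
"""

params = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON

# Every operation entry should carry these keys.
required = {"op_name", "param_target", "param_name", "values"}
for op in params["operations"]:
    missing = required - op.keys()
    assert not missing, f"operation {op.get('op_name')!r} is missing {missing}"

print(f"{len(params['operations'])} replacement rules look structurally OK")
```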
Description
To reproduce the error, you can run the following notebook:
https://colab.research.google.com/drive/1pBnJpC4613cUoNl-k_rdHeGm3pnFCuds?usp=sharing
- Purpose: Research & Product Development. Thank you for creating and maintaining the repo! <3
- What: When running
!onnx2tf -i /content/model.onnx
I get the following errors (full error logs in the Colab):
INFO: 3 / 1625
INFO: onnx_op_type: Shape onnx_op_name: /transformer/Shape
INFO: input_name.1: input_ids shape: ['batch_size', 'sequence_length'] dtype: int64
INFO: output_name.1: /transformer/Shape_output_0 shape: [2] dtype: int64
ERROR: The trace log is below.
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces `keras.layers` and `keras.operations`). You are likely doing something like:
- How: I tried variations of the command for dealing with dynamic input sizes using the -ois, -kat, and -prf options.
- Why: To compare the performance of in-browser inference when running transformer models with ONNX Runtime Web, TensorFlow.js, WebLLM, etc.
- Resources: Closed issues in onnx2tf that reference the use of param_replacement.json. I'm a noob here, so any suggestions for updating the param_replacement.json file are appreciated.
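For reference, the command variations I tried look like the following (the flag spellings come from the onnx2tf README; the fixed shapes 1,128 are illustrative, matching the values in the replacement JSON above):

```shell
# Variant 1: pin the dynamic axes to fixed sizes so Shape ops resolve to constants
onnx2tf -i model.onnx -ois input_ids:1,128 attention_mask:1,128

# Variant 2: fix only the batch dimension and emit SignatureDefs
onnx2tf -i model.onnx -b 1 -osd

# Variant 3: apply the parameter replacement file from above
onnx2tf -i model.onnx -prf param_replacement.json
```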
![Screenshot 2024-03-28 at 1 24 45 PM](https://private-user-images.githubusercontent.com/35745326/317875054-2eb6df7e-ca2f-4f80-82e0-8504053876a6.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjIxOTAwNDksIm5iZiI6MTcyMjE4OTc0OSwicGF0aCI6Ii8zNTc0NTMyNi8zMTc4NzUwNTQtMmViNmRmN2UtY2EyZi00ZjgwLTgyZTAtODUwNDA1Mzg3NmE2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI4VDE4MDIyOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWU5MGJiNTg2Yjk5OTJkZmI4MTc2YzM0ZjBiY2ZhYmZjMTE3ZTcxYWZjMzQ5MmE5NzM3ZTBkMzJhMTcxMGViYjImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.KOW_7cZNppRMal9B3nfHTUPuN-4oNJbmgBtSxJmSZI0)
onnx2tf -i decoder_model.onnx -b 1 -osd
:
:
INFO: 1613 / 1613
INFO: onnx_op_type: MatMul onnx_op_name: /lm_head/MatMul
INFO: input_name.1: /transformer/Reshape_3_output_0 shape: None dtype: float32
INFO: input_name.2: onnx::MatMul_3718 shape: [768, 50257] dtype: float32
INFO: output_name.1: logits shape: ['batch_size', 'sequence_length', 50257] dtype: float32
INFO: tf_op_type: matmul
INFO: input.1.a: name: tf.reshape_367/Reshape:0 shape: (None, None, None) dtype: <dtype: 'float32'>
INFO: input.2.b: shape: (768, 50257) dtype: float32
INFO: input.3.output_type: name: float32 shape: ()
INFO: output.1.output: name: tf.linalg.matmul_72/MatMul:0 shape: (None, None, 50257) dtype: <dtype: 'float32'>
saved_model output started ==========================================================
saved_model output complete!
loc(fused["SelectV2:", callsite("model_39/tf.where/SelectV2"("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py":284:1) at callsite("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py":308:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/framework/func_graph.py":1059:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py":597:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py":41:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py":1162:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/autograph/core/function_wrappers.py":113:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py":1162:1 at callsite("/home/b920405/.local/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py":65:1 at "/home/b920405/.local/lib/python3.10/site-packages/keras/src/engine/training.py":589:1)))))))))]): error: 'tf.SelectV2' op operands don't have broadcast-compatible shapes
Traceback (most recent call last):
File "/home/b920405/.local/bin/onnx2tf", line 8, in <module>
sys.exit(main())
File "/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py", line 2326, in main
model = convert(
File "/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py", line 1250, in convert
tflite_model = converter.convert()
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/lite.py", line 2171, in convert
return super(TFLiteConverterV2, self).convert()
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/lite.py", line 1125, in wrapper
return self._convert_and_export_metrics(convert_func, *args, **kwargs)
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/lite.py", line 1079, in _convert_and_export_metrics
result = convert_func(self, *args, **kwargs)
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/lite.py", line 1778, in convert
return super(TFLiteFrozenGraphConverterV2, self).convert(
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/lite.py", line 1357, in convert
result = _convert_graphdef(
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/convert_phase.py", line 212, in wrapper
raise converter_error from None # Re-throws the exception.
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/convert.py", line 978, in convert_graphdef
data = convert(
File "/home/b920405/.local/lib/python3.10/site-packages/tensorflow/lite/python/convert.py", line 366, in convert
raise converter_error
tensorflow.lite.python.convert_phase.ConverterError: /home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py:284:1: error: 'tf.SelectV2' op operands don't have broadcast-compatible shapes
concrete_function = _create_concrete_function(
^
/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py:308:1: note: called from
traced_func_graph = func_graph_module.func_graph_from_py_func(
^
/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/framework/func_graph.py:1059:1: note: called from
func_outputs = python_func(*func_args, **func_kwargs)
^
/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py:597:1: note: called from
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
^
/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py:41:1: note: called from
return api.converted_call(
^
/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py:1162:1: note: called from
run_model = tf.function(lambda *inputs : model(inputs))
^
/home/b920405/.local/lib/python3.10/site-packages/tensorflow/python/autograph/core/function_wrappers.py:113:1: note: called from
return thunk(scope)
^
/home/b920405/.local/lib/python3.10/site-packages/onnx2tf/onnx2tf.py:1162:1: note: called from
run_model = tf.function(lambda *inputs : model(inputs))
^
/home/b920405/.local/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:65:1: note: called from
return fn(*args, **kwargs)
^
/home/b920405/.local/lib/python3.10/site-packages/keras/src/engine/training.py:589:1: note: called from
return super().__call__(*args, **kwargs)
^
saved_model_cli show --dir saved_model/ --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['__saved_model_init_op']:
The given SavedModel SignatureDef contains the following input(s):
The given SavedModel SignatureDef contains the following output(s):
outputs['__saved_model_init_op'] tensor_info:
dtype: DT_INVALID
shape: unknown_rank
name: NoOp
Method name is:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['attention_mask'] tensor_info:
dtype: DT_INT64
shape: (1, -1)
name: serving_default_attention_mask:0
inputs['input_ids'] tensor_info:
dtype: DT_INT64
shape: (1, -1)
name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 50257)
name: PartitionedCall:0
outputs['present.0.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:1
outputs['present.0.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:2
outputs['present.1.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:3
outputs['present.1.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:4
outputs['present.10.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:5
outputs['present.10.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:6
outputs['present.11.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:7
outputs['present.11.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:8
outputs['present.2.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:9
outputs['present.2.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:10
outputs['present.3.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:11
outputs['present.3.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:12
outputs['present.4.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:13
outputs['present.4.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:14
outputs['present.5.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:15
outputs['present.5.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:16
outputs['present.6.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:17
outputs['present.6.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:18
outputs['present.7.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:19
outputs['present.7.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:20
outputs['present.8.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:21
outputs['present.8.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:22
outputs['present.9.key'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:23
outputs['present.9.value'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, 64, 12)
name: PartitionedCall:24
Method name is: tensorflow/serving/predict
The MetaGraph with tag set ['serve'] contains the following ops: {'Fill', 'Range', 'RealDiv', 'Cast', 'Mul', 'TensorScatterUpdate', 'SplitV', 'Transpose', 'BatchMatMulV2', 'StaticRegexFullMatch', 'ConcatV2', 'Sub', 'ExpandDims', 'Pack', 'MergeV2Checkpoints', 'Squeeze', 'Identity', 'Less', 'SelectV2', 'Softmax', 'Const', 'RestoreV2', 'SaveV2', 'Mean', 'GatherV2', 'ShardedFilename', 'Shape', 'PartitionedCall', 'Placeholder', 'StatefulPartitionedCall', 'Sqrt', 'StringJoin', 'StridedSlice', 'Select', 'NoOp', 'AddV2', 'MatMul', 'Reshape', 'Tanh'}
Hi @PINTO0309, thank you for the prompt response!
When running `!onnx2tf -i model.onnx -b 1 -osd`,
I get the same error as before:
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces `keras.layers` and `keras.operations`). You are likely doing something like:
ERROR: input_onnx_file_path: model.onnx
ERROR: onnx_op_name: wa/transformer/Shape
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
Can you share your environment settings? Maybe you're using a different onnx2tf/library version.