microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch, ONNX and CoreML.

MXNet convertToIR doesn't parse 'attrs' layer parameters

belgraviton opened this issue · comments

Platform: Ubuntu 16.04

Python version: 3.6

Source framework with version: MXNet 1.0.0 with GPU

Destination framework with version: TensorFlow 1.4.1 with GPU

Pre-trained model path: LResNet50E-IR

The model JSON file contains layers in the following form:
{ "op": "Convolution", "name": "conv0", "attrs": { "kernel": "(3, 3)", "no_bias": "True", "num_filter": "64", "pad": "(1, 1)", "stride": "(1, 1)", "workspace": "256" }, "inputs": [[3, 0, 0], [4, 0, 0]] }

Running scripts:
python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d model-descr512 --inputShape 3 112 112

Error:

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 140, in <module>
    _main()
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 135, in _main
    ret = _convert(args)
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 72, in _convert
    parser.run(args.dstPath)
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/parser.py", line 22, in run
    self.gen_IR()
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 266, in gen_IR
    func(current_node)
  File "/opt/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 457, in rename_Convolution
    assert "attr" in source_node.layer or "param" in source_node.layer
AssertionError

I changed the functions rename_FullyConnected, rename_LeakyReLU and rename_Convolution in mxnet_parser.py as follows.
Change the assertion to
assert "attr" in source_node.layer or "attrs" in source_node.layer or "param" in source_node.layer
and add a new condition
elif "attrs" in source_node.layer: layer_attr = source_node.layer["attrs"]

This solves the issue. The same changes should be applied to all MXNet rename functions.
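
The fallback described above can be sketched as a small standalone helper. This is only an illustration under assumptions: `get_layer_attr` is a hypothetical name (it is not MMdnn's own `_get_layer_attr`), and the layer is assumed to be the plain dict decoded from the symbol JSON. The keys "attr" and "param" are the ones the original assertion already accepted; "attrs" is the key used by the model in this issue.

```python
def get_layer_attr(layer):
    """Return the attribute dict of a layer decoded from an MXNet symbol JSON.

    Hypothetical helper for illustration only: it looks for the attributes
    under "attr", "attrs" or "param" and falls back to an empty dict.
    """
    for key in ("attr", "attrs", "param"):
        if key in layer:
            return layer[key]
    return {}


# Layer in the same form as the one shown in this issue (shortened).
layer = {
    "op": "Convolution",
    "name": "conv0",
    "attrs": {"kernel": "(3, 3)", "num_filter": "64", "no_bias": "True"},
}
attrs = get_layer_attr(layer)
print(attrs["num_filter"])
```

Returning an empty dict instead of asserting is a design choice of this sketch; a parser may prefer to fail loudly when a layer that needs parameters has none.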

Hi @belgraviton, cool! I fixed it and checked it in. Could you check whether it works with the newest code now? Thanks!

I now use the already implemented '_get_layer_attr' function in all 'rename' functions.
Now it's OK.

Appreciate what you have done!

I'm trying to use @belgraviton's PR to convert the same model, LResNet50E-IR, but I am getting the following error:

Warning: MXNet Parser has not supported operator null with name data.
Warning: convert the null operator with name [data] into input layer.
Warning: MXNet Parser has not supported operator _minus_scalar with name _minusscalar0.
Warning: MXNet Parser has not supported operator _mul_scalar with name _mulscalar0.
Traceback (most recent call last):  
  File "/home/daniel/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)  
  File "/home/daniel/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
      exec(code, run_globals)
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 140, in <module>
      _main()  
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 135, in _main
      ret = _convert(args)  
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 72, in _convert
      parser.run(args.dstPath)  
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/parser.py", line 22, in run
      self.gen_IR()  
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 273, in gen_IR
      func(current_node)  
  File "/home/daniel/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 478, in rename_Convolution
    in_channel = self.IR_layer_map[IR_node.input[0]].attr["_output_shapes"].list.shape[0].dim[-1].size
KeyError: '_mulscalar0

Hi @d4nst, there are two operators not supported by the current framework:

Warning: MXNet Parser has not supported operator _minus_scalar with name _minusscalar0.
Warning: MXNet Parser has not supported operator _mul_scalar with name _mulscalar0.

Could you provide the related files for us to implement and test it? Thanks.

I used the same model posted by @belgraviton: LResNet50E-IR

_minusscalar0 and _mulscalar0 can be replaced with _copy in the JSON file. There are some other problems during the 'LResNet50E-IR' MXNet to IR conversion. One of them is solved by PR #96.
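
A minimal sketch of that JSON edit, under assumptions: `replace_scalar_ops` is a hypothetical helper, the symbol file is assumed to have a top-level "nodes" list whose entries look like the layer shown at the top of this issue, and the scalar values in the demo are made-up placeholders. Note that a copy node no longer subtracts or scales anything, so the same preprocessing would have to be applied to the input by hand.

```python
import json

# Operators reported as unsupported in the warnings above.
SCALAR_OPS = {"_minus_scalar", "_mul_scalar"}


def replace_scalar_ops(symbol):
    """Rewrite the unsupported scalar ops to "_copy" in a decoded symbol dict.

    Returns the names of the nodes that were changed.
    """
    replaced = []
    for node in symbol.get("nodes", []):
        if node.get("op") in SCALAR_OPS:
            node["op"] = "_copy"
            replaced.append(node["name"])
    return replaced


# In-memory stand-in for model-symbol.json (scalar values are placeholders).
symbol = {
    "nodes": [
        {"op": "null", "name": "data", "inputs": []},
        {"op": "_minus_scalar", "name": "_minusscalar0",
         "attrs": {"scalar": "1.0"}, "inputs": [[0, 0, 0]]},
        {"op": "_mul_scalar", "name": "_mulscalar0",
         "attrs": {"scalar": "1.0"}, "inputs": [[1, 0, 0]]},
    ]
}
replaced = replace_scalar_ops(symbol)
print(replaced)
print(json.dumps(symbol["nodes"][1]))
```

For a real file one would `json.load` the symbol file, call the helper, and `json.dump` the result to a new file rather than overwrite the original.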

Hi @belgraviton @d4nst, I am working on the op -> IR conversion for these two operators. Will ping you when it is done. Thanks.

@belgraviton @d4nst Fixed. Please check the newest code and the recent commit for details. Note that I didn't check the correctness.
Thanks.

It now runs without any errors. Thanks!

There are no errors. OK. Thank you.

@d4nst @belgraviton Were you guys able to convert the IR representation to any other framework? I tried converting it (for the same model) to CoreML, but that gives vastly different outputs. When trying to convert the IR to Tensorflow, I get the following errors:

python -m mmdnn.conversion.examples.tensorflow.imagenet_test -n ir.py -w ir.npy --dump tf_ir.ckpt
/Library/Python/2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "build/bdist.macosx-10.13-intel/egg/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 71, in <module>
  File "build/bdist.macosx-10.13-intel/egg/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 20, in __init__
  File "ir.py", line 255, in KitModel
    pre_fc1         = tf.layers.dense(bn1, 512, kernel_initializer = tf.constant_initializer(__weights_dict['pre_fc1']['weights']), bias_initializer = tf.constant_initializer(__weights_dict['pre_fc1']['bias']), use_bias = True)
  File "/Library/Python/2.7/site-packages/tensorflow/python/layers/core.py", line 215, in dense
    return layer.apply(inputs)
  File "/Library/Python/2.7/site-packages/tensorflow/python/layers/base.py", line 492, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/tensorflow/python/layers/base.py", line 434, in __call__
    self.build(input_shapes[0])
  File "/Library/Python/2.7/site-packages/tensorflow/python/layers/core.py", line 118, in build
    trainable=True)
  File "/Library/Python/2.7/site-packages/tensorflow/python/layers/base.py", line 374, in add_variable
    trainable=trainable and self.trainable)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 367, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
    use_resource=use_resource)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 725, in _get_single_variable
    validate_shape=validate_shape)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variables.py", line 200, in __init__
    expected_shape=expected_shape)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variables.py", line 278, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 701, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/init_ops.py", line 203, in __call__
    verify_shape=verify_shape)
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 428, in make_tensor_proto
    (shape_size, nparray.size))
ValueError: Too many elements provided. Needed at most 262144, but received 12845056

Hi, I also use the LResNet50E-IR MXNet model and successfully converted it to IR. I then converted the IR to a TensorFlow code snippet with the following script:
python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath model-descr512.pb --IRWeightPath model-descr512.npy --dstModelPath tf_resnet50.py

The log shows:

Parse file [model-descr512.pb] with binary format successfully.
TensorflowEmitter has not supported operator [_copy].
id
Target network code snippet is saved as [tf_resnet50.py].

But when I converted the IR to a TensorFlow model, I got the error below:

(python3) andy@andy:~/DISK-DATA/datasets/insightface/resnet-r50-am-lfw/model-r50-am-lfw$ python -m mmdnn.conversion.examples.tensorflow.imagenet_test -n tf_resnet50.py -w model-descr512.npy --dump tf_resnet152.ckpt
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/andy/python3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 71, in <module>
    tester = TestTF()
  File "/home/andy/python3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 20, in __init__
    self.input, self.model = self.MainModel.KitModel(self.args.w)
  File "/home/andy/DISK-DATA/datasets/insightface/resnet-r50-am-lfw/model-r50-am-lfw/tf_resnet50.py", line 26, in KitModel
    _minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='_minusscalar0_second')
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 220, in constant
    name=name).outputs[0]
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1589, in __init__
    raise ValueError("'%s' is not a valid node name" % node_def.name)
ValueError: '_minusscalar0_second' is not a valid node name

To resolve this error, I modified tf_resnet50.py from

_minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='_minusscalar0_second')
_mulscalar0_second = tf.constant(__weights_dict['_mulscalar0_second']['value'], name='_mulscalar0_second')

to

_minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='const_minusscalar0_second')
_mulscalar0_second = tf.constant(__weights_dict['_mulscalar0_second']['value'], name='const_mulscalar0_second')

That fixed the ValueError, but another error was raised:

(python3) andy@andy:~/DISK-DATA/datasets/insightface/resnet-r50-am-lfw/model-r50-am-lfw$ python -m mmdnn.conversion.examples.tensorflow.imagenet_test -n tf_resnet50.py -w model-descr512.npy --dump tf_resnet152.ckpt
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/andy/python3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 71, in <module>
    tester = TestTF()
  File "/home/andy/python3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 20, in __init__
    self.input, self.model = self.MainModel.KitModel(self.args.w)
  File "/home/andy/DISK-DATA/datasets/insightface/resnet-r50-am-lfw/model-r50-am-lfw/tf_resnet50.py", line 28, in KitModel
    minusscalar0    = id - _minusscalar0_second
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 949, in r_binary_op_wrapper
    x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 946, in convert_to_tensor
    as_ref=False)
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1036, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 235, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 433, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/home/andy/python3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 344, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected float32, got <built-in function id> of type 'builtin_function_or_method' instead.

This shows that tf_resnet50.py has a bug, located at the line:
minusscalar0 = id - _minusscalar0_second

Considering all of the scripts and logs, the error is probably introduced by the step "convert models from IR to Tensorflow code snippet", and the log message "TensorflowEmitter has not supported operator [_copy]." cannot be ignored. So how can we fix this bug? Best wishes.

Update:

If I change the code "minusscalar0 = id - _minusscalar0_second" to "minusscalar0 = data - _minusscalar0_second", where 'data' is the input placeholder, that error goes away, but then I hit the same error as @galli-leo.

@xsr-ai I just removed the whole minus and mul scalar nodes, since they are just preprocessing the images. But yeah, stuck on the error I received then. When I get some more time, I will try and debug it, I am guessing somehow the weights are wrongly saved.

@galli-leo I added "bn1 = tf.reshape(bn1, (1, 25088))" before tf.layers.dense in tf_resnet50.py, which fixed the error "ValueError: Too many elements provided. Needed at most 262144, but received 12845056". But when I tested the converted TF model it performed poorly: LFW accuracy only reaches 0.573+-0.021. I don't know why the test performance is so poor.

@xsr-ai We probably need to fix the weights themselves not the layer. I should get some time next week to look at this.

@galli-leo Looking forward to your great work.

@galli-leo Hi, have you made any progress?

@xsr-ai @galli-leo Could you provide your test code? I guess I know how to handle it: just try transposing the weights.
Thanks.

@kitstar We use the tensorflow model at https://drive.google.com/open?id=1x0-EiYX9jMUKiq-n1Bd9OCK4fVB3a54v or https://pan.baidu.com/s/1mj6X7MK. To reproduce the issue, see my comment from 14 days ago. Finally, thank you for your kind help.

@galli-leo Hi, have you made any progress?

I added "bn1 = tf.reshape(bn1, [-1, 7*7*512])", which fixes the error "ValueError: Too many elements provided. Needed at most 262144, but received 12845056".

@stormand thanks!

@JiahaoYao Could you confirm the fix and apply it in the code? Thanks!

To sum up, I have compared the inference results of the converted TensorFlow model and the original MXNet model. They are exactly the same.
The details are as follows.
First, as @belgraviton said,

python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d model-descr512 --inputShape 3 112 112

This produces the IR structure.

IR network structure is saved as [model-descr512.pb].
IR weights are saved as [model-descr512.npy].

Secondly, just as @xsr-ai mentioned,

python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath model-descr512.pb --IRWeightPath model-descr512.npy --dstModelPath tf_resnet50.py

This produces the TensorFlow snippet.

Target network code snippet is saved as [tf_resnet50.py].

However, two things have to be modified in this snippet.

  • The "'_minusscalar0_second' is not a valid node name" issue
    Lines 25-29 of the snippet are
    _mulscalar0_second = tf.constant(__weights_dict['_mulscalar0_second']['value'], name='_mulscalar0_second')
    _minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='_minusscalar0_second')
    data            = tf.placeholder(tf.float32, shape = (None, 112, 112, 3), name = 'data')
    minusscalar0    = data - _minusscalar0_second
    mulscalar0      = minusscalar0 * _mulscalar0_second

_mulscalar0_second and _minusscalar0_second should be renamed, e.g. to mulscalar0_second and minusscalar0_second respectively.

Then the code should look like

    mulscalar0_second = tf.constant(__weights_dict['_mulscalar0_second']['value'], name='mulscalar0_second')
    minusscalar0_second = tf.constant(__weights_dict['_minusscalar0_second']['value'], name='minusscalar0_second')
    data            = tf.placeholder(tf.float32, shape = (None, 112, 112, 3), name = 'data')
    minusscalar0    = data - minusscalar0_second
    mulscalar0      = minusscalar0 * mulscalar0_second

In my view, this may be because a node name cannot begin with '_' in TensorFlow.
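
The renaming can also be automated. The sketch below is illustrative only: `sanitize_node_name` is a hypothetical helper, and the regular expression is an assumed approximation of the TensorFlow 1.x rule for op names (first character alphanumeric or ".", later characters may also be "_", "-" or "/"); check it against the TensorFlow version in use.

```python
import re

# Assumed approximation of the TensorFlow 1.x op-name rule (see lead-in).
VALID_NAME = re.compile(r"^[A-Za-z0-9.][A-Za-z0-9_.\-/]*$")


def sanitize_node_name(name):
    """Strip leading characters that are not allowed at the start of a name."""
    cleaned = re.sub(r"^[^A-Za-z0-9.]+", "", name)
    if not VALID_NAME.match(cleaned):
        raise ValueError("cannot sanitize %r" % name)
    return cleaned


for original in ("_minusscalar0_second", "_mulscalar0_second", "data"):
    print(original, "->", sanitize_node_name(original))
```

Only the `name=` argument needs to change; the keys used to look up `__weights_dict` must stay as they were saved in the .npy file.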

  • The "bn1 = tf.reshape(bn1, [-1, 7*7*512])" issue.

The code (lines 255-260) should be changed from

    bn1             = batch_normalization(plus23, variance_epsilon=1.99999994948e-05, name='bn1')
    pre_fc1         =... 

to

    bn1             = batch_normalization(plus23, variance_epsilon=1.99999994948e-05, name='bn1')
    bn1             = tf.reshape(bn1, [-1, 7*7*512])
    pre_fc1         = ...

In my opinion, this is because the MXNet JSON has no flatten operator before the fc layer, so no flatten layer is emitted when converting to TensorFlow. Here, we add this flatten step to the TensorFlow code.
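
The numbers in the earlier "Too many elements provided" error follow from this missing flatten. The NumPy sketch below only redoes the arithmetic; the 7x7x512 feature-map shape is taken from the reshape above and the 512 units from the tf.layers.dense call, while the variable names are illustrative.

```python
import numpy as np

batch, height, width, channels = 1, 7, 7, 512   # bn1 output in NHWC layout
units = 512                                      # output size of pre_fc1

bn1 = np.zeros((batch, height, width, channels), dtype=np.float32)

# Without a flatten, a dense layer sizes its kernel from the last axis only.
kernel_without_flatten = bn1.shape[-1] * units
# The saved weights were produced for the fully flattened feature map.
kernel_saved = height * width * channels * units

# The added reshape turns (1, 7, 7, 512) into (1, 25088).
flat = bn1.reshape(-1, height * width * channels)

print(kernel_without_flatten)  # 262144, the "needed at most" value
print(kernel_saved)            # 12845056, the "received" value
print(flat.shape)              # (1, 25088)
```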

As a sanity check, I compared the inference results of the two models.
You may take it as a reference.
For MXNet,

# inference with mxnet

import mxnet as mx
from tensorflow.contrib.keras.api.keras.preprocessing import image
import numpy as np

from collections import namedtuple
Batch = namedtuple('Batch', ['data'])

ctx = mx.cpu()

# load model
sym, arg_params, aux_params = mx.model.load_checkpoint('/Users/kit/Downloads/model-r50-am-lfw-n/model', 0)
mod = mx.mod.Module(symbol = sym, context= ctx, label_names= None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 112, 112))], label_shapes= mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing= True)


path = '/Users/kit/github/MMdnn/mmdnn/conversion/examples/data/seagull.jpg'

# load image with BGRTranspose=True
img = image.load_img(path, target_size = (112, 112))
img = image.img_to_array(img)

img = img[..., ::-1]


# channel first in mxnet
img = np.expand_dims(img, 0).transpose((0,3,1,2))

# compute the predict probabilities
mod.forward(Batch([mx.nd.array(img)]))
prob = mod.get_outputs()[0].asnumpy()
prob = np.squeeze(prob)
print(prob)
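
The two layout steps in the script above (channel reversal and moving channels first) can be checked on a tiny array. This is a sketch with a made-up 2x2 "image", not part of the original script.

```python
import numpy as np

# Tiny stand-in for a decoded image: height 2, width 2, 3 channels (RGB order).
img = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

bgr = img[..., ::-1]                   # reverse the channel axis: RGB -> BGR
nhwc = np.expand_dims(bgr, 0)          # add a batch axis: (1, H, W, C)
nchw = nhwc.transpose((0, 3, 1, 2))    # channels first for MXNet: (1, C, H, W)

print(nhwc.shape, nchw.shape)
# The first channel of the result is the last (blue) channel of the input.
print(nchw[0, 0, 0, 0] == img[0, 0, 2])
```

The TensorFlow script keeps the NHWC layout and therefore skips the transpose.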

For tensorflow,

Use the script tensorflow_inference.py

# inference with tensorflow

from __future__ import absolute_import
import argparse
import numpy as np
from six import text_type as _text_type
from tensorflow.contrib.keras.api.keras.preprocessing import image
import tensorflow as tf


parser = argparse.ArgumentParser()

parser.add_argument('-n', type=_text_type, default='kit_imagenet',
                    help='Network structure file name.')


parser.add_argument('-w', type=_text_type, required=True,
                    help='Network weights file name')

parser.add_argument('--image', '-i',
                    type=_text_type, help='Test image path.',
                    default="mmdnn/conversion/examples/data/seagull.jpg"
)


args = parser.parse_args()
if args.n.endswith('.py'):
    args.n = args.n[:-3]

# import converted model
model_converted = __import__(args.n).KitModel(args.w)
input_tf, model_tf = model_converted

# load img with BGRTranspose=True
img = image.load_img(args.image, target_size = (112, 112))
img = image.img_to_array(img)
img = img[..., ::-1]

input_data = np.expand_dims(img, 0)

# inference with tensorflow
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    predict = sess.run(model_tf, feed_dict = {input_tf : input_data})
print(predict)

One can run the inference with

python tensorflow_inference.py -n tf_resnet50.py -w model-descr512.npy -i /Users/kit/github/MMdnn/mmdnn/conversion/examples/data/seagull.jpg

The two results are exactly the same:

-3.08436990e-01  5.81104577e-01  2.24989086e-01  4.96589839e-01
 -8.28284204e-01 -3.38171571e-01  5.44343829e-01  2.44809270e-01
 -1.69032320e-01 -2.68041462e-01  6.62914887e-02 -5.56380264e-02
 -2.54793257e-01 -2.25498274e-01  6.71060801e-01  4.22945946e-01
  5.25136948e-01  9.15651679e-01  4.44332063e-01  8.00531283e-02
 -1.12293661e+00 -3.05917352e-01 -8.28881800e-01 -5.28346121e-01
 -9.00689662e-01 -5.64389050e-01  8.62524271e-01  2.87447959e-01
  3.65373880e-01  2.18979344e-01 -6.14733994e-01 -4.31148186e-02
 -7.58342385e-01  9.34286341e-02 -3.50230098e-01 -9.10354555e-02
  3.37059349e-01 -3.66803199e-01  3.67519438e-01  4.60750610e-01
  1.56841412e-01 -4.93858419e-02  1.90557286e-01 -4.91944730e-01
 -1.05496459e-01  3.38450491e-01  8.76139641e-01  9.44598436e-01
  1.84234425e-01 -5.40344477e-01  2.27486148e-01  2.77883224e-02
  1.01942503e+00  3.23045045e-01  5.31853318e-01  5.14196396e-01
 -6.12801909e-01 -5.63944519e-01  9.60868120e-01  9.24057513e-02
  5.20703971e-01 -1.06362402e-01 -8.39652717e-01  4.28752959e-01
  3.16981040e-02  6.78501308e-01 -1.45745173e-01  5.32108843e-02
  9.31305140e-02 -7.69111693e-01  4.49718803e-01  7.04128623e-01
 -4.75108653e-01  1.21819067e+00 -8.08692694e-01 -2.30065078e-01
 -4.11778957e-01 -3.44921462e-02  1.72196701e-01  5.71296930e-01
  7.40399897e-01  1.93351358e-01  3.81458938e-01 -5.99029183e-01
  6.43376529e-01 -8.78583312e-01  4.53670561e-01  1.48441091e-01
 -1.01884592e+00  1.91614315e-01 -5.96112907e-01 -2.08623409e-01
  5.42311668e-01 -1.16038311e+00 -2.95781158e-02 -3.10838282e-01
  2.50876069e-01 -4.39370900e-01  7.24291056e-02  6.35423720e-01
 -4.76962119e-01 -1.59666681e+00 -2.57584065e-01  1.15012601e-02
 -5.61534464e-01  3.09056640e-01  3.51772815e-01  3.36598426e-01
 -9.82229233e-01  3.19448918e-01  3.97928298e-01 -1.96443468e-01
  3.44431281e-01 -8.81833017e-01  1.09189129e+00  5.07110655e-01
 -3.27133760e-03 -5.46444058e-01  2.45448351e-01  1.17322409e+00
  5.74339688e-01  3.71622354e-01 -6.03881657e-01 -9.92266953e-01
  4.81626421e-01  3.30736607e-01 -1.09214559e-01  5.91557436e-02
 -1.44589961e-01 -3.59522134e-01 -1.81073412e-01 -4.33940858e-01
 -1.36560336e-01  2.20874250e-01 -6.04387701e-01  4.16712105e-01
  1.41887212e+00  5.31345427e-01  3.34208578e-01  9.81603742e-01
  1.44290715e-01 -5.02448618e-01  2.54356489e-03  4.93989766e-01
  5.95363081e-01 -2.05855429e-01  5.51334262e-01 -1.76252216e-01
  1.17584452e-01 -6.89881325e-01 -2.03898609e-01  3.42189759e-01
 -5.28536886e-02 -8.34298611e-01  8.36958230e-01  7.35776961e-01
  8.24692369e-01  5.61921159e-03  2.12964252e-01 -3.55549812e-01
  3.17866206e-01  6.50376856e-01  8.65678668e-01 -2.11118603e+00
  2.42719352e-01 -3.30658764e-01 -1.09676349e+00 -6.26673341e-01
 -3.64849299e-01 -1.63232118e-01 -4.59915340e-01  1.00147128e+00
 -6.99459314e-02  8.92656922e-01 -5.75572670e-01 -7.12908283e-02
  3.39384466e-01  3.07012163e-02 -4.49067861e-01 -7.29369104e-01
  1.43405378e+00 -1.95949122e-01  1.30293414e-01  6.82015479e-01
  4.56094921e-01 -1.23712547e-01 -2.45128587e-01  2.22329989e-01
 -1.12109733e+00  1.99771658e-01 -2.67177913e-03  8.00425038e-02
  2.35564798e-01 -5.01654983e-01  3.71638000e-01 -2.35533789e-01
 -2.15010375e-01  3.69144708e-01  2.44752333e-01  1.02857959e+00
 -6.72526807e-02 -6.38509214e-01 -1.86292812e-01 -1.58930585e-01
  3.57254833e-01  1.58534482e-01  2.86537468e-01 -3.24778169e-01
 -1.45921671e+00 -1.35326970e+00  6.97188616e-01  7.20678210e-01
  1.36481166e+00  1.03954363e+00 -8.87658000e-01 -3.58765930e-01
  2.25492567e-01 -1.40121564e-01  3.74763697e-01  3.72640163e-01
 -3.70886594e-01  1.43452466e-01  1.87063619e-01  3.73812020e-01
  2.45278105e-02  7.15804458e-01  2.08226264e-01 -7.88335055e-02
  1.09342761e-01  4.04504910e-02 -1.51908442e-01  1.69594169e-01
 -2.14366004e-01  2.54302323e-01  5.72017014e-01 -1.70713454e-01
  9.83217478e-01  1.82541028e-01  4.93405871e-02  2.90113151e-01
  4.27542806e-01  2.49940112e-01  5.75569093e-01 -1.00440013e+00
 -6.55741477e-03  3.83511484e-01 -3.93022954e-01 -8.60261381e-01
  8.86497498e-02  8.57147872e-01  3.54601383e-01  1.39468402e-01
  4.40859571e-02 -2.38936488e-02 -1.64304823e-01 -1.13334646e-02
  6.69177294e-01  2.91543335e-01 -5.94084382e-01  3.86198699e-01
  3.10887396e-01 -2.42558107e-01 -5.38034379e-01  2.91182846e-01
  2.63026536e-01 -3.26587290e-01  1.04235142e-01 -8.63921046e-01
 -5.34380198e-01 -1.14627354e-01 -2.37659335e-01  1.97141677e-01
  2.90334016e-01 -7.67149627e-01  1.02382505e+00 -2.90567782e-02
 -1.13251843e-01 -8.53023410e-01  5.31504750e-01 -3.06502521e-01
  1.09228921e+00  3.68808992e-02  3.64569783e-01  4.93499607e-01
  1.74295652e+00 -1.58409178e-01 -6.99262083e-01  8.95977855e-01
 -9.54344451e-01  3.56937408e-01  2.94272691e-01  6.46877065e-02
  3.08370255e-02  2.90377855e-01 -1.36488214e-01 -4.44722474e-01
 -9.70212936e-01 -1.46897674e-01 -4.22361009e-02  1.86234653e-01
  1.89715832e-01 -2.25421071e-01  1.34442806e-01 -7.00153053e-01
 -1.10767150e+00  5.11312902e-01 -2.79642493e-01  3.73294502e-01
 -7.98319101e-01  6.97102308e-01  3.31392400e-02  3.93970519e-01
  3.23852785e-02 -4.09651041e-01  2.83638477e-01 -6.12399936e-01
 -8.87643099e-01 -1.31258905e-01 -7.33293518e-02  1.47918835e-01
 -5.56060851e-01  6.27143800e-01  1.08713126e+00  6.90663397e-01
 -9.57368553e-01  2.91721642e-01 -1.20110698e-01 -4.44054663e-01
  1.67493188e+00  3.85205328e-01  9.96679962e-01  4.28501189e-01
  2.78735846e-01  4.85258132e-01 -7.61325121e-01  3.37779760e-01
  7.46076941e-01  3.90042692e-01 -1.00670648e+00  7.01900423e-02
 -4.23207551e-01 -1.05447292e+00 -3.95866424e-01  3.47580582e-01
 -6.50140494e-02 -2.42056027e-02 -1.99006438e-01 -1.13033664e+00
  3.21459502e-01 -2.82380968e-01  5.68697117e-02 -8.81359041e-01
 -3.40585500e-01 -8.04002047e-01  6.06884420e-01  2.53899306e-01
  4.49809134e-02 -3.28419030e-01  3.32311392e-01  8.51836920e-01
 -3.92299667e-02 -1.13012552e+00  1.01311699e-01 -1.08288920e+00
 -4.87735271e-01  5.03100634e-01 -1.90314688e-02 -6.70698941e-01
  5.48938327e-02  5.86837888e-01 -3.37535530e-01  1.29277265e+00
  3.35333258e-01 -6.56011283e-01 -1.05446219e-01 -3.08665752e-01
  1.83946267e-02  2.50504524e-01 -7.76413381e-01 -1.14410710e+00
  1.10602409e-01 -1.82166576e-01 -5.20292044e-01  7.70471573e-01
  8.34978938e-01  3.13121140e-01  7.41455019e-01 -7.61738658e-01
  3.71798933e-01  8.69558990e-01 -6.91504717e-01 -2.50484526e-01
  1.17222778e-01  2.46799007e-01 -1.33719552e+00 -3.40068489e-01
  5.53660467e-02  4.74537909e-01  2.39139229e-01 -3.05983245e-01
  6.49068877e-02  6.16565168e-01 -1.77806169e-01 -4.70677853e-01
  8.06589305e-01 -7.21750140e-01 -1.28533199e-01  5.65737952e-03
 -3.27389628e-01 -5.58903217e-01  1.31876823e-02 -5.29128432e-01
 -7.53619552e-01 -8.94942600e-03  2.48634845e-01  6.67092443e-01
 -1.54286057e-01 -3.69620413e-01 -1.40491113e-01  2.77138948e-01
  1.46667495e-01  8.07485282e-01  3.20771009e-01  6.24951243e-01
  7.96172500e-01 -4.22803313e-01  6.73624456e-01  6.40589893e-01
  1.03424877e-01 -9.23184827e-02  2.96228021e-01 -6.78525746e-01
  1.41378179e-01  2.28882387e-01  4.72979695e-01  5.87298274e-01
 -3.29859972e-01  5.77787280e-01 -1.34650886e+00 -6.35464728e-01
  5.43554127e-02 -2.39345744e-01  9.14134458e-02  6.41911566e-01
 -2.62726754e-01 -1.22956280e-03  1.62090802e+00 -9.58541855e-02
  4.18504506e-01  3.93678457e-01 -6.62751853e-01  7.85538375e-01
  2.00214416e-01 -5.68046309e-02  1.21355439e-02  1.21692014e+00
  8.41036379e-01 -6.98036134e-01 -4.99443501e-01  4.58823927e-02
 -6.98619306e-01 -1.10945679e-01 -1.67328194e-01 -2.53589272e-01
 -7.95583785e-01  9.35731649e-01  1.06272686e+00 -5.37942469e-01
  4.47880238e-01  2.06888750e-01 -5.27596213e-02 -3.49151760e-01
 -3.43014210e-01 -6.26283824e-01  9.17382300e-01 -5.70841193e-01
  1.03133425e-01 -3.45254660e-01 -7.48005509e-02 -3.04669231e-01
 -8.65560591e-01  5.30050695e-01 -5.14221668e-01  3.48908275e-01
  1.19767272e+00 -4.88048315e-01 -4.00200486e-01  1.69440806e-01
 -1.02853417e+00 -4.96662766e-01  2.15767696e-02  5.26568830e-01
 -9.23858657e-02  6.25036597e-01 -6.62158787e-01  4.83825237e-01
  2.75194556e-01  1.54496089e-01 -2.15645373e-01  4.42594327e-02
 -3.59546512e-01 -1.90760121e-02 -8.50876570e-01  1.06945384e+00
  3.34871829e-01  2.68247008e-01  9.69766915e-01  2.73773223e-01
 -3.43078487e-02 -1.47778699e-02 -3.66333365e-01  4.11476165e-01
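
Rather than comparing the printed vectors by eye, the two outputs can be compared numerically. A minimal sketch, assuming both results are available as arrays; `outputs_match` and its tolerances are illustrative choices, and the demo reuses the first four values printed above for both sides.

```python
import numpy as np


def outputs_match(reference, converted, rtol=1e-5, atol=1e-6):
    """Return (match, max_abs_diff) for two output vectors."""
    reference = np.asarray(reference, dtype=np.float64).ravel()
    converted = np.asarray(converted, dtype=np.float64).ravel()
    if reference.shape != converted.shape:
        return False, float("inf")
    max_abs_diff = float(np.max(np.abs(reference - converted)))
    match = bool(np.allclose(reference, converted, rtol=rtol, atol=atol))
    return match, max_abs_diff


# First four values of the output above, standing in for both frameworks.
mxnet_out = [-3.08436990e-01, 5.81104577e-01, 2.24989086e-01, 4.96589839e-01]
tf_out = list(mxnet_out)
print(outputs_match(mxnet_out, tf_out))
```

In practice `mxnet_out` would be `prob` from the MXNet script and `tf_out` would be `predict` from the TensorFlow script.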

We have already got a pull request to solve the above two problems.
Thank you all for enhancing MMdnn!

Hi, thank you for your great effort. But I still encounter some errors.

First step:
python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d resnet50 --inputShape 3 112 112
I got a warning, but I think it is fine.
module/base_module.py:53: UserWarning: You created Module with Module(..., label_names=['softmax_label']) but input with name 'softmax_label' is not found in symbol.list_arguments(). Did you mean one of:
    data
  warnings.warn(msg)

The second step is fine, too, with the script:
python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath resnet50.pb --IRWeightPath resnet50.npy --dstModelPath tf_resnet50.py

However, in the third step, with the script
python -m mmdnn.conversion.examples.tensorflow.imagenet_test -s tensorflow -p resnet -n tf_resnet50 -w resnet50.npy
I got the following error:

Traceback (most recent call last):
  File "anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 75, in <module>
    tester.inference(tester.args.image)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 50, in inference
    self.preprocess(image_path)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 25, in preprocess
    x = super(TestTF, self).preprocess(image_path)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/imagenet_test.py", line 265, in preprocess
    return func(image_path)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/imagenet_test.py", line 93, in <lambda>
    'resnet' : lambda path : TestKit.Standard(path, 299),
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/imagenet_test.py", line 244, in Standard
    img = image.load_img(path, target_size = (size, size))
  File "anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/preprocessing/image.py", line 386, in load_img
    img = pil_image.open(path)
  File "anaconda3/lib/python3.6/site-packages/PIL/Image.py", line 2548, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'mmdnn/conversion/examples/data/seagull.jpg'

I changed the third command from
python -m mmdnn.conversion.examples.tensorflow.imagenet_test -s tensorflow -p resnet -n tf_resnet50 -w resnet50.npy
to

python -m mmdnn.conversion.examples.tensorflow.imagenet_test -s tensorflow -p resnet -n tf_resnet50 -w resnet50.npy -i ~/Downloads/download.jpg

But got a new error:

Traceback (most recent call last):
  File "anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 75, in <module>
    tester.inference(tester.args.image)
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 54, in inference
    self.print_result()
  File "anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/tensorflow/imagenet_test.py", line 33, in print_result
    predict = sess.run(self.model, feed_dict = {self.input : self.data})
  File "anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1113, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 299, 299, 3) for Tensor 'data:0', which has shape '(?, 112, 112, 3)'

Besides, my generated TensorFlow file is different from @JiahaoYao's.

@tengerye You are welcome, and I wonder if you can show me your generated tensorflow file.

Thank you for your kind reply @JiahaoYao .

My tensorflow file starts with

import tensorflow as tf

__weights_dict = dict()

is_train = False

def load_weights(weight_file):
    import numpy as np

    if weight_file == None:
        return

    try:
        weights_dict = np.load(weight_file).item()
    except:
        weights_dict = np.load(weight_file, encoding='bytes').item()

    return weights_dict


def KitModel(weight_file = None):
    global __weights_dict
    __weights_dict = load_weights(weight_file)

    data            = tf.placeholder(tf.float32, shape = (None, 112, 112, 3), name = 'data')

Do you need to read all the content?

@tengerye
Your first step converts the model from MXNet to IR, and the second step converts the IR structure to TensorFlow. If you open your TensorFlow snippet, it is already your model converted into TensorFlow. The third step, however, tests the converted model under ImageNet settings, as the script's name mmdnn.conversion.examples.tensorflow.imagenet_test suggests. With -p resnet, the input should therefore be 299x299x3 and the output should have 1000 or 1001 classes. LResNet, as shown in the repo and the arXiv paper, takes an image of size 112x112x3 and outputs a 512-dim vector. I think that is why your third step did not work.

MMdnn aims at converting one framework to another framework. After your second step, you get the tensorflow snippet and parse file. Thus, the model is already converted.
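
The shape mismatch behind that ValueError can be illustrated without TensorFlow. This is a minimal sketch that assumes only the two shapes reported in the error message; the helper name is_feed_compatible is hypothetical and merely mimics the rule that a None (printed as ?) placeholder dimension accepts any size while every other dimension must match exactly.

```python
def is_feed_compatible(value_shape, placeholder_shape):
    """Return True if an array of value_shape can be fed to placeholder_shape.

    Hypothetical helper, not part of MMdnn or TensorFlow: None in
    placeholder_shape acts as a wildcard (the '?' in the error message),
    and all other dimensions must be equal.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(
        expected is None or actual == expected
        for actual, expected in zip(value_shape, placeholder_shape)
    )


# Shape of the 'data' placeholder in the generated tf_resnet50.py.
placeholder = (None, 112, 112, 3)

# imagenet_test with -p resnet resizes the input image to 299x299.
print(is_feed_compatible((1, 299, 299, 3), placeholder))

# An image resized to 112x112 matches the converted LResNet input.
print(is_feed_compatible((1, 112, 112, 3), placeholder))
```

In other words, any test image has to be resized to 112x112 before it is fed to the converted model.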


First, try the newest version by

pip install -U git+https://github.com/Microsoft/MMdnn.git@master

Second, convert the model to intermediate representation format.

python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d resnet50 --inputShape 3 112 112

Third, convert to tensorflow code

python -m mmdnn.conversion._script.IRToCode -f tensorflow --IRModelPath resnet50.pb --IRWeightPath resnet50.npy --dstModelPath tf_resnet50.py

You then already have the TensorFlow code tf_resnet50.py, along with the IR structure file resnet50.pb.
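
For anyone who wants to script the two conversion steps, here is a rough sketch that only assembles the same command lines shown above. The function names build_conversion_commands and run_all are hypothetical, and the file names are the ones used in this thread; adjust them to your own paths.

```python
import shlex
import subprocess
import sys


def build_conversion_commands(symbol_json, params, ir_prefix, dst_py,
                              input_shape=(3, 112, 112)):
    """Build the MXNet -> IR and IR -> TensorFlow commands as argv lists."""
    to_ir = [
        sys.executable, "-m", "mmdnn.conversion._script.convertToIR",
        "-f", "mxnet",
        "-n", symbol_json,
        "-w", params,
        "-d", ir_prefix,
        "--inputShape", *[str(dim) for dim in input_shape],
    ]
    to_code = [
        sys.executable, "-m", "mmdnn.conversion._script.IRToCode",
        "-f", "tensorflow",
        "--IRModelPath", ir_prefix + ".pb",
        "--IRWeightPath", ir_prefix + ".npy",
        "--dstModelPath", dst_py,
    ]
    return [to_ir, to_code]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_conversion_commands(
    "model-symbol.json", "model-0000.params", "resnet50", "tf_resnet50.py")
for command in commands:
    # Only print the commands here; call run_all(commands) to execute them.
    print(" ".join(shlex.quote(part) for part in command))
```

Printing first and running separately makes it easy to check the paths before anything is executed.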

If you would like to check whether the conversion is right, you can follow Sanity checks in my last post. Always remember to change the path of the model and image.

Hope this addresses your problem~

It works. Thank you so much @JiahaoYao .

Great! It is my pleasure~

@JiahaoYao
Hi, I successfully followed the steps to convert the model to TensorFlow, and I went a step further and converted it to a frozen .pb model file.
Then I compared the inference results of the original LResNet50E-IR model and the converted model; they are different, not even close.

I think the comparison you did in your previous post cannot prove that the whole conversion process is correct.
It can ONLY prove that the conversions from the IR model to TF and to MXNet agree with each other.

@JiahaoYao Sorry, I made a mistake during the conversion.
The output embeddings of the original LResNet50E-IR model and the converted TF model are close,
though there is still some precision loss from the conversion.
Here is a comparison on one image (only the first 10 floats of each output embedding):

LResNet50E-IR:
[-0.00493836 0.04874415 -0.04708482 0.05242502 0.0405311 0.09130251
-0.0123271 -0.01178447 -0.0353515 0.10181216]

Converted Tensorflow Model:
[-0.00632004 0.04413364 -0.04881488 0.0524389 0.04239314 0.09475758
-0.01080625 -0.00725455 -0.03120478 0.10712451]
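
To put a number on "close", the two printed prefixes can be compared directly. This is a small sketch, not the poster's script: it uses only the 10 values quoted above (out of a 512-dim embedding), and the choice of cosine similarity and maximum absolute difference as metrics is an assumption.

```python
import math

# First 10 values of each embedding, copied from the post above.
mxnet_embedding = [-0.00493836, 0.04874415, -0.04708482, 0.05242502,
                   0.0405311, 0.09130251, -0.0123271, -0.01178447,
                   -0.0353515, 0.10181216]
tf_embedding = [-0.00632004, 0.04413364, -0.04881488, 0.0524389,
                0.04239314, 0.09475758, -0.01080625, -0.00725455,
                -0.03120478, 0.10712451]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


print("cosine similarity:", cosine_similarity(mxnet_embedding, tf_embedding))
print("max abs difference:", max_abs_diff(mxnet_embedding, tf_embedding))
```

Since this covers only a 10-value prefix, the result is indicative at best; a real check would compare the full 512-dim vectors over many images.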

@sczhengyabin
How did you convert them into a frozen .pb model in TensorFlow?

@xmuszq Just use the tensorflow freeze_graph tool.

I ran a test on LFW; the accuracy is 55%.

@xmuszq Could you share the scripts that convert the tensorflow code snippet and .npy file into .ckpt or .pb files with me ? Thank u very much!

Hi,

I'm still getting an error when running the first command. Any help would be greatly appreciated !

Platform: Ubuntu 16.04

Python version: 3.6

Source framework with version: MXNet 1.0.0 with GPU

Pre-trained model: LResNet50E-IR, same as everyone else

Running scripts:

python -m mmdnn.conversion._script.convertToIR -f mxnet -n model-symbol.json -w model-0000.params -d model-descr512 --inputShape 3 112 112

Error message :

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 197, in <module>
    _main()
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 192, in _main
    ret = _convert(args)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 115, in _convert
    parser.run(args.dstPath)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/common/DataStructure/parser.py", line 22, in run
    self.gen_IR()
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 262, in gen_IR
    func(current_node)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 424, in rename_Convolution
    self.set_output_shape(source_node, IR_node)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mmdnn/conversion/mxnet/mxnet_parser.py", line 281, in set_output_shape
    arg_shape, output_shape, aux_shape = sym.infer_shape(data = self.data_shape)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 989, in infer_shape
    res = self._infer_shape_impl(False, *args, **kwargs)
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1119, in _infer_shape_impl
    ctypes.byref(complete)))
  File "/home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/base.py", line 146, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator conv0: [12:09:01] src/operator/nn/./convolution-inl.h:492: Check failed: dshp.ndim() == 4U (2 vs. 4) Input data should be 4D in batch-num_filter-y-x

Stack trace returned 10 entries:
[bt] (0) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x192112) [0x7f238c341112]
[bt] (1) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x192738) [0x7f238c341738]
[bt] (2) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x346f40) [0x7f238c4f5f40]
[bt] (3) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x24e2b37) [0x7f238e691b37]
[bt] (4) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2358e7f) [0x7f238e507e7f]
[bt] (5) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x235b96f) [0x7f238e50a96f]
[bt] (6) /home/stanislas_bertrand/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(MXSymbolInferShape+0x1539) [0x7f238e490099]
[bt] (7) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f23a62c1e40]
[bt] (8) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7f23a62c18ab]
[bt] (9) /usr/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2cf) [0x7f23a64d595f]

Hi @StanislasBertrand , Please refer to this issue

@JiahaoYao your conversion and inference code works great in Python. However, do you have any sample code for performing inference with the converted model in C++? I've tried the session API but haven't been able to make it work with the model.

Figured out how to do it. For those wondering, when loading the model with the python code above, you can make the following modification to save the model:

# inference with tensorflow
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    predict = sess.run(model_tf, feed_dict = {input_tf : input_data})

    # Add the following line to save the model
    tf.train.Saver(tf.trainable_variables()).save(sess, 'tensorflow_models/my-model')

You can then use the C++ session API