microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[Group convolution in Keras] ResNeXt mxnet -> IR -> keras

kamikawa opened this issue · comments

Hi
Thank you for a great covert tool.

I am trying to convert from mxnet resnext to keras.
symbol file: http://data.mxnet.io/models/imagenet/resnext/101-layers/resnext-101-64x4d-symbol.json
param file: http://data.mxnet.io/models/imagenet/resnext/101-layers/resnext-101-64x4d-0000.params

I could convert from mxnet to IR with no error,

python -m mmdnn.conversion._script.convertToIR -f mxnet -n resnext-101-64x4d-symbol.json -w resnext-101-64x4d-0000.params -d resnext-101-64x4d --inputShape 3 224 224

but failed to convert from IR to keras with an error below.
Would you support this model?

Regards,


python -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath resnext-101-64x4d.pb --dstModelPath keras_resnext-101-64x4d.py

Parse file [resnext-101-64x4d.pb] with binary format successfully.
Traceback (most recent call last):
File "C:\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "C:\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 120, in
_main()
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 115, in _main
ret = _convert(args)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion_script\IRToCode.py", line 56, in _convert
emitter.run(args.dstModelPath, args.dstWeightPath, args.phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\DataStructure\emitter.py", line 21, in run
self.save_code(dstNetworkPath, phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\DataStructure\emitter.py", line 53, in save_code
code = self.gen_code(phase)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 95, in gen_code
func(current_node)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 194, in emit_Conv
return self._emit_convolution(IR_node, 'layers.Conv{}D'.format(dim))
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 179, in _emit_convolution
input_node, padding = self._defuse_padding(IR_node)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 160, in _defuse_padding
padding = self._convert_padding(padding)
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\keras\keras2_emitter.py", line 139, in _convert_padding
padding = convert_onnx_pad_to_tf(padding)[1:-1]
File "C:\Anaconda3\lib\site-packages\mmdnn\conversion\common\utils.py", line 62, in convert_onnx_pad_to_tf
return np.transpose(np.array(pads).reshape([2, -1])).reshape(-1, 2).tolist()

ValueError: cannot reshape array of size 1 into shape (2,newaxis)

Hi @kamikawa, fixed the issue you met. But there is no offical "group conv" support in keras. Any help to fill this group_conv part like this pull request are welcome.

Thank you for an answer, I understand the situation.
For official group conv support in keras, backend support may also be needed...

We can split the input into groups, apply the kernel for each one, and concat them after all, to work around it.

Hi @kamikawa . The keras group convolution is implemented by @skybigzhou and tested mxnet resnext->keras conversion. Please try the newest code. Thanks both.

Hi @kitstar , thank you for handling this issue, @skybigzhou , thank you for the fixed code.
I tried the fixed version.
resnext IR architecture file could be converted fine, but I failed to convert IR weights file (hang ups) .
It might be my environment specific problem, or weights file convert also needs work around...
But thanks, anyway.


$ python -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath resnext-101-64x4d.pb --dstModelPath keras_resnext-101-64x4d.py
Parse file [resnext-101-64x4d.pb] with binary format successfully.
Target network code snippet is saved as [keras_resnext-101-64x4d.py].

$ python -m mmdnn.conversion.examples.keras.imagenet_test -n keras_resnext-101-64x4d.py -w resnext-101-64x4d.npy --dump keras_resnext-101-64x4d.h5
/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/init.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-01-26 23:16:15.297039: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-01-26 23:16:15.297483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
2018-01-26 23:16:15.667115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Device peer to peer matrix
2018-01-26 23:16:15.667190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] DMA: 0
2018-01-26 23:16:15.667211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 0: Y
2018-01-26 23:16:15.667226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)

It is not hanged up.... It took 30 minutes in my machine to finish the conversion (keras model.save() function). Maybe it needs to copy weights for each conv group since it is a workaround solution. You can put it background and wait for ... 1hour?

Oh, sorry. I think I waited about 10 minutes. I will try again next week, and let you know.

Hi @kitstar , I tried the resnext-101-64x64d IR weight conversion (took about 70 min.) , but failed to save to hdf5 by the limitation of hdf5 object header size. I could convert resnext-50 cases.
I am wondering why you succeeded, but hdf5 64KB object header limit may be the reason for the failure.
If so, the solution is discussed below and we might want to wait for it.
keras-team/keras#7508


Traceback (most recent call last):
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/mmdnn-0.1.2-py3.6.egg/mmdnn/conversion/examples/keras/imagenet_test.py", line 58, in
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/mmdnn-0.1.2-py3.6.egg/mmdnn/conversion/examples/keras/imagenet_test.py", line 50, in dump
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/engine/topology.py", line 2573, in save
save_model(self, filepath, overwrite, include_optimizer)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/models.py", line 119, in save_model
topology.save_weights_to_hdf5_group(model_weights_group, model_layers)
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/keras/engine/topology.py", line 2869, in save_weights_to_hdf5_group
f.attrs['layer_names'] = [layer.name.encode('utf8') for layer in layers]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 95, in setitem
self.create(name, data=value, dtype=base.guess_dtype(value))
File "/home/kamikawa/anaconda3/envs/mmdnn_test/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 188, in create
attr = h5a.create(self._id, self._e(tempname), htype, space)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5a.pyx", line 47, in h5py.h5a.create
RuntimeError: Unable to create attribute (object header message is too large)


Another problem.
KitModel(weight_file=None) function of keras resnext-50 model I converted does not work with an error below.
Regards.


Traceback (most recent call last):
File "resnext_test.py", line 169, in
main()
File "resnext_test.py", line 93, in main
model = KitModel(weight_file=None)
File "/home/kamikawa/test/keras_resnext50.py", line 59, in KitModel
stage1_unit1_conv2 = convolution(weights_dict, name='stage1_unit1_conv2', input=stage1_unit1_conv2_input, group=32, conv_type='layers.Conv2D', filters=128, kernel_size=(3, 3), strides=(1, 1), dilation_rate=(1, 1), padding='valid', use_bias=False)
File "/home/kamikawa/test/keras_resnext50.py", line 266, in convolution
weights_dict[name + "_" + str(c)] = dict()
TypeError: 'NoneType' object does not support item assignment

Hi @kamikawa

For resnext-50, we tested with below steps:

  1. Download MXNet resnext-50 model
$ python3 -m mmdnn.conversion.examples.mxnet.extract_model -n imagenet1k-resnext-50 -i mmdnn/conversion/examples/data/seagull.jpg

Downloading file [./resnext-50-symbol.json] from [http://data.mxnet.io/models/imagenet/resnext/50-layers/resnext-50-symbol.json]
100% [..............................................................................] 79195 / 79195Downloading file [./resnext-50-0000.params] from [http://data.mxnet.io/models/imagenet/resnext/50-layers/resnext-50-0000.params]
100% [......................................................................] 100404458 / 100404458Model imagenet1k-resnext-50 saved.
[11:13:01] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[11:13:01] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
[(396, 0.7104751), (398, 0.122665755), (438, 0.06391319), (440, 0.029796895), (417, 0.019492012)]
  1. MXNet -> IR
$ python3 -m mmdnn.conversion._script.convertToIR -f mxnet -n examples/mxnet/models/resnext-50-symbol.json -w examples/mxnet/models/resnext-50-0000.params -d mxnet_resnext50 --inputShape 3 224 224

IR network structure is saved as [mxnet_resnext50.json].
IR network structure is saved as [mxnet_resnext50.pb].
IR weights are saved as [mxnet_resnext50.npy].
  1. IR -> Keras
$ python3 -m mmdnn.conversion._script.IRToCode -f keras --IRModelPath mxnet_resnext50.pb --dstModelPath converted_resnext50.py --IRWeightPath mxnet_resnext50.npy

Parse file [mxnet_resnext50.pb] with binary format successfully.
Target network code snippet is saved as [converted_resnext50.py].
  1. Test Keras Result
$ python3 -m mmdnn.conversion.examples.keras.imagenet_test -p imagenet1k-resnext-50 -s mxnet -n converted_resnext50 -w mxnet_resnext50.npy -i mmdnn/conversion/examples/data/seagull.jpg

[(396, 0.71047646), (398, 0.12266552), (438, 0.063913494), (440, 0.02979695), (417, 0.01949201)]
Test model [imagenet1k-resnext-50] from [mxnet] passed.
  1. Save as Keras model
$ python3 -m mmdnn.conversion.examples.keras.imagenet_test -n converted_resnext50 -w mxnet_resnext50.npy --dump mxnet_resnext50.h5

Keras model file is saved as [mxnet_resnext50.h5], generated by [converted_resnext50.py] and [mxnet_resnext50.npy]

Converted model inference result matches the original inference result. Test passed.


For resnext-101, we tested only without last step and result is ok like resnext-50. Seems we can wait for Keras update to fix this problem.

Thanks.


You can download the resnext-50 related files(Including final h5 file and IR files) from here.

Hi @kitstar , thank you for the detailed procedure.
I did not notice IR weights npy file can be used instead of hdf5 keras weights file.
This is a good specification (though it is slow to load).
Your test sequence command seems set IR weights filename for an argument of KitModel function.
So I think I figure out the situation, and below is summary of error condition I get.

model = KitModel(weight_file=None) # error when resnext , OK when resnet
model = KitModel(weight_file=IR_weights_filename) # OK resnext or resnet. The test mode is run with this setting.

It seems that the converted resnext Kitmodel function does not support when argument is set "None"

Regards.

Glad it help! Thank you for using it and issue submission! Any other idea is welcome!

The limitation of Keras hdf5 object header size seems to be solved
keras-team/keras#7508
I close this issue.
Tensorflow (a backend of Keras) official support of group conv is being discussed here.
tensorflow/tensorflow#3332
When keras supports it , I hope MMdnn convertor will also support it.
Thank you.

Hi @kitstar I followed ur step, but got an error in the final step saving as Keras model, the error is as following:

Traceback (most recent call last):
  File "/home/imsight/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/imsight/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/imsight/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/keras/imagenet_test.py", line 161, in <module>
    tester = TestKeras()
  File "/home/imsight/anaconda3/lib/python3.6/site-packages/mmdnn/conversion/examples/keras/imagenet_test.py", line 31, in __init__
    self.model = self.MainModel.KitModel(self.args.w)
  File "converted_resnext50.py", line 247, in KitModel
    set_layer_weights(model, weights_dict)
  File "converted_resnext50.py", line 27, in set_layer_weights
    current_layer_parameters.extend([cur_dict['mean'], cur_dict['var']])
KeyError: 'mean'

Do you know why this happened? Any suggestions?