mindspore-lab / mindocr

A toolbox of OCR models and algorithms based on MindSpore

Home Page: https://mindspore-lab.github.io/mindocr/

RuntimeError: ({'errCode': 'E80003', 'op_name': 'gather_nd', 'param_name': 'kernel_name', 'param_type': 'str', 'actual_type': 'bool'}, "In op[gather_nd], the parameter[kernel_name]'s type should be [str], but actually is [bool].")

qq452655434 opened this issue

commented

Running text detection with the resnet50 backbone supported by mindocr produces the following error:

——————————————————————————————————————————————————————
[WARNING] KERNEL(46,ffffaab5be50,python):2023-09-11-16:40:01.069.992 [mindspore/ccsrc/plugin/device/ascend/kernel/tbe/tbe_kernel_compile.cc:644] UpdateFusionTypeAndOutputDataDesc] gather_nd_10809780653405783111_0 not in prebuild_res_map_. Op name: Default/network-TrainOneStepWrapper/network-NetWithLossWrapper/_loss_fn-DBLoss/bce_loss-BalancedBCELoss/GatherNd-op1413
Traceback (most recent call last):
File "/data/data/mindocr/./tools/train.py", line 312, in <module>
main(config)
File "/data/data/mindocr/./tools/train.py", line 243, in main
model.train(
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/train/model.py", line 1044, in train
self._train(epoch,
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/train/model.py", line 100, in wrapper
func(self, *args, **kwargs)
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/train/model.py", line 597, in _train
self._train_dataset_sink_process(epoch, train_dataset, list_callback,
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/train/model.py", line 681, in _train_dataset_sink_process
outputs = train_network(*inputs)
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/nn/cell.py", line 620, in __call__
out = self.compile_and_run(*args, **kwargs)
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/nn/cell.py", line 939, in compile_and_run
self.compile(*args, **kwargs)
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/nn/cell.py", line 916, in compile
_cell_graph_executor.compile(self, phase=self.phase,
File "/opt/gde/python/python-3.9.11/lib/python3.9/site-packages/mindspore/common/api.py", line 1388, in compile
result = self._graph_executor.compile(obj, args, kwargs, phase, self._use_vm_mode())
RuntimeError: TBE Single op compile failed. Compile failed op number:1, failed log:op: gather_nd_10809780653405783111_0.


  • Operator Compilation Exception Message: (For framework developers)

2023-09-11 08:40:20.953706+00:00: Query except_msg:Traceback (most recent call last):
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/te_fusion/parallel_compilation.py", line 1627, in run
json_file_path = build_single_op(self._op_module, self._op_func,
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/te_fusion/fusion_manager.py", line 1543, in build_single_op
json_file_path = call_op()
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/te_fusion/fusion_manager.py", line 1526, in call_op
json_file_path = build_static_op(op_info, new_attrs, caxis_valus)
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/te_fusion/fusion_manager.py", line 1454, in build_static_op
opfunc(*inputs, *outputs, *new_attrs, **kwargs)
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/common/register/operation_func_mgr.py", line 197, in wrapper
return func(*args, **kwargs)
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/common/utils/para_check.py", line 536, in _in_wrapper
_check_one_op_param(one_args, formal_parameter_list[i][0],
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/common/utils/para_check.py", line 522, in _check_one_op_param
_check_kernel_name(op_param, param_name, op_name)
File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/common/utils/para_check.py", line 483, in _check_kernel_name
raise RuntimeError(
RuntimeError: ({'errCode': 'E80003', 'op_name': 'gather_nd', 'param_name': 'kernel_name', 'param_type': 'str', 'actual_type': 'bool'}, "In op[gather_nd], the parameter[kernel_name]'s type should be [str], but actually is [bool].")
——————————————————————————————————————————————————————————

Command run:

python ./tools/train.py --config ./configs/det/dbnet/db++_r50_icdar15.yaml

Environment:

mindspore: 2.0.0
cann: 6.3.RC1
driver: 23.0.RC1

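Hello. We have received the issue, and our developers are trying to reproduce it. We will follow up once it is confirmed.
In the meantime, we suggest reinstalling and configuring the specified CANN version by following the tutorial on the MindSpore website. MindSpore releases are tied to specific CANN versions, and a version mismatch can produce the error above when running certain networks (even though mindspore.run_check() reports a normal result in that case).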

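Hello, we ran python ./tools/train.py --config ./configs/det/dbnet/db++_r50_icdar15.yaml in a MindSpore 2.0.0 environment and have not been able to reproduce the issue you reported. You can use the following commands to verify your CANN and driver versions: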

$ cat /usr/local/Ascend/driver/version.info
Version=23.0.rc1
ascendhal_version=6.24.1
aicpu_version=1.0
tdt_version=1.0
log_version=1.0
prof_version=2.0
dvppkernels_version=1.1
tsfw_version=1.0
Innerversion=V100R001C29SPC001B249
compatible_version=[V100R001C29],[V100R001C84]
package_version=23.0.rc1

$ cat /usr/local/Ascend/ascend-toolkit/latest/version.cfg

# version: 1.0
runtime_running_version=[6.3.0.1.241:6.3.RC1]
compiler_running_version=[6.3.0.1.241:6.3.RC1]
opp_running_version=[6.3.0.1.241:6.3.RC1]
toolkit_running_version=[6.3.0.1.241:6.3.RC1]
aoe_running_version=[6.3.0.1.241:6.3.RC1]
ncs_running_version=[6.3.0.1.241:6.3.RC1]
runtime_upgrade_version=[6.3.0.1.241:6.3.RC1]
compiler_upgrade_version=[6.3.0.1.241:6.3.RC1]
opp_upgrade_version=[6.3.0.1.241:6.3.RC1]
toolkit_upgrade_version=[6.3.0.1.241:6.3.RC1]
aoe_upgrade_version=[6.3.0.1.241:6.3.RC1]
ncs_upgrade_version=[6.3.0.1.241:6.3.RC1]
runtime_installed_version=[6.3.0.1.241:6.3.RC1]
compiler_installed_version=[6.3.0.1.241:6.3.RC1]
opp_installed_version=[6.3.0.1.241:6.3.RC1]
toolkit_installed_version=[6.3.0.1.241:6.3.RC1]
aoe_installed_version=[6.3.0.1.241:6.3.RC1]
ncs_installed_version=[6.3.0.1.241:6.3.RC1]
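
The same check can be scripted. The following is a minimal sketch, not part of MindOCR or CANN: it assumes version.cfg keeps the `name=[build:release]` layout shown in the listing above, and the helper names `parse_version_cfg` and `releases` are made up for illustration.

```python
import re
from pathlib import Path

# Assumed line layout, taken from the listing above: name=[build:release]
_LINE = re.compile(r"^(\w+)=\[([^:\]]+):([^\]]+)\]$")


def parse_version_cfg(text):
    """Map each component key to a (build, release) tuple."""
    versions = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments such as "# version: 1.0".
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match:
            versions[match.group(1)] = (match.group(2), match.group(3))
    return versions


def releases(versions):
    """Return the set of distinct release strings across all components."""
    return {release for _build, release in versions.values()}


if __name__ == "__main__":
    # Path copied from the command above; adjust it if CANN is installed elsewhere.
    cfg = Path("/usr/local/Ascend/ascend-toolkit/latest/version.cfg")
    if cfg.exists():
        found = releases(parse_version_cfg(cfg.read_text()))
        print("CANN release(s):", ", ".join(sorted(found)) or "none found")
    else:
        print(f"{cfg} not found")
```

If the script prints more than one release, or a release other than the one the MindSpore installation guide names for your MindSpore version, the toolkit is worth reinstalling before debugging further.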

commented

[root@manas-train-node01 start]# cat /usr/local/Ascend/driver/version.info
Version=23.0.rc1
ascendhal_version=6.24.1
aicpu_version=1.0
tdt_version=1.0
log_version=1.0
prof_version=2.0
dvppkernels_version=1.1
tsfw_version=1.0
Innerversion=V100R001C29SPC001B249
compatible_version=[V100R001C29],[V100R001C84]
package_version=23.0.rc1

[root@manas-train-node01 start]# cat /usr/local/Ascend/ascend-toolkit/latest/version.cfg
# version: 1.0
runtime_running_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
compiler_running_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
opp_running_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
toolkit_running_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
aoe_running_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
runtime_upgrade_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
compiler_upgrade_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
opp_upgrade_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
toolkit_upgrade_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
aoe_upgrade_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
runtime_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
compiler_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
opp_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
toolkit_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
aoe_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
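
The differences between this listing and the reference listing earlier in the thread can also be found mechanically. This is a rough sketch under the same assumption about the `name=[build:release]` layout; `read_cfg` and `compare` are hypothetical helper names, and the two inline strings are short excerpts of the two listings rather than the full files.

```python
def read_cfg(text):
    """Parse name=[build:release] lines into {name: release}."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and comments such as "# version: 1.0".
        if "=" not in line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        _build, _, release = value.strip("[]").partition(":")
        out[name] = release
    return out


def compare(reference, actual):
    """Return (keys missing from actual, {key: (reference release, actual release)})."""
    missing = sorted(set(reference) - set(actual))
    differing = {
        key: (reference[key], actual[key])
        for key in reference
        if key in actual and reference[key] != actual[key]
    }
    return missing, differing


# Excerpt of the reference listing posted by the maintainers.
reference = read_cfg("""
runtime_installed_version=[6.3.0.1.241:6.3.RC1]
ncs_installed_version=[6.3.0.1.241:6.3.RC1]
""")

# Excerpt of the listing from the failing machine.
actual = read_cfg("""
runtime_installed_version=[6.3.T2.0.B107:6.3.RC1.alpha001]
""")

missing, differing = compare(reference, actual)
print("missing:", missing)
print("differing:", differing)
```

On these excerpts the comparison flags `ncs_installed_version` as missing and reports the runtime release as `6.3.RC1.alpha001` where the reference has `6.3.RC1`. It only shows that the two installs differ; it does not by itself prove which difference triggers the `gather_nd` compile failure.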

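This looks like a CANN version problem.
However, another script from the official reference examples, the text recognition one, runs successfully from the shell: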

python tools/train.py --config configs/rec/crnn/crnn_icdar15.yaml

commented

The problem was resolved after reinstalling CANN.