onnxruntime INVALID_PROTOBUF Load model from / failed Protobuf parsing failed[Bug]

YuantianGao opened this issue 5 months ago

Checklist

1. I have searched related issues but cannot get the expected help.
2. I have read the FAQ documentation but cannot get the expected help.
3. The bug has not been fixed in the latest version.

Describe the bug

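I trained a MobileNetV3 model with mmpretrain and converted it with mmdeploy, but running the inference example from the official documentation raised the following error:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from / failed: Protobuf parsing failed.
Both training and conversion completed without any problems.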
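
Since training and conversion reportedly succeeded, it may help to inspect the exported file itself before involving mmdeploy. The sketch below is not from the original report: `diagnose_model_file` is a hypothetical helper, and any path passed to it is a placeholder. It uses only the standard library to rule out a missing path, a directory, an empty file, or a Git LFS pointer file, all of which can hand a protobuf parser something that is not a serialized model. Whether any of these applies here is an assumption.

```python
import os

# First bytes of a Git LFS pointer file (a small text stub, not the real model).
GIT_LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def diagnose_model_file(path):
    """Return a list of findings about `path`; an empty list means no obvious problem.

    Hypothetical helper for illustration; it does not parse the ONNX graph.
    """
    if not os.path.exists(path):
        return ["path does not exist"]
    if os.path.isdir(path):
        return ["path is a directory, not a model file"]
    if os.path.getsize(path) == 0:
        return ["file is empty"]
    # Read only as many bytes as the LFS marker is long.
    with open(path, "rb") as f:
        head = f.read(len(GIT_LFS_PREFIX))
    if head == GIT_LFS_PREFIX:
        return ["file is a Git LFS pointer, not the actual model"]
    return []
```

If the list comes back empty, loading the file with `onnx.load(path)` and passing the result to `onnx.checker.check_model` is a reasonable next step, because it exercises the protobuf parser without onnxruntime or mmdeploy in the loop.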

Reproduction

Config generated by mmpretrain:

dalei_kind = 'dalei_02'
data_preprocessor = dict(
    mean=[
        0.485,
        0.456,
        0.406,
    ],
    num_classes=11,
    std=[
        0.229,
        0.224,
        0.225,
    ],
    to_rgb=True)
data_root = '/openmm_mount/data/'
dataset_type = 'CustomDataset'
default_hooks = dict(
    checkpoint=dict(interval=1, type='CheckpointHook'),
    logger=dict(interval=100, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(enable=False, type='VisualizationHook'))
default_scope = 'mmpretrain'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = None
log_level = 'INFO'
model = dict(
    backbone=dict(arch='small', type='MobileNetV3'),
    head=dict(
        act_cfg=dict(type='HSwish'),
        dropout_rate=0.2,
        in_channels=576,
        init_cfg=dict(
            bias=0.0, layer='Linear', mean=0.0, std=0.01, type='Normal'),
        loss=dict(loss_weight=1.0, type='CrossEntropyLoss'),
        mid_channels=[
            1024,
        ],
        num_classes=11,
        topk=(
            1,
            5,
        ),
        type='StackedLinearClsHead'),
    neck=dict(type='GlobalAveragePooling'),
    type='ImageClassifier')
optim_wrapper = dict(
    optimizer=dict(lr=0.1, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = dict(
    by_epoch=True, gamma=0.1, milestones=[
        30,
        60,
        90,
    ], type='MultiStepLR')
randomness = dict(deterministic=False, seed=None)
resume = False
test_cfg = dict()
test_dataloader = dict(
    batch_size=64,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        data_root='/openmm_mount/data/dalei_02/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(scale=(
                256,
                256,
            ), type='Resize'),
            dict(type='PackInputs'),
        ],
        type='CustomDataset'),
    num_workers=2,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(topk=(1, ), type='Accuracy')
vis_backends = [
    dict(type='LocalVisBackend'),
]
visualizer = dict(
    type='UniversalVisualizer', vis_backends=[
        dict(type='LocalVisBackend'),
    ])
work_dir = './work_dirs/dalei_02'

Environment

onnxruntime-gpu 1.15.1
onnx 1.15.0

Error traceback

No response

Yuantian Gao · Answer 1 · Fri Dec 29 2023 15:21:38 GMT+0800 (China Standard Time)

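The mmdeploy conversion config is classification_onnxruntime_static.py.
The inference code is the official example; the model config it uses is the mmpretrain-generated training config shown above.
The inference code is as follows: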
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch

deploy_cfg = 'configs/mmpretrain/classification_onnxruntime_dynamic.py'
model_cfg = './resnet18_8xb32_in1k.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmpretrain/ort/end2end.onnx']
image = 'tests/data/tiger.jpeg'

# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)

# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)

# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)

# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)

# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='output_classification.png')
```
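
The reported message names `/` as the model path, while the script passes `backend_model` as a one-element list. One possible explanation, offered as an assumption and not confirmed anywhere in this thread, is that the failing run passed a plain string instead. If the consumer takes the first element of its argument, a string yields its first character (`/` for an absolute path), whereas a list yields the whole path. The sketch below shows that difference in plain Python; `as_file_list` is a hypothetical helper and not part of mmdeploy, and the path is a placeholder.

```python
def as_file_list(backend_files):
    """Wrap a single path string in a list; return other sequences as a list.

    Hypothetical helper for illustration only.
    """
    if isinstance(backend_files, (str, bytes)):
        return [backend_files]
    return list(backend_files)


path = "/work/end2end.onnx"  # placeholder path, not taken from the report

# Indexing a string returns a single character, not the path.
first_from_str = path[0]
# Indexing the normalized list returns the full path.
first_from_list = as_file_list(path)[0]

print(repr(first_from_str))
print(repr(first_from_list))
```

Printing the value that is actually handed to `build_backend_model` in the failing run would confirm or rule out this guess quickly.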