mindspore-lab / mindocr

A toolbox of OCR models, algorithms, and pipelines based on MindSpore

Home Page:https://mindspore-lab.github.io/mindocr/

Error when evaluating the validation set during training:

GZ-Metal-Cell opened this issue

As the title says: with MindOCR v0.3.1, training DBNet with an r18 (ResNet-18) backbone on totaltext fails after 1 to 2 epochs with the following error:

Traceback (most recent call last):
  File "/home/ma-user/work/mindocr/tools/train.py", line 318, in <module>
    main(config)
  File "/home/ma-user/work/mindocr/tools/train.py", line 249, in main
    model.train(
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/train/model.py", line 1061, in train
    self._train(epoch,
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/train/model.py", line 113, in wrapper
    func(self, *args, **kwargs)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/train/model.py", line 619, in _train
    self._train_dataset_sink_process(epoch, train_dataset, list_callback,
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/train/model.py", line 731, in _train_dataset_sink_process
    list_callback.on_train_epoch_end(run_context)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/train/callback/_callback.py", line 402, in on_train_epoch_end
    cb.on_train_epoch_end(run_context)
  File "/home/ma-user/work/mindocr/mindocr/utils/callbacks.py", line 193, in on_train_epoch_end
    measures = self.net_evaluator.eval()
  File "/home/ma-user/work/mindocr/mindocr/utils/evaluator.py", line 152, in eval
    preds = self.postprocessor(preds, **data_info)
  File "/home/ma-user/work/mindocr/mindocr/postprocess/det_base_postprocess.py", line 102, in __call__
    result = self._postprocess(pred, **kwargs)
  File "/home/ma-user/work/mindocr/mindocr/postprocess/det_db_postprocess.py", line 82, in _postprocess
    sample_polys, sample_scores = self._extract_preds(pr, segm)
  File "/home/ma-user/work/mindocr/mindocr/postprocess/det_db_postprocess.py", line 113, in _extract_preds
    poly = np.array(expand_poly(points, distance=poly.area * self._expand_ratio / poly.length))
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
[WARNING] ME(3002438:281461320040800,MainProcess):2024-01-05-15:35:29.222.905 [mindspore/dataset/engine/datasets_user_defined.py:264] Generator receives a termination signal, stop waiting for data from subprocess.
[WARNING] MD(3002438,ffff8a45f010,python):2024-01-05-15:35:34.445.975 [mindspore/ccsrc/minddata/dataset/engine/datasetops/data_queue_op.cc:115] ~DataQueueOp] 
preprocess_batch: 1659;
batch_queue: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0;
            push_start_time -> push_end_time
2024-01-05-15:35:14.274.416 -> 2024-01-05-15:35:14.322.522
2024-01-05-15:35:15.497.472 -> 2024-01-05-15:35:15.556.873
2024-01-05-15:35:16.658.987 -> 2024-01-05-15:35:16.702.952
2024-01-05-15:35:16.901.313 -> 2024-01-05-15:35:16.946.270
2024-01-05-15:35:17.129.239 -> 2024-01-05-15:35:17.171.914
2024-01-05-15:35:17.523.003 -> 2024-01-05-15:35:17.557.942
2024-01-05-15:35:17.659.011 -> 2024-01-05-15:35:17.695.488
2024-01-05-15:35:17.824.658 -> 2024-01-05-15:35:17.857.248
2024-01-05-15:35:18.099.311 -> 2024-01-05-15:35:18.137.962
2024-01-05-15:35:19.059.404 -> 2024-01-05-15:35:19.089.938
For more details, please refer to the FAQ at https://www.mindspore.cn/docs/en/master/faq/data_processing.html.

The environment is as follows:

(MindSpore) [ma-user ~]$conda list
/home/ma-user/anaconda3/lib/python3.7/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
# packages in environment at /home/ma-user/anaconda3/envs/MindSpore:
#
# Name                    Version                   Build  Channel
_openmp_mutex             4.5                       2_gnu    conda-forge
absl-py                   0.13.0                    <pip>
addict                    2.4.0                     <pip>
albumentations            0.4.5                     <pip>
APScheduler               3.8.1                     <pip>
arrow                     1.2.3                     <pip>
asgiref                   3.5.2                     <pip>
astroid                   2.11.7                    <pip>
asttokens                 2.0.8                     <pip>
astunparse                1.6.3                     <pip>
attrs                     19.3.0                    <pip>
backcall                  0.2.0                     <pip>
backports.zoneinfo        0.2.1                     <pip>
binaryornot               0.4.4                     <pip>
boto3                     1.12.22                   <pip>
botocore                  1.15.49                   <pip>
bzip2                     1.0.8                hfd63f10_2    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
ca-certificates           2022.6.15            h4fd8a4c_0    conda-forge
certifi                   2022.9.24                 <pip>
cffi                      1.14.0                    <pip>
chardet                   3.0.4                     <pip>
charset-normalizer        2.0.12                    <pip>
click                     8.1.3                     <pip>
cloudpickle               1.3.0                     <pip>
colorama                  0.4.4                     <pip>
configparser              5.2.0                     <pip>
cookiecutter              2.1.1                     <pip>
coverage                  6.4.3                     <pip>
cryptography              3.4.7                     <pip>
cycler                    0.11.0                    <pip>
Cython                    3.0.2                     <pip>
dask                      2.18.1                    <pip>
debugpy                   1.6.3                     <pip>
decorator                 4.4.1                     <pip>
defusedxml                0.7.1                     <pip>
dill                      0.3.5.1                   <pip>
Django                    3.2.16                    <pip>
docutils                  0.15.2                    <pip>
easydict                  1.9                       <pip>
entrypoints               0.4                       <pip>
ephemeral-port-reserve    1.1.4                     <pip>
esdk-obs-python           3.20.1                    <pip>
et-xmlfile                1.1.0                     <pip>
Flask                     2.1.0                     <pip>
fonttools                 4.37.4                    <pip>
freetype-py               2.3.0                     <pip>
future                    0.18.2                    <pip>
futures                   3.1.1                     <pip>
gast                      0.3.2                     <pip>
gnureadline               8.1.2                     <pip>
google-pasta              0.2.0                     <pip>
grpcio                    1.60.0                    <pip>
grpcio-tools              1.26.0                    <pip>
gunicorn                  20.1.0                    <pip>
h5py                      3.9.0                     <pip>
idna                      2.10                      <pip>
image                     1.5.28                    <pip>
imageio                   2.9.0                     <pip>
imgaug                    0.2.6                     <pip>
importlib-metadata        5.0.0                     <pip>
iniconfig                 1.1.1                     <pip>
ipyfilechooser            0.6.0                     <pip>
ipykernel                 6.7.0                     <pip>
ipython                   7.34.0                    <pip>
ipython-genutils          0.2.0                     <pip>
ipywidgets                8.0.4                     <pip>
isort                     5.10.1                    <pip>
itsdangerous              2.1.2                     <pip>
jdcal                     1.4.1                     <pip>
jedi                      0.18.1                    <pip>
Jinja2                    3.0.1                     <pip>
jinja2-time               0.2.0                     <pip>
jmespath                  0.10.0                    <pip>
joblib                    1.3.2                     <pip>
jupyter-client            7.3.4                     <pip>
jupyter-core              4.11.1                    <pip>
jupyterlab-widgets        3.0.5                     <pip>
Keras                     2.3.1                     <pip>
Keras-Applications        1.0.8                     <pip>
Keras-Preprocessing       1.1.2                     <pip>
keyboard                  0.13.5                    <pip>
kfac                      0.2.0                     <pip>
kiwisolver                1.1.0                     <pip>
lanms                     1.0.2                     <pip>
lazy-import               0.2.2                     <pip>
lazy-object-proxy         1.7.1                     <pip>
ld_impl_linux-aarch64     2.36.1               h02ad14f_2    conda-forge
libcst                    0.4.7                     <pip>
libffi                    3.4.2                h3557bc0_5    conda-forge
libgcc-ng                 12.1.0              h3242a24_16    conda-forge
libgomp                   12.1.0              h3242a24_16    conda-forge
libnsl                    2.0.0                hf897c2e_0    conda-forge
libstdcxx-ng              12.1.0              hd01590b_16    conda-forge
libuuid                   2.32.1            hf897c2e_1000    conda-forge
libzlib                   1.2.12               h4e544f5_1    conda-forge
lmdb                      1.4.1                     <pip>
lxml                      4.9.3                     <pip>
MarkupSafe                2.1.1                     <pip>
marshmallow               3.18.0                    <pip>
matplotlib                3.5.1                     <pip>
matplotlib-inline         0.1.3                     <pip>
mccabe                    0.7.0                     <pip>
mindarmour                1.9.0                     <pip>
mindformers               0.3.0                     <pip>
mindinsight               1.9.0                     <pip>
mindocr                   0.2.0                     <pip>
mindspore                 2.1.1                     <pip>
mmcv                      2.0.1                     <pip>
moxing-framework          2.0.1.rc0.ffd1c0c8           <pip>
mpmath                    1.2.1                     <pip>
mypy-extensions           0.4.3                     <pip>
ncurses                   6.3                  headf329_1    conda-forge
nest-asyncio              1.5.5                     <pip>
networkx                  2.6.3                     <pip>
ninja                     1.10.2.3                  <pip>
numba                     0.47.0                    <pip>
numexpr                   2.8.6                     <pip>
numpy                     1.26.2                    <pip>
opencv-python             4.8.0.76                  <pip>
opencv-python-headless    4.8.1.78                  <pip>
openpyxl                  3.0.3                     <pip>
openssl                   3.0.5                h4e544f5_0    conda-forge
packaging                 21.3                      <pip>
pandas                    1.1.3                     <pip>
parso                     0.8.3                     <pip>
pathlib2                  2.3.7                     <pip>
pexpect                   4.8.0                     <pip>
pickleshare               0.7.5                     <pip>
Pillow                    9.2.0                     <pip>
pip                       23.2.1           py39hd43f75c_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
platformdirs              2.5.2                     <pip>
pluggy                    1.0.0                     <pip>
prettytable               2.1.0                     <pip>
prometheus-client         0.8.0                     <pip>
prompt-toolkit            3.0.30                    <pip>
protobuf                  3.20.1                    <pip>
psutil                    5.7.0                     <pip>
ptyprocess                0.7.0                     <pip>
py                        1.11.0                    <pip>
pyclipper                 1.3.0.post5               <pip>
pycocotools               2.0.7                     <pip>
pycparser                 2.21                      <pip>
pycryptodome              3.10.1                    <pip>
Pygments                  2.12.0                    <pip>
pylint                    2.14.5                    <pip>
pyparsing                 3.0.9                     <pip>
pypng                     0.20220715.0              <pip>
pytest                    7.1.2                     <pip>
python                    3.9.13          h5016f1d_0_cpython    conda-forge
python-dateutil           2.8.2                     <pip>
python-slugify            8.0.1                     <pip>
pytz                      2022.4                    <pip>
pytz-deprecation-shim     0.1.0                     <pip>
PyWavelets                1.1.1                     <pip>
PyYAML                    5.3.1                     <pip>
pyzmq                     23.2.0                    <pip>
rapidfuzz                 3.5.2                     <pip>
readline                  8.1.2                h38e3740_0    conda-forge
requests                  2.31.0                    <pip>
requests-futures          1.0.0                     <pip>
s3transfer                0.3.7                     <pip>
scikit-learn              1.0.2                     <pip>
scipy                     1.11.4                    <pip>
semantic-version          2.8.5                     <pip>
seqeval                   1.2.2                     <pip>
setuptools                68.0.0           py39hd43f75c_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
setuptools-scm            8.0.4                     <pip>
Shapely                   1.8.4                     <pip>
six                       1.16.0                    <pip>
sqlite                    3.39.0               hc74f5b8_0    conda-forge
sqlparse                  0.4.3                     <pip>
sympy                     1.4                       <pip>
synr                      0.5.0                     <pip>
tabulate                  0.8.9                     <pip>
tenacity                  8.0.1                     <pip>
tensorflow-probability    0.10.1                    <pip>
terminaltables            3.1.0                     <pip>
text-unidecode            1.3                       <pip>
threadpoolctl             3.2.0                     <pip>
tifffile                  2021.11.2                 <pip>
tk                        8.6.12               hd8af866_0    conda-forge
toml                      0.10.1                    <pip>
tomli                     2.0.1                     <pip>
tomlkit                   0.11.5                    <pip>
topi                      0.4.0                     <pip>
tornado                   6.2                       <pip>
tqdm                      4.46.1                    <pip>
traitlets                 5.3.0                     <pip>
treelib                   1.6.1                     <pip>
typed-ast                 1.5.4                     <pip>
typing-inspect            0.8.0                     <pip>
typing_extensions         4.4.0                     <pip>
tzdata                    2023c                h04d1e81_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tzdata                    2022.7                    <pip>
tzlocal                   4.2                       <pip>
umap-learn-modified       0.3.8                     <pip>
urllib3                   2.0.4                     <pip>
wcwidth                   0.2.5                     <pip>
Werkzeug                  2.2.2                     <pip>
wheel                     0.38.4           py39hd43f75c_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
widgetsnbextension        4.0.5                     <pip>
wrapt                     1.14.1                    <pip>
xlrd                      1.2.0                     <pip>
XlsxWriter                3.0.3                     <pip>
xml-python                0.4.3                     <pip>
xmltodict                 0.12.0                    <pip>
xz                        5.2.5                h6dd45c4_1    conda-forge
yapf                      0.32.0                    <pip>
zipp                      3.8.1                     <pip>
zlib                      1.2.12               h4e544f5_1    conda-forge

Is there a solution? Thanks!

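Hello, and thank you for the feedback.
v0.3.x is adapted to MindSpore r2.2.10 and its subsequent bug-fix releases; please prefer MindSpore r2.2.11.
Based on the error log you provided, we suspect a bug in how the data postprocessing function handles the dataset format. Our developers are debugging it.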
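
For context, the `ValueError` in the traceback is the message NumPy 1.24 and later raise when `np.array` receives a nested sequence whose elements have different lengths, and the reported environment lists numpy 1.26.2. A detected shape of `(2,)` would be consistent with `expand_poly` returning two paths of different lengths, but that is an assumption and is not confirmed anywhere in this thread. The sketch below only reproduces the NumPy behaviour on made-up data; `is_ragged` and `to_single_polygon` are hypothetical helpers, not MindOCR functions.

```python
import numpy as np

# Illustrative data only: two polygons with different vertex counts, which is
# one way a polygon-offset step could hand back more than one path.
square = [[0, 0], [4, 0], [4, 4], [0, 4]]
triangle = [[10, 10], [12, 10], [11, 12]]
ragged = [square, triangle]


def is_ragged(paths):
    """Return True when the paths do not all have the same number of points."""
    return len({len(p) for p in paths}) > 1


def to_single_polygon(paths):
    """Hypothetical guard: return an (N, 2) array only for exactly one path.

    Returns None otherwise, so a caller could skip the candidate instead of
    calling np.array on a ragged list.
    """
    if len(paths) != 1:
        return None
    return np.array(paths[0]).reshape(-1, 2)


if __name__ == "__main__":
    try:
        # Raises ValueError on NumPy >= 1.24; older releases only warn.
        np.array(ragged)
    except ValueError as err:
        print("np.array failed:", err)
    print(to_single_polygon(ragged))
    print(to_single_polygon([square]).shape)
```

Whether skipping such candidates is the right behaviour for DBNet postprocessing is a design decision for the maintainers; the guard is shown only to make the failure mode concrete.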

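Hello, we tested with the MindSpore r2.2.11 release and MindOCR v0.3.1 and could not reproduce the problem you reported.
Part of the training log is attached below.
We suggest that you:

  1. Install MindSpore r2.2.11 instead, and install the matching dependencies via requirements.txt;
  2. Check whether any errors were reported when converting the Total-Text dataset.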
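
Before reinstalling, it may help to confirm which versions are actually active in the environment. The `conda list` output above shows `mindspore 2.1.1` and a pip-installed `mindocr 0.2.0`, while the report refers to MindOCR v0.3.1, presumably the source checkout under `/home/ma-user/work/mindocr`. The following is a minimal sketch using only the Python standard library; `installed_versions` is a hypothetical helper name and the default package list is just an illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mindspore", "mindocr", "numpy")):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Run it with the same interpreter that launches `tools/train.py`, so that the versions reported are the ones the training job actually imports.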
(ms-dev) [psw@10-90-43-193 mindocr]$python tools/train.py -c configs/det/dbnet/db_r18_totaltext.yaml
[2024-02-01 01:45:11] mindocr.train INFO - Standalone training. Device id: 0, specified by system.device_id in yaml config file or is default value 0.
[2024-02-01 01:45:14] mindocr.data.builder INFO - Creating dataloader (training=True) for device 0. Number of data samples: 1255 per device (1255 total).
[2024-02-01 01:45:17] mindocr.data.builder INFO - Creating dataloader (training=False) for device 0. Number of data samples: 300 per device (300 total).
[2024-02-01 01:45:17] mindocr.models.utils.load_model INFO - Finish loading model checkoint from https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet18_synthtext-251ef3dd.ckpt. If no parameter fail-load warning displayed, all checkpoint params have been successfully loaded.
[2024-02-01 01:45:17] mindocr.optim.param_grouping INFO - no parameter grouping is applied.
[2024-02-01 01:45:22] mindocr.train INFO -
========================================
Distribute: False
Model: det_resnet18-DBFPN-DBHead
Total number of parameters: 12351042
Total number of trainable parameters: 12340930
Data root: /ms_test3/psw/code/mindocr
Optimizer: SGD
Weight decay: 0.0001
Batch size: 20
Num devices: 1
Gradient accumulation steps: 1
Global batch size: 20x1x1=20
LR: 0.007
Scheduler: polynomial_decay
Steps per epoch: 62
Num epochs: 1200
Clip gradient: False
EMA: True
AMP level: O0
Loss scaler: {'type': 'dynamic', 'loss_scale': 512, 'scale_factor': 2, 'scale_window': 1000}
Drop overflow update: False
========================================

Start training... (The first epoch takes longer, please wait...)

[2024-02-01 01:46:13] mindocr.utils.callbacks INFO - epoch: [1/1200], loss: 2.780020, epoch time: 50.894 s, per step time: 820.863 ms, fps per card: 24.36 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:13<00:00, 21.51it/s]
[2024-02-01 01:46:27] mindocr.utils.callbacks INFO - Performance: {'recall': 0.5787810383747178, 'precision': 0.8835286009648519, 'f-score': 0.6993998908892527}, eval time: 14.135913610458374
[2024-02-01 01:46:27] mindocr.utils.callbacks INFO - => Best f-score: 0.6993998908892527, checkpoint saved.
[2024-02-01 01:46:51] mindocr.utils.callbacks INFO - epoch: [2/1200], loss: 2.967140, epoch time: 23.089 s, per step time: 372.398 ms, fps per card: 53.71 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:13<00:00, 23.02it/s]
[2024-02-01 01:47:04] mindocr.utils.callbacks INFO - Performance: {'recall': 0.6072234762979684, 'precision': 0.9026845637583892, 'f-score': 0.7260458839406208}, eval time: 13.16214632987976
[2024-02-01 01:47:04] mindocr.utils.callbacks INFO - => Best f-score: 0.7260458839406208, checkpoint saved.
[2024-02-01 01:47:29] mindocr.utils.callbacks INFO - epoch: [3/1200], loss: 2.520107, epoch time: 24.156 s, per step time: 389.613 ms, fps per card: 51.33 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:14<00:00, 21.20it/s]
[2024-02-01 01:47:43] mindocr.utils.callbacks INFO - Performance: {'recall': 0.6261851015801354, 'precision': 0.8604218362282878, 'f-score': 0.7248497517637835}, eval time: 14.256935596466064
[2024-02-01 01:48:08] mindocr.utils.callbacks INFO - epoch: [4/1200], loss: 2.506505, epoch time: 24.326 s, per step time: 392.352 ms, fps per card: 50.97 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:12<00:00, 24.39it/s]
[2024-02-01 01:48:21] mindocr.utils.callbacks INFO - Performance: {'recall': 0.5431151241534988, 'precision': 0.9065561416729465, 'f-score': 0.6792772444946358}, eval time: 12.399577140808105
[2024-02-01 01:48:46] mindocr.utils.callbacks INFO - epoch: [5/1200], loss: 2.584358, epoch time: 24.736 s, per step time: 398.971 ms, fps per card: 50.13 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:12<00:00, 23.79it/s]
[2024-02-01 01:48:59] mindocr.utils.callbacks INFO - Performance: {'recall': 0.582844243792325, 'precision': 0.8909592822636301, 'f-score': 0.7046943231441047}, eval time: 12.713114023208618
[2024-02-01 01:49:24] mindocr.utils.callbacks INFO - epoch: [6/1200], loss: 2.986135, epoch time: 24.694 s, per step time: 398.292 ms, fps per card: 50.21 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:12<00:00, 23.93it/s]
[2024-02-01 01:49:37] mindocr.utils.callbacks INFO - Performance: {'recall': 0.6419864559819413, 'precision': 0.8915360501567398, 'f-score': 0.7464566929133857}, eval time: 12.641420125961304
[2024-02-01 01:49:37] mindocr.utils.callbacks INFO - => Best f-score: 0.7464566929133857, checkpoint saved.
[2024-02-01 01:50:02] mindocr.utils.callbacks INFO - epoch: [7/1200], loss: 2.739786, epoch time: 24.156 s, per step time: 389.615 ms, fps per card: 51.33 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:12<00:00, 24.54it/s]
[2024-02-01 01:50:14] mindocr.utils.callbacks INFO - Performance: {'recall': 0.6297968397291196, 'precision': 0.908203125, 'f-score': 0.743801652892562}, eval time: 12.314828634262085
[2024-02-01 01:50:38] mindocr.utils.callbacks INFO - epoch: [8/1200], loss: 2.367562, epoch time: 23.545 s, per step time: 379.756 ms, fps per card: 52.67 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:13<00:00, 22.80it/s]
[2024-02-01 01:50:51] mindocr.utils.callbacks INFO - Performance: {'recall': 0.5688487584650113, 'precision': 0.9217264081931237, 'f-score': 0.7035175879396984}, eval time: 13.259095668792725
[2024-02-01 01:51:16] mindocr.utils.callbacks INFO - epoch: [9/1200], loss: 2.317336, epoch time: 24.136 s, per step time: 389.283 ms, fps per card: 51.38 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:14<00:00, 20.05it/s]
[2024-02-01 01:51:31] mindocr.utils.callbacks INFO - Performance: {'recall': 0.6781038374717833, 'precision': 0.8637147786083956, 'f-score': 0.7597369752149721}, eval time: 15.05770754814148
[2024-02-01 01:51:31] mindocr.utils.callbacks INFO - => Best f-score: 0.7597369752149721, checkpoint saved.
[2024-02-01 01:51:55] mindocr.utils.callbacks INFO - epoch: [10/1200], loss: 2.895481, epoch time: 23.572 s, per step time: 380.187 ms, fps per card: 52.61 img/s
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:11<00:00, 25.77it/s]
[2024-02-01 01:52:07] mindocr.utils.callbacks INFO - Performance: {'recall': 0.636117381489842, 'precision': 0.9066924066924067, 'f-score': 0.7476784292915892}, eval time: 11.734557867050171

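Since we could not reproduce the problem you reported, this issue is closed for now.
Please try installing the MindSpore version that MindOCR is adapted to.
If you have further questions, please contact us.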