Kahsolt / CCF-BDCI-2024-TPU-ocr-deploy

CCF BDCI 2024: OCR model performance optimization on a TPU platform

Repository from GitHub: https://github.com/Kahsolt/CCF-BDCI-2024-TPU-ocr-deploy

CCF-BDCI-2024-TPU-ocr-deploy

Contest page: https://www.datafountain.cn/competitions/1044
Team Name: 识唔识得
Award: Second Prize (2nd place; never quite got the hang of it...)

ℹ This repository deploys the PaddleOCR project to run on a MilkV-Duo board.

Performance evaluation & comparison

⚪ Leaderboard A (ICDAR2019-LSVT, n_sample=2350)

ℹ Each cell reads f-score/precision/recall : valid_infer_time/contest_infer_time : real_fps

| det | rec | CPU + onnx (fp32) | TPU + cvimodel (int8 + bf16) | valid score | contest score | comment |
|-----|-----|-------------------|------------------------------|-------------|---------------|---------|
| v4 | v4 | 0.60724/0.78855/0.49372 : 76.05 | | | | not run on chip |
| v3 | v3 | 0.57585/0.80885/0.44707 : 58.68 | | | | not run on chip |
| v2 | v2 | 0.52051/0.78323/0.38977 : 43.02 | 0.44099/0.65593/0.33215 : 719.539/333.937 : 0.88 | 46.47884 | 79.25498 | too slow |
| v3 | mb | 0.54064/0.79092/0.41069 : 58.13 | 0.42010/0.54821/0.34052 : 433.201/277.942 : 1.22 | 69.98178 | 83.17896 | slow |
| v2 | mb | 0.49098/0.78041/0.35815 : 41.59 | 0.42781/0.63849/0.32166 : 367.944/256.211 : 1.42 | 75.83703 | 85.33433 | (⭐) the most balanced solution |
| mb | mb | 0.34883/0.69257/0.23312 : 41.61 | 0.32475/0.57048/0.22698 : 344.309/256.930 : 1.47 | 73.72358 | 81.15095 | too inaccurate |
Notes:
- valid score uses infer_time = ts_det_infer + ts_rec_infer * (n_crop / n_img), which seems more reasonable for real-world use
- contest score uses infer_time = ts_det_infer + ts_rec_infer, as the contest requires
- real_fps = n_img / (ts_total - (ts_model_load + ts_model_unload)) * 1000
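
The cell format and the formulas in the notes above can be sketched in Python as follows. This is a minimal sketch, not the repository's code: the function names are hypothetical, the timings are assumed to be in milliseconds (which the `* 1000` in the real_fps formula suggests), and the f-score is assumed to be the usual harmonic mean of precision and recall, which agrees with the table values to about four decimal places.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed f-score definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def valid_infer_time(ts_det_infer: float, ts_rec_infer: float,
                     n_crop: int, n_img: int) -> float:
    """infer_time used for the valid score: recognition time is weighted
    by the average number of text crops per image."""
    return ts_det_infer + ts_rec_infer * (n_crop / n_img)


def contest_infer_time(ts_det_infer: float, ts_rec_infer: float) -> float:
    """infer_time used for the contest score: plain sum of both stages."""
    return ts_det_infer + ts_rec_infer


def real_fps(n_img: int, ts_total: float,
             ts_model_load: float, ts_model_unload: float) -> float:
    """Images per second, excluding model load/unload time.
    Assumes all ts_* values are in milliseconds."""
    return n_img / (ts_total - (ts_model_load + ts_model_unload)) * 1000


if __name__ == "__main__":
    # Precision and recall of the v2/mb TPU row in the table above
    print(f_score(0.63849, 0.32166))
    # Made-up numbers, only to show the call shapes
    print(valid_infer_time(100.0, 50.0, n_crop=30, n_img=10))
    print(real_fps(10, ts_total=6000.0, ts_model_load=500.0, ts_model_unload=500.0))
```

Whenever n_crop / n_img differs from 1, the two infer_time definitions diverge, which is why the table reports both values.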

Some (genuinely!!) ridiculous score-farming settings for the contest, adjusting the accuracy vs. time trade-off:

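| input size | f1 | infer_time | real_fps | score | comment |
|------------|----|------------|----------|-------|---------|
| 640 | 0.42781 | 256.211 | 1.42 | 85.33433 | v2-mb baseline |
| 480 | 0.33901 | 155.279 | 1.885 | 90.36170 | best overall ⭐ |
| 320 | 0.20613 | 75.951 | 2.954 | 91.78934 | very fast, but quality drops severely |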
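
To make the trade-off in this table easier to read, the sketch below compares each input size with the 640 baseline. The numbers are copied from the table; the function name and the derived fields (`f1_kept`, `speedup`, `score_gain`) are hypothetical helpers for illustration, not part of the contest's scoring code.

```python
# (input size, f1, infer_time, score), copied from the table above
ROWS = [
    (640, 0.42781, 256.211, 85.33433),
    (480, 0.33901, 155.279, 90.36170),
    (320, 0.20613, 75.951, 91.78934),
]


def compare_to_baseline(rows, baseline_size=640):
    """Return, per input size, the fraction of baseline f1 that is kept,
    the inference speedup, and the score difference vs. the baseline."""
    base = next(r for r in rows if r[0] == baseline_size)
    _, base_f1, base_time, base_score = base
    result = {}
    for size, f1, infer_time, score in rows:
        result[size] = {
            "f1_kept": f1 / base_f1,
            "speedup": base_time / infer_time,
            "score_gain": score - base_score,
        }
    return result


if __name__ == "__main__":
    for size, stats in compare_to_baseline(ROWS).items():
        print(size, {k: round(v, 3) for k, v in stats.items()})
```

At 480 the model keeps roughly 79% of the baseline f1 while running about 1.65x faster; at 320 it keeps less than half of the f1, yet the score still rises.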

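⚪ Leaderboard B (MSRA-TD500, n_sample=500)

| input size | infer_time | real_fps | comment |
|------------|------------|----------|---------|
| 640 | 313.969 | 0.42 | source images are larger, so ts_det_infer is higher than on leaderboard A |
| 480 | 198.213 | 0.52 | little quality loss compared with 640 (⭐) |
| 320 | 115.033 | 0.58 | compared with 480, text boxes stick together and more text is missed; the large source images make load_img drag real_fps down badly |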

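⚪ Leaderboard B2 (unknown dataset, n_sample=3992, resampled under 640x640)

| input size | infer_time | real_fps | comment |
|------------|------------|----------|---------|
| 640 | 255.359 | 1.96 | downsampling in advance avoids a lot of memory swapping, so overall throughput improves |
| 480 | 152.944 | 2.70 | |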
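
The README does not say how the B2 images were resampled. As a hedged illustration, the sketch below computes a target size that fits inside 640x640 while keeping the aspect ratio and never upscaling; both of those choices, and the `fit_within` name, are assumptions. The actual pixel resampling would then be done with an image library of your choice.

```python
from typing import Tuple


def fit_within(width: int, height: int, limit: int = 640) -> Tuple[int, int]:
    """Return (new_width, new_height) that fits inside limit x limit.

    Keeps the aspect ratio and never enlarges an image that already fits
    (both are assumptions, not taken from the repository).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, limit / width, limit / height)
    new_w = min(limit, max(1, round(width * scale)))
    new_h = min(limit, max(1, round(height * scale)))
    return new_w, new_h


if __name__ == "__main__":
    for size in [(1920, 1080), (500, 1000), (320, 200)]:
        print(size, "->", fit_within(*size))
```

Doing this once on the host means the board only ever loads small images, which matches the comment above about reduced memory swapping.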

Environment setup

⚪ Getting resources (run on Windows)

  • downloads\download.cmd
  • git clone https://github.com/Kahsolt/tpu-sdk-cv180x-ocr

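⚪ Host machine (model compilation, this repository!)

ℹ Optional: you can skip this step and use my precompiled models in tpu-sdk-cv180x-ocr/cvimodels

  • Download and convert the models: paddle -> onnx (run on Windows)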
    • pip install -r requirements.txt
    • run models\download_and_convert.cmd
  • Compile the model files: onnx -> cvimodel (run in Docker container tpu-mlir)
    • bash ./compile_cvimodel_all.sh

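⚪ Host machine (runtime compilation, sub-repository tpu-sdk-cv180x-ocr)

ℹ Optional: you can skip this step and use my precompiled runtime in tpu-sdk-cv180x-ocr/samples/ppocr_*

  • See each sub-project's README: tpu-sdk-cv180x-ocr/samples/ppocr_*/README.md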


by Armit 2024/09/14

License: MIT License

