FelixEdwards / text_renderer

Generate text images for training deep learning ocr model

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

项目简介

基于原text_render 根据自己的需求做了一些修改; 主要增加了对于色彩的支持(包含文字色彩与背景色彩图); Support both latin and non-latin text.

  • Ubuntu 16.04
  • python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Python3 main.py

Demo

example1.jpg example2.jpg

example3.jpg example4.jpg

文字变换特效

编辑 configs/default.yaml 来实现文字弯曲,模糊等效果,详见原text_render; 新增表格线颜色控制,字体颜色控制,添加冗余文字

Effect name Image
Origin(Font size 25) origin
Perspective Transform perspective
Random Crop rand_crop
Curve curve
Light border light border
Dark border dark border
Random char space big random char space big
Random char space small random char space small
Middle line middle line
Table line table line
Table line(offset) table line offset
Under line under line
Emboss emboss
Reverse color reverse color
Blur blur
font_color font_color
line_color line_color
extra_words extra_words
texture texture

颜色配置说明: 程序会随机拾取下方所配置颜色种类(按照fraction中的概率,所有颜色fraction相加需要等于1),选中颜色种类后,rgb由l_boundary与h_boundary中取一个随机数确定; font_color 同理

line_color:
  black:
    fraction: 0.3
    l_boundary: 0,0,0
    h_boundary: 64,64,64
  gray:
    fraction: 0.7
    l_boundary: 77,77,77
    h_boundary: 230,230,230

冗余文字说明: 如开启选项,程序会按照设定的总概率随机选取一段独立的文字(颜色,字体及字号大小均采用正文的设置),生成的文字长度在1到最长文本长度之间随机。 top 与 bottom 中的 fraction 决定了冗余文字生成在上方或下方的概率; *注:由于冗余文字尽量生成在上方与下方,所以可能生成在图片之外,所以最终出现的概率会小于设定值;

extra_words:
  enable: true
  fraction: 0.05
  top:
    fraction: 0.5
  bottom:
    fraction: 0.5

文字纹理说明: 开启选项后,会在文本上生成perling噪音的侵蚀效果(速度较慢)。可以调节octaves,persistence,scale参数调整纹理的效果(改变似乎不是非常明显)

texture:
  enable: true
  fraction: 0.05

  cloud:
    enable: true
    fraction: 1
    octaves: 2
    persistence: 0.4
    scale : 2000  

Strict mode

For no-latin language(e.g Chinese), it's very common that some fonts only support limited chars. In this case, you will get bad results like these:

bad_example1

bad_example2

bad_example3

Select fonts that support all chars in --chars_file is annoying. Run main.py with --strict option, renderer will retry get text from corpus during generate processing until all chars are supported by a font.

Tools

You can use check_font.py script to check how many chars your font not support in --chars_file:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['', '', '广', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '','', '', '', ''...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]

Generate image using GPU

If you want to use GPU to make generate image faster, first compile opencv with CUDA. Compiling OpenCV with CUDA support

Then build Cython part, and add --gpu option when run main.py

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Run python3 main.py --debug will save images with extract information. You can see how perspectiveTransform works and all bounding/rotated boxes.

debug_demo

Todo

  • 增加同一张图中不同字体、大小与颜色的设置;
  • 增加文字侵蚀,打印不完整的效果;

About

Generate text images for training deep learning ocr model

License:Other


Languages

Language:Python 89.7%Language:C++ 10.1%Language:Dockerfile 0.1%