- This is a tool for generating OCR datasets, text detection datasets, and font classification datasets.
- It is designed to make generating OCR data, text detection data, and font recognition data convenient.
- Generate text maps with different fonts, font sizes, colors, and rotation angles based on different corpora
- Support multi-process fast generation
- The text map is filled into the layout block according to the specified layout mode
- Find smooth areas in the image as layout blocks
- Support the extraction and export of blocks in the text area (exports json, txt, and picture files; can generate the VOC and ICDAR_LSVT dataset formats)
- Support annotations for each text level (stored in the json file of lsvt)
- Support users to configure various generation configurations (image reading, generation path, various probabilities)
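To make the export format concrete, here is a minimal sketch of what one annotation record could look like, assuming the tool's json output follows the publicly documented ICDAR LSVT layout (a dict keyed by image id, each entry holding `points`, `transcription`, and `illegibility`). The helper `make_lsvt_entry`, the image id `gt_0`, and the box coordinates are hypothetical and not taken from this project's code.

```python
import json


def make_lsvt_entry(points, transcription, illegible=False):
    """Build one text-instance record in the LSVT-style layout (assumed schema)."""
    return {
        # Polygon vertices as [x, y] integer pairs
        "points": [[int(x), int(y)] for x, y in points],
        # The text content inside the polygon
        "transcription": transcription,
        # True when the text cannot be read
        "illegibility": illegible,
    }


# Hypothetical image id and box, for illustration only.
annotations = {
    "gt_0": [
        make_lsvt_entry([(10, 20), (110, 20), (110, 50), (10, 50)], "hello"),
    ]
}

# ensure_ascii=False keeps non-ASCII transcriptions (e.g. Chinese text) readable.
lsvt_json = json.dumps(annotations, ensure_ascii=False, indent=2)
print(lsvt_json)
```

The actual files written by the tool may contain additional fields, so treat this only as a guide for reading the output.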
- Environment installation (Python 3.6+; a conda environment is recommended)
```
# step 1
pip install -r requirements.txt
# step 2
sh make.sh
```
- Edit the configuration file `config.yml` (optional)
- Run the build script: `python3 run.py`
- Generated data
The generated data is stored in the directory specified by `provider> layout> out_put_dir` in `config.yml`.
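As a rough illustration of how that key path resolves, the sketch below walks a nested dictionary using the same `provider> layout> out_put_dir` notation. It assumes `config.yml` parses into nested mappings (for example via PyYAML's `yaml.safe_load`); the helper `lookup` and the value `"output"` are hypothetical placeholders, not the project's actual code or defaults.

```python
def lookup(config, path, sep=">"):
    """Follow a 'a> b> c' style key path through nested dictionaries."""
    node = config
    for key in path.split(sep):
        # Strip the whitespace around each key so "a> b" and "a>b" both work.
        node = node[key.strip()]
    return node


# Stand-in for the parsed config.yml; the value "output" is a made-up example.
config = {"provider": {"layout": {"out_put_dir": "output"}}}

out_dir = lookup(config, "provider> layout> out_put_dir")
print(out_dir)
```

A missing key raises `KeyError`, which is usually a helpful signal that the configuration file differs from what the path expects.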