seekingdeep / SDL-Document-Image-Generation

SDL: Synthetic Document Image Generation, a tool that generates document images with multi-level annotations.


SDL: Synthetic Document Layout dataset

SDL is a project that synthesizes document images. It supports multi-level annotation of the generated images and can generate documents in multiple languages.

Sample image

[Figure: sample generated document image]

Structure of data

[Figure: structure of the generated data]

Quick start

python flexible_layout.py --config_file configs/page.yaml
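To generate data from several configuration files in one go, the quick-start command can be wrapped in a small driver script. The sketch below is only an illustration: the helper names (`build_command`, `find_configs`, `run_all`) are hypothetical and not part of SDL, and it assumes that every `.yaml` file in `configs/` is a valid input for `flexible_layout.py` and that the script is run from the repository root.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path, script="flexible_layout.py"):
    """Build the quick-start command for one config file (hypothetical helper)."""
    return [sys.executable, script, "--config_file", str(config_path)]


def find_configs(config_dir="configs"):
    """Return all .yaml files in config_dir, sorted; empty list if the directory is missing."""
    directory = Path(config_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.yaml"))


def run_all(config_dir="configs", dry_run=False):
    """Run the generator once per config file and return the commands used.

    Assumption: each .yaml file in config_dir is a valid SDL config.
    With dry_run=True the commands are only printed, not executed.
    """
    commands = [build_command(path) for path in find_configs(config_dir)]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing run
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_all(dry_run=True)` first prints the commands that would be executed, which is a cheap way to confirm which configs will be picked up before starting a long generation run.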

Instruction to run data generation

Go to instruction

Visualization of the result

python data_manipulation/visualize.py

Vietnamese dataset (300,000 images)

To be released soon.

Paper

https://arxiv.org/abs/2106.15117


