AI4Bharat / DocSim

Synthetically generate random text document images with ground-truth

DocSim -- Documents Simulator

Synthetically generate random text documents with ground truth!
Check here for list of all features.

Note:
This project is intended for research purposes only.

Demo

[Demo images: Template, Generated, Augmented]

Requirements

Example Usage

Generate synthetic images

python generate.py <template.json> <num_samples> <output_folder>

Check the templates/ folder for sample document templates.
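When generating from several templates, it can help to drive the command from a small script. The following is a minimal sketch that only wraps the command line shown above; it assumes it is run from the repository root, and the helper names (`build_generate_command`, `generate`) and the template path in the usage comment are hypothetical, not part of DocSim.

```python
import subprocess
import sys


def build_generate_command(template, num_samples, output_folder):
    """Build the argument list for:
    python generate.py <template.json> <num_samples> <output_folder>
    """
    # Assumption: a non-positive sample count is a caller mistake.
    if int(num_samples) <= 0:
        raise ValueError("num_samples must be a positive integer")
    return [
        sys.executable,  # reuse the interpreter running this script
        "generate.py",
        str(template),
        str(int(num_samples)),
        str(output_folder),
    ]


def generate(template, num_samples, output_folder):
    """Run generate.py; raises CalledProcessError if it exits non-zero."""
    cmd = build_generate_command(template, num_samples, output_folder)
    subprocess.run(cmd, check=True)


# Usage (hypothetical template name, not executed here):
# generate("templates/my_template.json", 100, "output/my_template")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.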

Augment generated images

python augment.py <config.json> <input_folder> <num_epochs> <output_folder> <num_workers>

Check documentation/Augmentation for more details.
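The augmentation command takes five positional arguments, so a thin wrapper can make the order harder to get wrong. This is a sketch under the same assumptions as the command line above: it is run from the repository root, the helper names are hypothetical, and the positivity check on `num_epochs` and `num_workers` is an assumption rather than documented behaviour of `augment.py`.

```python
import subprocess
import sys


def build_augment_command(config, input_folder, num_epochs,
                          output_folder, num_workers):
    """Build the argument list for:
    python augment.py <config.json> <input_folder> <num_epochs>
                      <output_folder> <num_workers>
    """
    # Assumption: both counts are expected to be positive integers.
    for name, value in (("num_epochs", num_epochs),
                        ("num_workers", num_workers)):
        if int(value) <= 0:
            raise ValueError(f"{name} must be a positive integer")
    return [
        sys.executable,  # reuse the interpreter running this script
        "augment.py",
        str(config),
        str(input_folder),
        str(int(num_epochs)),
        str(output_folder),
        str(int(num_workers)),
    ]


def augment(config, input_folder, num_epochs, output_folder, num_workers):
    """Run augment.py; raises CalledProcessError if it exits non-zero."""
    cmd = build_augment_command(config, input_folder, num_epochs,
                                output_folder, num_workers)
    subprocess.run(cmd, check=True)


# Usage (hypothetical paths, not executed here):
# augment("my_config.json", "output/my_template", 2, "augmented/my_template", 4)
```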


Demo Web UI

Ensure you have installed Streamlit: pip install streamlit

Generator UI

A UI for generating a document from a chosen template by filling in the data manually (for demo purposes).

streamlit run generator_ui.py

ToDo: (Contributions welcome)

  • Add augmentation support to the UI.
  • Create another UI for creating templates.

Footnotes

Please report any problems or questions under the "Issues" tab.
Feel free to contribute by sending a Pull Request.

Other similar libraries

About

License: GNU General Public License v3.0


Languages

Language: Python 100.0%