The Learnable Typewriter: A Generative Approach to Text Line Analysis

Home Page: http://imagine.enpc.fr/~siglidii/learnable-typewriter/

Repository from Github: https://github.com/ysig/learnable-typewriter

The Learnable Typewriter
A Generative Approach to Text Analysis

Official PyTorch implementation of The Learnable Typewriter: A Generative Approach to Text Analysis.
Authors: Yannis Siglidis, Nicolas Gonthier, Julien Gaubil, Tom Monnier, Mathieu Aubry.
Research Institute: Imagine, LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France
ICDAR 2024 (Best Paper Award).

Install 🌱

conda create --name ltw pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda activate ltw
python -m pip install -r requirements.txt

Datasets ☀️ Models 🔨

Dropbox: Download & extract datasets.zip and runs.zip in the parent folder.
Huggingface: python scripts/download-hf.py

Inference

For minimal inference and plotting, we provide a standalone notebook that can also be opened in Colab.

To reproduce the figures of the paper, run the scripts/figures.ipynb notebook.

Helper scripts are also provided to perform evaluation on the corresponding datasets:

python scripts/eval.py -i <MODEL-PATH> {--eval, --eval_best}

and produce figures and sprites for certain samples:

python scripts/eval.py -i <MODEL-PATH> -s {train, val, test} -id 0 0 0 -is 1 2 3 --plot_sprites

Training 🌼

Training and model configuration are performed through hydra. We supply the corresponding config files for all our baseline experiments.

Google 📰

python scripts/train.py supervised-google.yaml
python scripts/train.py unsupervised-google.yaml

Copiale 📜

python scripts/train.py supervised-copiale.yaml
python scripts/train.py unsupervised-copiale.yaml

Fontenay ⛪

python scripts/train.py supervised-fontenay.yaml

and finetune with:

python scripts/fontenay.py -i fontenay/fontenay/<MODEL_NAME> -o fontenay/fontenay-ft/ --max_epochs 150 -k "training.optimizer.lr=0.001"

Any of the above experiment config files can be further modified by adding command-line overrides in the hydra syntax.
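To make the dotted override syntax concrete (as in `training.optimizer.lr=0.001` above), here is a minimal Python sketch of how such a string can be read as a path into a nested config. It is an illustration only, not hydra's implementation; the `apply_override` helper and the example config values are hypothetical.

```python
from copy import deepcopy


def parse_value(raw):
    """Parse a raw override value as int or float, else keep it as a string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_override(config, override):
    """Return a copy of `config` with one `dotted.key=value` override applied.

    Hypothetical helper for illustration; hydra's own override grammar is richer.
    """
    dotted_key, _, raw_value = override.partition("=")
    keys = dotted_key.split(".")
    updated = deepcopy(config)
    node = updated
    # Walk (and create if needed) every intermediate level of the nested dict
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = parse_value(raw_value)
    return updated


# Hypothetical config values, chosen only to show the effect of the override
config = {"training": {"optimizer": {"name": "adam", "lr": 0.01}}}
updated = apply_override(config, "training.optimizer.lr=0.001")
print(updated["training"]["optimizer"])
```

Each dot-separated segment selects one nesting level of the config, and the part after `=` replaces the value at that position; the original config is left untouched.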

Custom Dataset 💾

Trying the LT on a new dataset is dead easy.

First create a config file:

configs/<DATASET_ID>.yaml

...

DATASET-TAG:
  path: <DATASET-NAME>/
  sep: ''                    # How the character separator is denoted in the annotation. 
  space: ' '                 # How the space is denoted in the annotation.
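One possible reading of the `sep` and `space` fields, sketched in Python: `sep` says how characters are delimited inside a label (an empty string meaning every character is its own token), and `space` says which token stands for a blank. The `split_label` helper below is hypothetical and is not the repository's parsing code; it only illustrates the two comments above.

```python
def split_label(label, sep="", space=" "):
    """Split an annotated label into character tokens (hypothetical helper).

    sep:   separator between characters; '' means one token per character.
    space: token that denotes a blank in the annotation.
    """
    tokens = label.split(sep) if sep else list(label)
    # Map the dataset's space marker to a plain blank and drop empty tokens
    return [" " if tok == space else tok for tok in tokens if tok != ""]


# Default convention: no separator, blanks written as ' '
plain = split_label("a cat")
# A made-up convention: characters separated by '|', blanks written as '_'
custom = split_label("a|_|c|a|t", sep="|", space="_")
print(plain == custom)
```

Under these assumptions both annotations describe the same character sequence, which is why the two fields need to match how your labels were transcribed.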

Then create the dataset folder:

datasets/<DATASET-NAME>
├── annotation.json
└── images
  ├── <image_id>.jpg
  └── ...

The annotation.json file should be a dictionary with entries of the form:

    "<image_id>": {
        "split": "train",                            # {"train", "val", "test"} - "val" is ignored in the unsupervised case.
        "label": "A beautiful calico cat."           # The text that corresponds to this line.
    },

You can completely ignore the annotation.json file in the case of unsupervised training without evaluation.
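As a starting point, the following sketch writes an `annotation.json` in the format described above from a list of `(image_id, split, label)` triples, and checks that each image exists under `images/`. The `build_annotation` helper, the `my-dataset` folder, and the `line_0001`-style ids are hypothetical; the demo runs in a temporary directory with empty placeholder files instead of real images.

```python
import json
import tempfile
from pathlib import Path

VALID_SPLITS = {"train", "val", "test"}


def build_annotation(lines, dataset_dir):
    """Write <dataset_dir>/annotation.json from (image_id, split, label) triples."""
    dataset_dir = Path(dataset_dir)
    annotation = {}
    for image_id, split, label in lines:
        if split not in VALID_SPLITS:
            raise ValueError(f"unknown split {split!r} for {image_id}")
        # Every annotated id should have a matching images/<image_id>.jpg
        if not (dataset_dir / "images" / f"{image_id}.jpg").is_file():
            raise FileNotFoundError(f"missing image for {image_id}")
        annotation[image_id] = {"split": split, "label": label}
    out_path = dataset_dir / "annotation.json"
    out_path.write_text(
        json.dumps(annotation, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return annotation


# Demo on a throwaway folder with empty placeholder images
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my-dataset"
    (root / "images").mkdir(parents=True)
    lines = [
        ("line_0001", "train", "A beautiful calico cat."),
        ("line_0002", "test", "Another line of text."),
    ]
    for image_id, _, _ in lines:
        (root / "images" / f"{image_id}.jpg").touch()
    annotation = build_annotation(lines, root)
    written = json.loads((root / "annotation.json").read_text(encoding="utf-8"))

print(sorted(written))
```

Writing with `ensure_ascii=False` keeps non-ASCII characters readable in the file, which is convenient for historical scripts.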

Logging 📉

Logging is done through tensorboard. To visualize results run:

tensorboard --logdir ./<run_dir>/

If you want to dive in deeper, check out our experimental features.

Citing 💫

@misc{the-learnable-typewriter,
	title = {The Learnable Typewriter: A Generative Approach to Text Line Analysis},
	author = {Siglidis, Ioannis and Gonthier, Nicolas and Gaubil, Julien and Monnier, Tom and Aubry, Mathieu},
	publisher = {arXiv},
	year = {2023},
	url = {https://arxiv.org/abs/2302.01660},
	keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
	doi = {10.48550/ARXIV.2302.01660},
	copyright = {Creative Commons Attribution 4.0 International}
}

Also check out 🌈

If you like this project, also have a look at the related work produced by our team.

Acknowledgements ✨

We would like to thank Malamatenia Vlachou and Dominique Stutzmann for sharing ideas, insights and data for applying our method in paleography; Vickie Ye and Dmitriy Smirnov for useful insights and discussions; Romain Loiseau, Mathis Petrovich, Elliot Vincent, Sonat Baltacı for manuscript feedback and constructive insights. This work was partly supported by the European Research Council (ERC project DISCOVER, number 101076028), ANR project EnHerit ANR-17-CE23-0008, ANR project VHS ANR-21-CE38-0008 and HPC resources from GENCI-IDRIS (2022-AD011012780R1, AD011012905).
