SauvolaNet: Learning Adaptive Sauvola Network

This is the official repo for SauvolaNet (ICDAR 2021). For details of SauvolaNet, please refer to:

@INPROCEEDINGS{9506664,  
  author={Li, Deng and Wu, Yue and Zhou, Yicong},  
  booktitle={The 16th International Conference on Document Analysis and Recognition (ICDAR)},   
  title={SauvolaNet: Learning Adaptive Sauvola Network for Degraded Document Binarization},   
  year={2021},  
  volume={},  
  number={},  
  pages={538--553},  
  doi={10.1007/978-3-030-86337-1_36}}

Thanks to @mohamadmansourX, custom training of SauvolaNet is now supported. For more details, please visit this link.

Overview

SauvolaNet is an end-to-end document binarization solution. It learns optimal values for the three hyper-parameters of the classic Sauvola algorithm. Compared with existing solutions, SauvolaNet has the following advantages:

SauvolaNet does not require any pre- or post-processing
SauvolaNet has performance comparable to SoTA
SauvolaNet has a super lightweight network structure and is faster than DNN-based SoTA
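
For readers unfamiliar with the classic Sauvola algorithm mentioned above, here is a minimal NumPy sketch of it. It is not code from this repo. It assumes the usual formulation T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation in a square window, and the three hyper-parameters are the window size, k, and R. The function names and default values (window_size=15, k=0.2, R=128) are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sauvola_threshold(image, window_size=15, k=0.2, R=128.0):
    """Per-pixel classic Sauvola threshold for a 2-D grayscale image.

    Hyper-parameters (defaults are illustrative, not taken from this repo):
      window_size: odd side length of the local square window
      k:           sensitivity to the local standard deviation
      R:           dynamic range of the standard deviation (128 for 8-bit images)
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd")
    image = np.asarray(image, dtype=np.float64)
    pad = window_size // 2
    # Reflect-pad so that every pixel has a full window around it.
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (window_size, window_size))
    mean = windows.mean(axis=(-2, -1))
    std = windows.std(axis=(-2, -1))
    return mean * (1.0 + k * (std / R - 1.0))


def sauvola_binarize(image, **hyper_params):
    """Return 1 for background (brighter than threshold) and 0 for text."""
    image = np.asarray(image, dtype=np.float64)
    threshold = sauvola_threshold(image, **hyper_params)
    return (image > threshold).astype(np.uint8)
```

In the classic algorithm these three values are fixed by hand for the whole image; per the description above, SauvolaNet learns them instead.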

More precisely, SauvolaNet consists of three modules, namely, Multi-window Sauvola (MWS), Pixelwise Window Attention (PWA), and Adaptive Sauvola Threshold (AST).

MWS generates Sauvola outputs for multiple window sizes, with trainable parameters
PWA generates pixelwise attention over the window sizes
AST generates pixelwise threshold by fusing the result of MWS and PWA.
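
The way the three modules fit together can be sketched roughly as follows. This is a simplified illustration, not the network in this repo: it assumes that MWS yields one threshold map per window size, that PWA yields one attention logit per window size and pixel, and that AST fuses them with a softmax-weighted sum. The softmax fusion and all function names are assumptions made for the example.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_thresholds(window_thresholds, attention_logits):
    """Fuse per-window thresholds into one pixelwise threshold map.

    window_thresholds: (num_windows, H, W), stands in for the MWS output
    attention_logits:  (num_windows, H, W), stands in for the PWA output
    Returns an (H, W) threshold map, standing in for the AST output.
    """
    weights = softmax(attention_logits, axis=0)  # sums to 1 over windows
    return (weights * window_thresholds).sum(axis=0)


def binarize(image, window_thresholds, attention_logits):
    """Return 1 where the pixel is brighter than the fused threshold, else 0."""
    threshold = fuse_thresholds(window_thresholds, attention_logits)
    return (np.asarray(image, dtype=np.float64) > threshold).astype(np.uint8)
```

With this kind of fusion, each pixel can effectively pick the window size that suits its neighborhood, for example a small window for thin strokes and a large one for heavy background stains.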

Dependency

SauvolaNet is written in TensorFlow.

tensorflow-gpu: 1.15.0
keras-gpu: 2.2.4

Other versions might also work but are not tested.

Demo

Download the repo and create the virtual environment with the following commands:

conda create --name Sauvola --file spec-env.txt
conda activate Sauvola
pip install tensorflow-gpu==1.15.0
pip install opencv-python
pip install parse

Then play with the provided IPython notebook.

Alternatively, one may play with the inference code using this Google Colab link.

Datasets

We do not own the copyright of the datasets used in this repo.

Below is a summary table of the datasets used in this work, along with links from which they can be downloaded:

Dataset	URL
DIBCO 2009	http://users.iit.demokritos.gr/~bgat/DIBCO2009/benchmark/
H-DIBCO 2010	http://users.iit.demokritos.gr/~bgat/H-DIBCO2010/benchmark/
DIBCO 2011	http://utopia.duth.gr/~ipratika/DIBCO2011/benchmark/
H-DIBCO 2012	http://utopia.duth.gr/~ipratika/HDIBCO2012/benchmark/
DIBCO 2013	http://utopia.duth.gr/~ipratika/DIBCO2013/benchmark/
H-DIBCO 2014	http://users.iit.demokritos.gr/~bgat/HDIBCO2014/benchmark/
H-DIBCO 2016	http://vc.ee.duth.gr/h-dibco2016/benchmark/
DIBCO 2017	https://vc.ee.duth.gr/dibco2017/
H-DIBCO 2018	https://vc.ee.duth.gr/h-dibco2018/
PHIBD	http://www.iapr-tc11.org/mediawiki/index.php/Persian_Heritage_Image_Binarization_Dataset_(PHIBD_2012)
Bickley-diary dataset	https://www.comp.nus.edu.sg/~brown/BinarizationShop/dataset.htm
Synchromedia Multispectral dataset	http://tc11.cvc.uab.es/datasets/SMADI_1
Monk Cuper Set	https://www.ai.rug.nl/~sheng/

Contact

For any paper-related questions, please feel free to contact leedengsh@gmail.com.
