DALAI-project's repositories
Document_segmentation
A model for segmenting scanned document images.
Arkkiivi_UI
User interface for the Arkkiivi web application.
Document-analysis_API
This API uses Annif as a local server and includes an NER component. It also includes Tesseract and uses Apache Tika for language detection. Multilingual support is limited.
Empty_training
Training code for a deep learning model that detects empty documents in images.
FaultyImageAPI
API that combines empty page, post-it, folded corner and writing type detection models.
NER_API
API for performing named entity recognition on Finnish text input.
Table_segmentation
Code for segmenting table structures and detecting text content in document images.
Train_BERT_NER
Code for training a Finnish named entity recognition (NER) model based on BERT.
Train_writing_type
Training code for a deep learning model that detects document writing type from images.
WritingtypeAPI
Repository for the writing type classifier API.
Annif_API
Instructions and pretrained models for using the Annif (https://annif.org/) software for automatic subject indexing as a local service.
Train_document_classification
Code that can be used for training a neural network model to classify input documents into distinct classes.
Train_fault_detection
Code that can be used for training a neural network model to detect faults (sticky notes, folded corners, etc.) in input documents.