OCR4all - Docker image

Provides OCR (optical character recognition) services through web applications

Getting Started

These instructions will get you a Docker container that runs the project

Prerequisites

Docker (for installation instructions see the Official Installation Guide)

Installing

Get the Docker Image

From Docker Hub:

Execute the following command: docker pull uniwuezpd/ocr4all

From Source:

Download the Dockerfile first and enter the directory that contains it with a command line tool.
Execute the following command inside the directory: docker build -t <IMAGE_NAME> .

(We recommend uniwuezpd/ocr4all as image name)

Initialize Container

Using the image, a container can now be created with the following command:

docker run \
    -p 8080:8080 \
    -u `id -u root`:`id -g $USER` \
    --name ocr4all \
    -v <OCR_DATA_DIR>:/var/ocr4all/data \
    -v <OCR_MODEL_DIR>:/var/ocr4all/models/custom \
    -it <IMAGE_NAME>

Explanation of variables used above:

<IMAGE_NAME> - Name of the Docker image e.g. uniwuezpd/ocr4all
<OCR_DATA_DIR> - Directory in which the OCR data is located on your local machine
<OCR_MODEL_DIR> - Directory in which the OCR models are located on your local machine

The container will be started by default after executing the docker run command.
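
As an illustration, the placeholders above could be filled in as shown below. This is a minimal sketch, not part of the project: the helper name ocr4all_run_command and the example directories under /srv/ocr4all are assumptions. The helper only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: prints the docker run command from this section
# with the three placeholders filled in. Nothing is executed.
ocr4all_run_command() {
    image_name="$1"   # <IMAGE_NAME>
    data_dir="$2"     # <OCR_DATA_DIR>
    model_dir="$3"    # <OCR_MODEL_DIR>
    # The -u option is kept literal so it is evaluated when the printed command is run.
    printf 'docker run -p 8080:8080 -u $(id -u root):$(id -g $USER) --name ocr4all -v "%s":/var/ocr4all/data -v "%s":/var/ocr4all/models/custom -it %s\n' "$data_dir" "$model_dir" "$image_name"
}

# Example values (assumed paths, adjust them to your machine).
ocr4all_run_command uniwuezpd/ocr4all /srv/ocr4all/data /srv/ocr4all/models
```

The printed line can be pasted into a terminal once the paths look right. Both directories should exist on the host before the container is created.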

If you want to start the container again later, use docker ps -a to list all available containers with their container IDs and names, and then use docker start -ia ocr4all to start the container created above.
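
The restart step can be wrapped in a small function. This is a sketch under assumptions: the helper name ocr4all_start is hypothetical, and it assumes the container was created with --name ocr4all as shown above.

```shell
#!/bin/sh
# Hypothetical helper: start an existing container by name, or report
# that it still has to be created with docker run.
ocr4all_start() {
    name="${1:-ocr4all}"
    # List the names of all containers (running or stopped) and look for
    # an exact, whole-line match.
    if docker ps -a --format '{{.Names}}' | grep -Fqx "$name"; then
        # -i attaches STDIN, -a attaches STDOUT/STDERR
        docker start -ia "$name"
    else
        echo "No container named '$name' found; create it with docker run first." >&2
        return 1
    fi
}
```

Calling ocr4all_start without arguments targets the container named ocr4all. A different name can be passed as the first argument.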

You can now access the project via the following URL: http://localhost:8080/ocr4all/
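
To check from the command line whether the web application answers at that URL, something like the following could be used. It is a sketch only: the helper name ocr4all_wait, the default of 30 attempts, and the 2-second delay are assumptions, and it requires curl on the host.

```shell
#!/bin/sh
# Hypothetical helper: poll the OCR4all URL until it answers or the
# number of attempts is used up.
ocr4all_wait() {
    url="${1:-http://localhost:8080/ocr4all/}"
    attempts="${2:-30}"   # number of tries (assumed default)
    delay="${3:-2}"       # seconds between tries (assumed default)
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            echo "OCR4all is reachable at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "OCR4all did not respond at $url" >&2
    return 1
}
```

Running ocr4all_wait in a second terminal after docker run returns once the page responds, or exits with status 1 if it never does.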

Updating

From Docker Hub:

If the image was previously pulled from Docker Hub, it can be updated from there.

The following command will update the image:

docker pull uniwuezpd/ocr4all

From Source:

To update an image that was built from source, you currently need to remove the image and build it again.

This can be achieved by first executing the following command:

docker image rm <IMAGE_NAME>

Afterwards, follow the installation guide above as you would for a new, clean installation.
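
The remove-and-rebuild sequence can be sketched as a single function. The helper name ocr4all_rebuild is hypothetical, and the sketch assumes that an updated Dockerfile is already present in the given directory. Docker will typically refuse to remove an image while a container created from it still exists, so such a container may need to be removed first with docker rm. Files in the mounted host directories are not affected by that.

```shell
#!/bin/sh
# Hypothetical helper: remove the old image, then rebuild it from the
# Dockerfile in the given directory (defaults to the current directory).
ocr4all_rebuild() {
    image_name="$1"             # <IMAGE_NAME>
    dockerfile_dir="${2:-.}"    # directory that contains the Dockerfile
    # Only rebuild if the removal succeeded.
    docker image rm "$image_name" && docker build -t "$image_name" "$dockerfile_dir"
}
```

For example, ocr4all_rebuild uniwuezpd/ocr4all . rebuilds the image from the current directory under the recommended name.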

Development

In case you want shell access on your Docker container for development or testing purposes the container needs to be created with the following command (including the --entrypoint option):

docker run \
    -p 8080:8080 \
    --entrypoint /bin/bash \
    -v <OCR_DATA_DIR>:/var/ocr4all/data \
    -v <OCR_MODEL_DIR>:/var/ocr4all/models/custom \
    -it <IMAGE_NAME>

The container will be started by default after executing the docker run command.

If you want to start the container again later use docker ps -a to list all available containers with their Container IDs and then use docker start <CONTAINER_ID> to start the desired container. To gain shell access again use docker attach <CONTAINER_ID>.

Because the entrypoint has changed, processes will not start automatically and the following command needs to be executed after the container startup:

/usr/bin/supervisord

For information on how to update the project from inside the container, take a look at the commands in the Dockerfile.

Built With

Docker - Platform and Software Deployment
Maven - Dependency Management
Spring - Java Framework
Materialize - Front-end Framework
jQuery - JavaScript Library

Included Projects

LAREX - Layout analysis on early printed books
OCRopus - Collection of document analysis programs
calamari - OCR Engine based on OCRopy and Kraken
