The system is designed to:
- Take a raw P&ID diagram sheet as input
- Output the following:
  - A JPG image with a bounding box drawn around each detected symbol, labelled with its object ID
  - A PDF file containing the object ID, class ID, component name, item label and bounding box region of each detected symbol
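As a rough illustration of what one row of the PDF output holds, here is a minimal sketch of a per-symbol record. The class name `DetectedSymbol`, the field names, the corner-based bounding box format and the sample values are all assumptions made for this example; they are not taken from the notebook.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass
class DetectedSymbol:
    """One detected symbol, mirroring the columns listed for the PDF output."""
    object_id: int        # unique ID drawn next to the box in the JPG
    class_id: int         # symbol class predicted by the detector
    component_name: str   # human-readable name of the class
    item_label: str       # text read by the OCR stage
    bbox: Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel corners


# Placeholder values, made up purely for illustration.
row = DetectedSymbol(1, 3, "Valve", "V-101", (120, 80, 270, 230))
print(asdict(row))
```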
Built with PyTorch, MMOCR, EAST text detection, OpenCV, PIL and NumPy.
The system works in 2 stages:
- Custom object detection step:
  - We traverse the entire P&ID sheet and process crops of 150 x 150 pixels, with a 50% overlap between adjacent crops.
  - Our custom-trained model classifies each crop into one of 6 symbol classes or 1 background class.
  - We reject crops classified as background and save only the regions that contain symbols.
  - A symbol is usually detected in several overlapping bounding boxes; these are merged into a single bounding box and a single inference.
- OCR and text processing step:
  - The regions containing symbols are passed through the pretrained EAST text detection model to determine the orientation of the text, i.e. vertical or horizontal.
  - Vertical text is rotated 90° to the right to bring it into a horizontal format.
  - The processed text regions are cropped and passed through the pretrained MMOCR model to perform OCR.
  - Rubbish inferences are rejected and replaced with default values.
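The geometric parts of the two stages above can be sketched in plain Python. This is a minimal illustration, not the code from the notebook: the function names, the IoU threshold of 0.3 and the greedy merge strategy are assumptions, and the README does not say how overlapping boxes are actually combined. In practice the rotation would more likely be done on an image array (for example with OpenCV's `cv2.rotate` and `cv2.ROTATE_90_CLOCKWISE`).

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in sheet pixel coordinates


def sliding_crops(width: int, height: int, size: int = 150,
                  overlap: float = 0.5) -> List[Box]:
    """Return crop boxes over a width x height sheet with the given overlap.

    Crops that would run past the right or bottom edge are skipped here.
    """
    stride = max(1, int(size * (1 - overlap)))  # 75 px for 150 px crops at 50%
    xs = range(0, max(width - size, 0) + 1, stride)
    ys = range(0, max(height - size, 0) + 1, stride)
    return [(x, y, x + size, y + size) for y in ys for x in xs]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def merge_overlapping(boxes: List[Box], threshold: float = 0.3) -> List[Box]:
    """Greedily fold each box into the first kept box it overlaps enough with."""
    merged: List[Box] = []
    for box in boxes:
        for i, kept in enumerate(merged):
            if iou(box, kept) > threshold:
                # Replace the kept box by the rectangle enclosing both.
                merged[i] = (min(box[0], kept[0]), min(box[1], kept[1]),
                             max(box[2], kept[2]), max(box[3], kept[3]))
                break
        else:
            merged.append(box)
    return merged


def rotate_right(grid: List[List[int]]) -> List[List[int]]:
    """Rotate a 2D pixel grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]
```

For a 300 x 300 sheet, `sliding_crops(300, 300)` yields a 3 x 3 grid of crops with a stride of 75 pixels, and two neighbouring crops that both fire on the same symbol are combined by `merge_overlapping` into one enclosing box.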
The solution has been tested on Ubuntu 20.04, macOS Monterey, Windows 10 and 11 (Git Bash) and Google Colab.
conda create -n PnID python=3.7.13
conda activate PnID
git clone https://github.com/aneeshbhattacharya/Automated-PnID-Symbol-Detection-and-Labelling.git
cd Automated-PnID-Symbol-Detection-and-Labelling
sudo chmod +x build.sh
./build.sh
Download the pretrained EAST text detection model from https://www.dropbox.com/s/r2ingd0l3zt8hxs/frozen_east_text_detection.tar.gz?dl=1
Put the downloaded file in the MMOCR directory
cd main_driver
jupyter-notebook
Open Aneesh_Risav_P&ID_Detection_and_Labelling_System.ipynb
- Invert the images to produce normalized greyscale images in which the symbols are the active pixels, and build a dataset from them (use the invert function from Aneesh_Risav_P&ID_Detection_and_Labelling_System.ipynb)
- Save the images of each class in a separate folder, as follows:
|--main folder
|----0 <- this is class ID
|------1.jpg
|------2.jpg
|------3.jpg
|------...
|----1
|------1.jpg
|------2.jpg
|------3.jpg
|------...
|----.
|----.
|----n <- for up to n classes
- Place this folder, named 'dataset', in the same directory as FINAL_Inverted_PyTorch_Model_P&ID.ipynb
- Set the number of output classes of the model to "n" (the last softmax layer of the model)
- Run FINAL_Inverted_PyTorch_Model_P&ID.ipynb
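Before training, it can help to check that the 'dataset' folder follows the layout above. The helper below is a small sketch for that purpose; the function name `count_images_per_class` is hypothetical and it is not part of the repository's notebooks. It only assumes numeric class folders containing `.jpg` files, as shown in the tree.

```python
from pathlib import Path
from typing import Dict


def count_images_per_class(root: str = "dataset") -> Dict[int, int]:
    """Map each numeric class folder under root to its number of .jpg files."""
    counts: Dict[int, int] = {}
    for folder in Path(root).iterdir():
        # Only folders named with a class ID (0, 1, 2, ...) are counted.
        if folder.is_dir() and folder.name.isdigit():
            counts[int(folder.name)] = len(list(folder.glob("*.jpg")))
    return dict(sorted(counts.items()))
```

Calling `count_images_per_class("dataset")` returns a dictionary such as `{0: 120, 1: 95}` (numbers made up here); its length is the number of class folders found, which can be compared against the class count set in the model's final layer.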