openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.

Home Page: https://anomalib.readthedocs.io/en/latest/

Question about inference input normalization

blueclowd opened this issue · comments

Hello, I noticed that the latest inference documentation may need a minor modification.
I found that .predict() includes a normalization step if the input is an image path. So should we manually normalize the input if it is a numpy array?

The inference script in the current documentation:

import cv2
image = cv2.imread("path/to/image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
result = inferencer.predict(image)

Should it be:

import cv2
image = cv2.imread("path/to/image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)/255.0
result = inferencer.predict(image)

The output of the script above is the same as when passing the image path directly:

result = inferencer.predict("path/to/image.jpg")

In a project I'm working on, I started by loading images from files and later switched to passing numpy arrays. I also found that I had to normalize the intensity values myself.

@blueclowd, you don't need to normalize the image. It is done here in the predict method:

if isinstance(image, str | Path):
    image = np.array(Image.open(image)).astype(np.float32) / 255.0

If this documentation causes confusion, maybe we could remove the image-reading part.

To sum up, there are two supported input formats:

  1. If the input is a numpy array, we need to normalize it ourselves so that its values lie between 0 and 1.
  2. If the input is an image path, we don't need to normalize it, because that is done in the predict method.

It's always good to support multiple input formats. Maybe the documentation just needs to mention the manual normalization step for numpy array input?
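
For anyone landing here with the same question, the manual step for numpy input could be sketched roughly as below. This is only an illustration, not anomalib code: `to_unit_float` is a hypothetical helper name, and the assumption that a float array is already in the 0 to 1 range is mine. It uses the same cast-and-divide arithmetic as the image-path branch of predict quoted above.

```python
import numpy as np


def to_unit_float(image: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return the image as float32 with values in [0, 1]."""
    if image.dtype == np.uint8:
        # Same arithmetic as the image-path branch: cast to float32, divide by 255.
        return image.astype(np.float32) / 255.0
    if np.issubdtype(image.dtype, np.floating):
        # Assumption: a float array is already normalized; fail loudly if it is not.
        if image.min() < 0.0 or image.max() > 1.0:
            raise ValueError("float image is expected to be in [0, 1]")
        return image.astype(np.float32)
    raise TypeError(f"unsupported dtype: {image.dtype}")


# Stand-in for an RGB image produced by cv2.imread followed by cv2.cvtColor.
rgb = np.array([[[0, 128, 255]]], dtype=np.uint8)
normalized = to_unit_float(rgb)
```

The dtype check is there so that calling the helper on an array that is already normalized does not divide by 255 a second time.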

@blueclowd, I have now modified the logic in PR #1875. If you are using OpenVINO for inference, the only thing you need to do is pass an image path, a PIL image, or a numpy array. No normalization is needed; it is done under the hood.