[benchmark] Add benchmark for all operations

ternaus opened this issue 4 months ago

Vladimir Iglovikov commented 4 months ago

It would be great to have a benchmark in the README and/or documentation comparing kornia-rs with OpenCV for all similar operations.

In https://dev.to/viglovikov/jpeg2rgb-array-showdown-libjpeg-turbo-vs-kornia-rs-vs-tensorflow-vs-torchvision-2mnh

it was unexpected to see OpenCV being twice as slow. Similar numerical comparisons for other operations would be very useful when deciding whether to integrate kornia-rs into other packages.

Edgar Riba · Answer 1 · Sun Mar 17 2024 01:18:09 GMT+0800 (China Standard Time)

@ternaus thanks for raising interest. I wrote an initial benchmark here, but it covers only the resize function, is not well designed, and does not use the proper tools: https://github.com/kornia/kornia-rs/blob/main/py-kornia/benchmark/resize_benchmark.py

I will create something more stable, starting with the minimum functionality we expose in the Python API. Besides, it would be great if you could help identify which functions you are interested in that we can port/measure from OpenCV.
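A more stable benchmark could start from a small timing harness like the sketch below. It uses only the standard library `timeit` module; the helper names `bench` and `compare` are hypothetical, and the workloads are stand-ins to be replaced with the actual kornia-rs and OpenCV calls being compared.

```python
import timeit
from typing import Callable, Dict


def bench(fn: Callable[[], object], repeat: int = 5, number: int = 20) -> float:
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    fn()  # warm-up call so one-time setup cost is not measured
    timings = timeit.repeat(fn, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other processes.
    return min(timings) / number


def compare(candidates: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Benchmark each named callable and return its time relative to the fastest."""
    best = {name: bench(fn) for name, fn in candidates.items()}
    fastest = min(best.values())
    return {name: t / fastest for name, t in best.items()}


if __name__ == "__main__":
    # Stand-in workloads (assumption): swap in the two implementations under test.
    data = list(range(10_000))
    ratios = compare({
        "builtin_sum": lambda: sum(data),
        "sorted_copy": lambda: sorted(data),
    })
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x the fastest time")
```

Reporting the minimum over several repeats, after a warm-up call, tends to give more reproducible numbers than a single timed run.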

Vladimir Iglovikov · Answer 2 · Tue Mar 19 2024 05:08:46 GMT+0800 (China Standard Time)

Thanks!

Currently, in Albumentations we use:

cv2.inpaint
cv2.GaussianBlur
cv2.line
cv2.cvtColor
cv2.imread
cv2.resize
cv2.addWeighted
cv2.LUT
cv2.meanStdDev
cv2.circle
cv2.blur
cv2.imdecode
cv2.createCLAHE
cv2.transform
cv2.calcHist
cv2.equalizeHist
cv2.merge
cv2.multiply
cv2.subtract
cv2.flip
cv2.perspectiveTransform
cv2.warpAffine
cv2.getRotationMatrix2D
cv2.getAffineTransform
cv2.distanceTransform
cv2.threshold
cv2.Canny

This is what grep shows me for OpenCV. Many other operations are done at the NumPy level but, I guess, could in theory be done faster with kornia-rs.

Edgar Riba · Answer 3 · Tue Mar 19 2024 14:56:39 GMT+0800 (China Standard Time)

Awesome, take a look at the Rust documentation, because a few of them are already there but not exposed to Python yet. I'll be moving slowly so that we can benchmark each of them. Besides, almost all of them are already implemented in Kornia torch, where the strategy is to execute the augmentation pipeline in batch / on the GPU.

Edgar Riba · Answer 4 · Tue Mar 19 2024 14:59:22 GMT+0800 (China Standard Time)

And a bit of curiosity: why do you use Canny, since it's an expensive edge detection operator? And why multiply, subtract, merge, meanStdDev, which can be done directly in NumPy, or is the latter slow?

Vladimir Iglovikov · Answer 5 · Wed Mar 20 2024 03:21:38 GMT+0800 (China Standard Time)

I do not remember the story behind adding Canny. I did not even know that it is slow until you told me :)

For numpy vs. cv2, cv2 could be a few times faster for some data types than a similar numpy operation.

For example, in the recent benchmark vs Kornia, torchvision, Augly, ImgAug, we found that we are not the fastest for GaussianNoise.

Checked the code:

import numpy as np

def gauss_noise(image: np.ndarray, gauss: np.ndarray) -> np.ndarray:
    # Convert to float32 so the sum is not computed in the input dtype.
    image = image.astype("float32")
    return image + gauss

It does not look cv2-optimized at all. I did not benchmark it, but this code could be faster:

import cv2
import numpy as np

def gauss_noise_optimized(image: np.ndarray, gauss: np.ndarray) -> np.ndarray:
    # Untested sketch: cv2.add saturates the result instead of wrapping around.
    gauss = gauss.astype(np.float32)
    if image.dtype == np.float32:
        return cv2.add(image, gauss)
    if image.dtype == np.uint8:
        # Keep the noise signed: clipping it to [0, 255] first would drop negative values.
        # Mixed input dtypes need an explicit output dtype.
        return cv2.add(image, gauss, dtype=cv2.CV_8U)
    raise TypeError("Unsupported image dtype. Expected uint8 or float32.")
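
Before timing a cv2-based variant, it may help to check its output against a plain NumPy reference. The sketch below is one possible reference for uint8 images with signed floating-point noise; `add_noise_reference` is a hypothetical helper name, not part of Albumentations or OpenCV. The noise is added in a wider dtype and the sum is clipped afterward, because clipping the noise itself to [0, 255] would discard its negative half.

```python
import numpy as np


def add_noise_reference(image: np.ndarray, noise: np.ndarray) -> np.ndarray:
    """Add signed noise to a uint8 image with saturation, using NumPy only."""
    if image.dtype != np.uint8:
        raise TypeError("Expected a uint8 image.")
    # Widen first so negative noise and sums above 255 are representable.
    total = image.astype(np.float32) + noise.astype(np.float32)
    # Round to nearest, then saturate to the uint8 range before casting back.
    return np.clip(np.rint(total), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
    noise = rng.normal(loc=0.0, scale=25.0, size=image.shape).astype(np.float32)
    noisy = add_noise_reference(image, noise)
    print(noisy.dtype, noisy.shape)
```

A candidate implementation can then be compared against this reference with `np.array_equal`, or with a small tolerance if its rounding differs.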