gabrielarchanjo / marvinj

Javascript Image Processing Framework based on Marvin Framework

Home Page:https://www.marvinj.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Gaussian Blur implementation is very slow

EhsanKia opened this issue · comments

On a simple 1080x800px image, with kernel of size 10, it takes well over 15s. Not sure if there's more efficient ways to implement it, is it the same method used as the Java version?

I wonder if it's possible to use the browser's CSS filter for guassian blur, then transfer that back to a canvas.

I looked through the code that does Gaussian blur, and the efficiency is at least O(n)^4 or larger. There is two for loops to iterate through the image, and there are two for loops to iterate through the kernel. It is possible to completely unroll the for loops, as in make them disappear, for iterating through the image kernel. That would make the efficiency at O(n)^2.

Example in Python:

gausKernel = [[1/16, 2/16, 1/16], [2/16, 4/16, 2/16], [1/16, 2/16, 1/16]]
accum = 0
data2 = copy.deepcopy(data)
for i in range(1, data.shape[0] - 1):
    for j in range(1, data.shape[1] - 1):
        accum += gausKernel[0][0] * data2[i - 1][j - 1][0]
        accum += gausKernel[0][1] * data2[i - 1][j][0]
        accum += gausKernel[0][2] * data2[i - 1][j + 1][0]
        accum += gausKernel[1][0] * data2[i][j - 1][0]
        accum += gausKernel[1][1] * data2[i][j][0]
        accum += gausKernel[1][2] * data2[i][j + 1][0]
        accum += gausKernel[2][0] * data2[i + 1][j - 1][0]
        accum += gausKernel[2][1] * data2[i + 1][j][0]
        accum += gausKernel[2][2] * data2[i + 1][j + 1][0]
        accum = round(accum)
        data2[i][j][0] = accum
        data2[i][j][1] = accum
        data2[i][j][2] = accum
        accum = 0
        
        
image3 = Image.fromarray(data2)
display(image3)

For variable kernel sizes, it would obviously make the code larger to maintain since you would need multiple functions.

You're not magically reducing the time complexity by unrolling the inner loops. It wasn't really O(n^4) before, it was O(w*h*k^2) and it still is after your change. But yeah this is a very slow operation you'd probably want to do on the gpu in parallel.