Refactor convolver to apply alpha premultiply
puzrin opened this issue
Vitaly Puzrin commented
Ref: #60 (comment)
Thanks to @chebum for the solution for proper alpha processing.
Intro
- .getImageData() provides NOT premultiplied raw data.
- .putImageData() requires NOT premultiplied raw data.
For proper alpha processing, data should be premultiplied before convolution and un-premultiplied after.
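A minimal sketch of why this matters, assuming RGBA byte data as returned by .getImageData(). The helper names (premultiply, unpremultiply, average2) are hypothetical and are not the library's actual code; average2 is only a stand-in for a real convolver.

```javascript
// Hypothetical helpers for illustration only.

// Multiply each color channel by alpha / 255 (in place, RGBA layout).
function premultiply(data) {
  for (let i = 0; i < data.length; i += 4) {
    const a = data[i + 3];
    data[i]     = Math.round(data[i]     * a / 255);
    data[i + 1] = Math.round(data[i + 1] * a / 255);
    data[i + 2] = Math.round(data[i + 2] * a / 255);
  }
  return data;
}

// Divide each color channel by alpha / 255 (in place, RGBA layout).
function unpremultiply(data) {
  for (let i = 0; i < data.length; i += 4) {
    const a = data[i + 3];
    if (a === 0) continue; // fully transparent: color carries no information
    data[i]     = Math.min(255, Math.round(data[i]     * 255 / a));
    data[i + 1] = Math.min(255, Math.round(data[i + 1] * 255 / a));
    data[i + 2] = Math.min(255, Math.round(data[i + 2] * 255 / a));
  }
  return data;
}

// Stand-in for a convolver: averages two neighbouring pixels into one.
function average2(data) {
  const out = new Uint8ClampedArray(4);
  for (let c = 0; c < 4; c++) {
    out[c] = Math.round((data[c] + data[c + 4]) / 2);
  }
  return out;
}

// A fully transparent red pixel next to an opaque blue pixel.
const src = [255, 0, 0, 0,   0, 0, 255, 255];

// Without premultiply the invisible red leaks into the result.
const naive = average2(Uint8ClampedArray.from(src));
// naive is [128, 0, 128, 128]: a purple tint

// With premultiply the result stays pure blue at half opacity.
const correct = unpremultiply(average2(premultiply(Uint8ClampedArray.from(src))));
// correct is [0, 0, 255, 128]
```

Note that an 8-bit premultiply / un-premultiply round trip is lossy: [200, 100, 50, 128] comes back as [199, 100, 50, 128] with the helpers above.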
Implementation notes
- Remove the .alpha option in favor of autodetection (if all alpha bytes are 255 => skip alpha processing).
- Split the convolver functions into separate versions, with and without alpha. That should simplify optimizations and keep precision for images without an alpha channel.
- Consider 15-bit intermediate results. This will eat more memory, but should help with precision, especially when premultiply is used.
- In JS it may be better to use a table lookup for premultiplication, to avoid division and keep the math integer-only.
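The table-lookup idea from the last note could look roughly like this. It is a sketch under assumptions: the table layout (PREMUL, indexed by alpha and channel value) and the function name premultiplyWithTable are hypothetical, and a real implementation might prefer a smaller table or fixed-point math.

```javascript
// Hypothetical lookup table: PREMUL[(a << 8) | c] === Math.round(c * a / 255).
// Built once (64 KB); the per-pixel loop then needs no division.
const PREMUL = new Uint8Array(256 * 256);
for (let a = 0; a < 256; a++) {
  for (let c = 0; c < 256; c++) {
    PREMUL[(a << 8) | c] = Math.round(c * a / 255);
  }
}

// Premultiply RGBA bytes in place using only integer indexing.
function premultiplyWithTable(data) {
  for (let i = 0; i < data.length; i += 4) {
    const row = data[i + 3] << 8; // start of the table row for this alpha
    data[i]     = PREMUL[row | data[i]];
    data[i + 1] = PREMUL[row | data[i + 1]];
    data[i + 2] = PREMUL[row | data[i + 2]];
  }
  return data;
}

const sample = premultiplyWithTable(
  Uint8ClampedArray.from([200, 100, 50, 128,   10, 20, 30, 255])
);
// sample is [100, 50, 25, 128, 10, 20, 30, 255]
```

The trade-off is memory for speed: division only happens while the table is built, and opaque pixels (alpha 255) pass through unchanged.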
Vitaly Puzrin commented
@chebum I've pushed the refactored JS code to the dev branch (currently without wasm).
- All desired optimizations are done.
- Precision between convolver passes increased to 16 bits.
- Pre-check whether alpha data exists and select the appropriate methods (skip premultiply when there is no transparency).
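The alpha pre-check and method selection might be sketched as follows. The names hasAlpha and resizeAuto, and the two convolver callbacks, are hypothetical placeholders rather than the actual functions in the dev branch.

```javascript
// Returns true if any pixel is not fully opaque (RGBA byte layout).
function hasAlpha(data) {
  for (let i = 3; i < data.length; i += 4) {
    if (data[i] !== 255) return true;
  }
  return false;
}

// Pick the cheaper opaque path when no transparency is present.
// convolveOpaque / convolveWithAlpha are assumed to be supplied by the caller.
function resizeAuto(data, convolveOpaque, convolveWithAlpha) {
  return hasAlpha(data) ? convolveWithAlpha(data) : convolveOpaque(data);
}

// Usage with dummy convolvers that only report which path was taken.
const opaque = Uint8ClampedArray.from([10, 20, 30, 255,   40, 50, 60, 255]);
const translucent = Uint8ClampedArray.from([10, 20, 30, 255,   40, 50, 60, 254]);

const pathA = resizeAuto(opaque, () => "opaque", () => "alpha");
const pathB = resizeAuto(translucent, () => "opaque", () => "alpha");
// pathA is "opaque", pathB is "alpha"
```

The scan exits early on the first non-opaque pixel, so the full pass over the alpha bytes is only paid for images that are entirely opaque, which are also the ones that skip premultiply.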
Here are the benchmarks.
Before rewrite:
> [js] resize 1024x1024 => 153x153 x 22.29 ops/sec ±3.30% (55 runs sampled)
> [wasm] resize 1024x1024 => 153x153 x 35.30 ops/sec ±1.93% (79 runs sampled)
> mm js resize 1024x1024 => 153x153 x 27.96 ops/sec ±1.74% (49 runs sampled)
> [js] unsharp 1024x1024 x 17.96 ops/sec ±1.08% (48 runs sampled)
> [wasm] unsharp 1024x1024 x 34.79 ops/sec ±1.16% (60 runs sampled)
> Build filters for 1024x1024 x 722 ops/sec ±1.21% (87 runs sampled)
After rewrite:
> [js] resize (1024x1024 => 153x153) x 21.42 ops/sec ±4.47% (54 runs sampled)
> [js] resize & premultiply (1024x1024 => 153x153) x 18.36 ops/sec ±2.73% (53 runs sampled)
> [wasm] resize (1024x1024 => 153x153) x 35.80 ops/sec ±1.31% (80 runs sampled)
> [wasm] resize & premultiply (1024x1024 => 153x153) x 35.92 ops/sec ±1.32% (80 runs sampled)
> mm js resize (1024x1024 => 153x153) x 22.83 ops/sec ±2.94% (42 runs sampled)
> mm js resize & premultiply (1024x1024 => 153x153) x 18.94 ops/sec ±3.45% (36 runs sampled)
> [js] unsharp (1024x1024) x 17.33 ops/sec ±1.77% (47 runs sampled)
> [wasm] unsharp (1024x1024) x 34.22 ops/sec ±2.03% (59 runs sampled)
> Build filters for (1024x1024) x 859 ops/sec ±3.87% (76 runs sampled)
Seems good. Now I have to rewrite the C code the same way.
Vitaly Puzrin commented
Complete.