nodeca / pica

Resize image in browser with high quality and high speed

Home Page: http://nodeca.github.io/pica/demo/

Magic Kernel support

vjeux opened this issue · comments

Right now you’re using a standard kernel + blur filter. It would be cool to use the magic kernel in order to get better image quality. See this post for the explanation of how that works: http://www.johncostella.com/magic/

Thanks for all the great work in your lib <3

I don't believe in magic without proofs :). Could you provide a list of apps/packages using that kernel?

PS. IMO the existing lanczos 2-3 + sharp is enough. For better quality I'd recommend server-side processing.

PPS. Probably, it's possible to add:

```javascript
{
  filter: Array(filter_size),
  win: window_size
}
```

Default filters (functions) are difficult to transfer if passed as an option.

Facebook and Instagram use Magic Kernel 3 with Sharp 3, server-side, and Justine Tunney reimplemented it for printvideo https://justine.lol/printvideo.html. The general Magic Kernel Sharp is more efficient than Lanczos 2 or 3 for the same quality, or, looked at the other way, better quality for the same computational cost, especially with regard to aliasing artifacts. Full details in http://johncostella.com/magic/mks.pdf but there's a lot of stuff in there.

@j-p-c, probably I could not explain my intent well. There are a lot of kernel alternatives (symmetric and non-symmetric). I have no experience creating quality estimates, so I need some practical criteria:

  • the suggested feature should be well known & widely adopted

I see this:

  • the provided links are from 2021 (very recent)
  • not many results on Google
  • no technical details about Facebook/Instagram
  • this kernel is not present in popular image processing libraries

I don't mean this kernel is bad/useless. I just don't see a lot of evidence that "everybody except you has already switched to the new kernel".

My personal summary is:

  • The currently provided info is not enough. If anybody wishes to continue the discussion about the Magic Kernel, more solid "proofs" are required.
  • We can drop the discussion of a concrete kernel and discuss an API for passing any custom kernel (IMO this could be a constructive alternative).
  • IMO, if this library is used to "reduce files before upload", the computational savings will not be too significant.

@puzrin Fair enough. The Magic Kernel Sharp method goes back to 2013 and as noted on the page / in the report it is used server-side by Facebook and Instagram. I'm happy to provide some detailed help in the code if desired, if given some direct pointers to where you're implementing Lanczos. But understood if you want to wait until other apps are using it.

@j-p-c since diving into "image quality" questions requires significant time, I'd like to avoid all kinds of coding & API changes "just for fun".

https://github.com/nodeca/pica/blob/master/lib/mm_resize/resize_filter_info.js - technically, it's not difficult to add one more function. But how do we decide where to stop? People can ask to add 10 more kernels. I need criteria for deciding what to accept/reject. The existing presets are from the Skia (Chromium) sources.

The most flexible way (though not the most convenient) is to invent a .filter option to pass any custom kernel desired by the user. Then users would not depend on my opinion, but would have to pass this option "manually" instead of selecting a "quality".

For the Magic Kernel, without my personal involvement, the alternatives are:

  • Find a trusted authority (persons). For example, the ImageMagick authors have written very detailed articles. If they say "the Magic Kernel is the best" - no problem.
  • Find trusted projects. This can be:
    • a lot of libraries that have adopted this kernel (~ a de-facto standard)
    • big websites (like Facebook), with detailed reports about kernel params, quality improvements, and side effects.

If you have access to info about MK at Facebook, it would be very interesting to read. As I said before, it's not a technical problem to add one more preset; the problem is avoiding uncertain things. I have no principal objections against the Magic Kernel, but I can't grade it or spend my own time on investigations.

@puzrin Understood. Everything about Magic Kernel Sharp in Facebook and Instagram, other than the actual code, is on the page and in the research paper; I can provide any additional information or answer any questions if necessary. The paper has been read widely and to my knowledge has had no criticism or negative comments (the most significant comment was that the Magic Kernel part is itself a subset of cardinal B-splines).

But again, understood that it may not be in your best interests for your library to be on the leading edge on this.

> But again, understood that it may not be in your best interests for your library to be on the leading edge on this.

I certainly prefer to avoid being on the leading edge. But I try to make libraries flexible, so as not to lock everyone into my personal opinion. Let's continue with details about MK, and I will see how to make it usable with pica in an unobtrusive way.


General questions:

  • How long have Facebook/Instagram used MK (how deeply has it been tested at big volumes)?
  • Do Facebook / Instagram use MK for all (most?) images, or only partially (for experiments, A/B testing, ...)?
  • Has anyone investigated feedback about real images (at big volumes)? Sometimes practice can diverge from theory.
  • Is the FB team satisfied with the result, or are more improvements planned (what's next)?
  • What is the MK future roadmap? Is it final (if not, when will it be)?

About code landing:

  • What is the final equation and window size?
  • Is it "one kernel for all needs", or are variations required (window size and so on)?

Thanks Vitaly.

I have provided public details previously on my page. Magic Kernel Sharp "2013" (I define this below) has been used server-side for 8 years for Facebook and 6 years for Instagram. Using only public numbers, that means that it's been tested trillions of times. It is used for server-side resizes. It is used in production, not experimental or testing. (The testing of it was originally done in 2013 on a corpus of millions of sample images before deployment.) It was deployed to solve problems with the previous system. Facebook has been satisfied with the result, and Instagram moved to using it in 2015. There are no plans to change it.

Magic Kernel Sharp as deployed at Facebook since 2013: The "Magic Kernel" (see here and here) (later pointed out to be a cardinal B-spline: it is the convolution of the window function with itself twice) is used for the resizing. A simple three-tap "Sharp" filter {−1/4, +3/2, −1/4} is applied in the smaller space (on the output, after resizing with the MK, if downsizing; or on the input, before resizing with the MK, if upsizing). The combination of the two operations is "Magic Kernel Sharp" (MKS) 2013.
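The two pieces described above can be sketched directly (a hedged illustration; the function names are mine, not pica's API):

```javascript
// The "Magic Kernel": the quadratic cardinal B-spline.
// Support is |x| < 1.5, i.e. a window of 1.5 (vs. 2 or 3 for Lanczos).
function magicKernel(x) {
  x = Math.abs(x);
  if (x <= 0.5) return 0.75 - x * x;
  if (x < 1.5) return 0.5 * (1.5 - x) * (1.5 - x);
  return 0;
}

// The three-tap "Sharp" filter, applied in the smaller space.
const SHARP_2013 = [-1 / 4, 3 / 2, -1 / 4];
```

At integer offsets the kernel samples sum to 1 (0.125 + 0.75 + 0.125), so flat areas pass through unchanged; the Sharp taps likewise sum to 1.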

When Instagram wanted to make use of MKS in 2015, we found that it needed a sharper output (more "pop") to match its existing algorithm. You can get a sharper result by just changing the Sharp filter to {−s, 4 + 2s, −s} / 4 with s > 1, so that the extra sharpening costs no extra cycles. From memory, a value of s of around 1.32 did the job.

Apart from the extra sharpening, there are no tunable parameters for MKS 2013.
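The generalized Sharp step above can be sketched as follows (illustrative names; s = 1 reproduces the plain 2013 taps):

```javascript
// Generalized Sharp taps: {-s, 4 + 2s, -s} / 4. For any s the taps sum
// to 1, so extra sharpening (s > 1) costs no extra cycles and does not
// change flat areas. s = 1 gives the plain {-1/4, +3/2, -1/4}.
function sharpTaps(s) {
  return [-s / 4, (4 + 2 * s) / 4, -s / 4];
}

// Apply a 3-tap filter along a 1D row of samples, clamping at the edges.
function applySharp(row, taps) {
  const out = new Array(row.length);
  for (let i = 0; i < row.length; i++) {
    const left = row[Math.max(i - 1, 0)];
    const right = row[Math.min(i + 1, row.length - 1)];
    out[i] = taps[0] * left + taps[1] * row[i] + taps[2] * right;
  }
  return out;
}
```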

The factorization of MKS into MK and S makes it efficient (and also allowed for hardware acceleration not obtainable any other way at the time). You can see that the window size of MK is only 1.5 (compared to 2 or 3 for Lanczos). The Sharp step is always in the smaller space, which makes it relatively cheap for large resize factors (up or down).

The easiest way to deploy MKS 2013 in your codebase would be to simply pre-convolve MK with S (as shown here and here) and drop that into your code as the resizing filter. But that has a window size of 2.5, which makes the resizing more expensive; overall, it is more costly except for resize factors fairly close to 1. If you didn't care about computational cost, this would be the easiest way to do it.
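Expanding mks(x) = (3/2)·m(x) - (1/4)·m(x - 1) - (1/4)·m(x + 1) piecewise gives the pre-convolved kernel with support 2.5. A sketch (the name is illustrative):

```javascript
// Magic Kernel pre-convolved with the Sharp taps {-1/4, 3/2, -1/4},
// expanded analytically into a single piecewise quadratic with support 2.5.
function mks2013(x) {
  x = Math.abs(x);
  if (x >= 2.5) return 0;
  if (x >= 1.5) return -0.125 * (x - 2.5) * (x - 2.5); // tail from the -1/4 taps
  if (x >= 0.5) return 0.25 * (4 * x * x - 11 * x + 7);
  return 1.0625 - 1.75 * x * x;                        // = 17/16 - (7/4) x^2
}
```

Note the kernel is slightly negative on 1.5 < |x| < 2.5, which is where the sharpening comes from; the integer-offset samples still sum to 1.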

The best way to do it, if you care about cycles, is to implement it as MK and S as described above. That also allows you to provide extra sharpening for no extra cost (as Instagram uses). However, it would mean a little more change to your code than simply dropping a configuration entry into the existing set.

In all of the above I talk about "MKS 2013." That is all you need.

This year I also extended it to be more theoretically perfect, and proved why it was better than Lanczos, in a research paper. You don't need those better versions, and you don't need to read the paper. The Magic Kernel above is the third in a sequence of cardinal B-splines, the first two of which are nearest neighbor and (bi)linear. The antialiasing properties (without the flaws of Lanczos) are contained in the Magic Kernel. The Sharp step just flattens the frequency response. I showed how to extend the sequence to Magic Kernel 4, 5, and 6 (and in general to any member of the sequence), and corresponding approximations to the Sharp kernel, in the paper. These have even better antialiasing properties. They are useful for audio and other signal processing applications, but are not really needed for images (except perhaps for high-precision cases like 2D and 3D medical imaging).

Happy to answer any more comments, and review any diffs you might put up to implement MKS 2013.

Thank you for details. Very interesting.

Could you comment on why, after Facebook has used MK(S) for 8+ years, it has not landed in other libraries (libvips and others)? https://github.com/libvips/libvips - I guess the authors would be happy to replace [pre-blur + lanczos3] with MKS.

> When Instagram wanted to make use of MKS in 2015, we found that it needed a sharper output (more "pop") to match its existing algorithm. You can get a sharper result by just changing the Sharp filter to {−s, 4 + 2s, −s} / 4 with s > 1, so that the extra sharpening costs no extra cycles. From memory, a value of s of around 1.32 did the job.
>
> Apart from the extra sharpening, there are no tunable parameters for MKS 2013.

Is extra sharpening useful in the real world (for a down-sizer), or was the Instagram case only for compatibility reasons?


About pica API

I think MK(S) does not fit into the .quality option, so I plan to add a .filter option:

  • { fn: ..., win: ... } - any custom filter. Internally converted to an interpolating Array[100], so it can be safely passed to web workers (as a simple object).
  • mks2013 | box | hamming | lanczos2 | lanczos3 (String) - a preset name, to simplify use

I suppose passing s as a param of MKS would make the API too cluttered, while the "added value" would not be significant. If a user wishes extra sharpening, the existing sharpen options can be used in a second call.
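The conversion mentioned above, sampling a custom filter function into a plain array so it can cross the worker boundary as a simple object, might look like this hypothetical sketch (the 100-entry table size comes from the comment above; the helper name is mine):

```javascript
// Sample a kernel function into a fixed-size lookup table, since functions
// cannot be structured-cloned into a WebWorker, but plain arrays can.
function sampleFilter(fn, win, samples = 100) {
  const table = new Array(samples);
  for (let i = 0; i < samples; i++) {
    // Sample fn on [0, win); the resizer would interpolate between entries.
    table[i] = fn((i / samples) * win);
  }
  return { filter: table, win };
}

// Example: a linear (triangle) kernel with window 1.
const linear = sampleFilter(x => (x < 1 ? 1 - x : 0), 1);
```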

Hey Vitaly, I didn't seek to advertise MKS, but provided more information about it on my page (which has had the current MK on there since 2011, and an earlier doubling / halving version since 2006) after I figured out theoretically why it all worked so well. It was only this year that I fully figured it out, hence the research paper. That has gotten more attention. People have made use of MK and MKS over the years, but I have not pushed for its use elsewhere.

My good friend and colleague @vjeux opened this request. He was the person who provided the introduction in 2013 that resulted in me implementing MKS in the first place.

The extra sharpening was primarily for compatibility, but it does provide more "pop" to downsized images, and costs no extra cycles. It definitely can be relegated to a separate filtering pass, at the expense of cycles.

My super-power is to find people that are doing some amazing work like the both of you and to connect so you can make things better. MKS has been super impactful at Facebook and I feel like it would be awesome to have it in more places.

For context, as part of excalidraw, a whiteboarding tool that I am working on right now, we're adding image support and using pica in order to do image resizing and I figured it'd be great to have MKS support there :)

Does anyone use this lib with a non-default (best) .quality option? It's probably worth dropping it completely (after .filter is added).

Could you test mks2013 in dev branch https://github.com/nodeca/pica/commits/dev (see last 3 commits)?

Just open /demo/index.html in browser.

I see some difference, but the sharpening is not significant. I don't know whether that's right or not.

I think it is correct. The kernel formula in code looks correct. I have compared a resized image against my reference implementation: there are small differences, but I don't believe you are handling gamma encoding (i.e. you are operating on the gamma-encoded values, rather than converting to linear space), which could explain those differences. I can try to remove the gamma handling of my reference implementation in order to do a direct comparison.

I have added a switch to my reference implementation to turn off the correct handling of gamma encoding. With that switched off, the results are essentially identical to what pica is showing (with all unsharp turned off).

I do notice that, with WebWorker enabled, the diff between the images, when magnified, shows some noticeable vertical and horizontal lines (but not noticeable in the raw image, to the naked eye). These lines disappear when WebWorker is disabled. I assume that there is some sort of batching into rectangles going on with maybe some sort of simplification going on around the edges of those rectangles, rather than correctly using the information in neighboring patches?

In any case, that's a characteristic of pica itself, not related to the implementation of mks2013, which looks to be correct.

> I have added a switch to my reference implementation to turn off the correct handling of gamma encoding. With that switched off, the results are essentially identical to what pica is showing (with all unsharp turned off).

Thank you!

In JS there is no access to gamma, and canvas color channels are 8 bits only. That's why the readme has a note that "pica is not recommended for professional quality photos". The only way to improve things is to write a jpeg decoder and so on, but this is out of scope.

The only possible improvement would be to keep 16-32 bit precision between the vertical and horizontal passes. But I had no time for experiments.

> I do notice that, with WebWorker enabled, the diff between the images, when magnified, shows some noticeable vertical and horizontal lines (but not noticeable in the raw image, to the naked eye). These lines disappear when WebWorker is disabled. I assume that there is some sort of batching into rectangles going on with maybe some sort of simplification going on around the edges of those rectangles, rather than correctly using the information in neighboring patches?

That's strange. When creating tiles, pica uses extra overlapping borders to avoid edge effects. And the generated convolution filters use float offsets (no rounding). There should be no distortion.

The only hack was for Safari:

pica/index.js, line 354 in 384784f:

```javascript
if (NEED_SAFARI_FIX) {
```

That should not be related to webworkers.

Try to create a new issue with steps to reproduce the problem, and try v6. Maybe we broke something in v7 when we tried to extract data more efficiently (to not block the main thread).

For gamma, a good default is to assume sRGB, do a lookup to undo gamma encoding to linear space, perform the resizing, and then do another lookup to gamma-encode again. But as you say, with only 8 bits that's not ideal, and to be fair many applications don't resize in linear space anyway. It's reasonable to leave it as-is.
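The round trip described above can be sketched as follows (assuming the standard sRGB transfer curves, with channel values normalized to [0, 1]):

```javascript
// Decode an sRGB-encoded channel value to linear light.
function srgbToLinear(c) {
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Re-encode a linear-light channel value back to sRGB.
function linearToSrgb(c) {
  return c <= 0.0031308 ? c * 12.92 : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
}

// Resizing in linear space: decode -> resample -> re-encode. With only
// 8-bit channels the round trip loses precision in the dark range, which
// is the caveat discussed above.
```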

I only came across the patches when looking at the diff (which otherwise was a close match). I'll try to find some time to explore further, and open a new issue. The diff using mks2013 between pica worker and pica non-worker (with mid-gray = 0, differences magnified by a factor of 10) is shown here. The lines are at x = 107, 108, 209, 210, 319, 320, and y = 103, 104, 209, 210. I don't fully understand those values, but I haven't looked at how the space is divided up in your code. Diffing with the reference implementation image shows that it's the worker one that has the line artifacts (turning that off leaves no lines).

Your image shows the borders of tiles. This may be a browser-specific problem.

AFAIK, pica uses the same tiling process (splitting the source into slightly overlapping areas of ~1000x1000) both with and without workers. The only difference in v7 is that it may pass region extraction to the worker, but that's not about the math.

That's why I suggested you try v6 to be sure. Because in v6 workers do pure math, without browser-specific things (extracting raw bitmaps from images).

Yes, you're right: it is only buggy on Chrome [Version 96.0.4664.55 (Official Build) (x86_64) for Mac] but not on Safari [Version 15.1 (17612.2.9.1.20)] or Firefox [94.0.1 (64-bit)]. I will need to try on v6 as well.

An easy way to check it is with this test image, which directly shows vertical artifacts on Chrome in WebWorker mode (for all resizing methods), but not otherwise.

(BTW: A different example image showing why MKS2013 is superior to Lanczos is here.)

Confirmed: Chrome is also buggy on master but not back on v6.

Thank you for your help & test fixtures. As far as I remember, Chrome is currently the only browser with OffscreenCanvas support. That's why the other browsers don't have this defect.

pica/index.js, lines 272 to 284 in 384784f:

```javascript
if (this.features.ww && CAN_USE_OFFSCREEN_CANVAS &&
    // createImageBitmap doesn't work for images (Image, ImageBitmap) with Exif orientation in Chrome,
    // can use canvas because canvas doesn't have orientation;
    // see https://bugs.chromium.org/p/chromium/issues/detail?id=1220671
    (utils.isCanvas(from) || CAN_USE_CIB_REGION_FOR_IMAGE)) {
  this.debug('Create tile for OffscreenCanvas');
  return createImageBitmap(stageEnv.srcImageBitmap || from, tile.x, tile.y, tile.width, tile.height)
    .then(bitmap => {
      extractTo.srcBitmap = bitmap;
      return extractTo;
    });
}
```

Removing this code block (maybe in /dist/...) should disable passing complex objects to the WebWorker and force the use of typed arrays. The disadvantage is blocking the main thread while decoding images (when drawing an image region [tile] to canvas).

FYI, this is the webworker wrapper: https://github.com/nodeca/pica/blob/master/lib/worker.js. It normalizes ImageData to a typed array if it receives one. As you can see, there are no other differences from v6.

In fact, fighting with browser workarounds takes more effort than implementing the math :)

Confirming. After commenting out the block that uses OffscreenCanvas (passing ImageData to the WW instead of a typed array), the defect in Chrome is gone.

Seems like a browser bug. Need to investigate whether a workaround is possible.

#223

Created a separate issue, and found a better workaround. We can still use OffscreenCanvas to decode the image in the WW, but the result should stay a typed array. Then there is no defect.

Nice!

> In fact, fighting with browser workarounds takes more effort than implementing the math :)

I had a similar experience with my 2013 implementation with hardware acceleration. Working around the bugs is always tedious!