mozilla / mozjpeg

Improved JPEG encoder.

About using both libjpeg-turbo and mozjpeg

Jehan opened this issue

commented

Hi! I'm a GIMP developer and mozjpeg came to my attention, so I read through some of your samples, libjpeg.txt and a few blog posts.

So if I understood correctly, mozjpeg is targeting web usage and libjpeg-turbo might still be better for other uses. So if I wanted to implement this in GIMP, I would like to make it an option (like an encoder choice dropdown list), all the more since mozjpeg is not available on most distributions anyway (so it can only be an option). The problem is that you implemented it as a drop-in library, which is great if you just want to replace the existing encoder/decoder without touching the code, but not if you want to use both, as it means symbol clashes, same lib names, etc.

How would you use both of them in the same binary? Would you try dynamic loading (even this is annoying as you have to play with lib prefixes, etc. because both libs use the same names)?

I mean, even before this, how would you even detect whether mozjpeg is available? You only provide libjpeg.pc and libturbojpeg.pc (so the same pkg-config files as the parent project).

Am I missing something for people who want to use both libraries? They are quite complementary AFAIU.

If not, I would suggest creating a libmozjpeg with its own libmozjpeg.(so|a|pc|h) files. You could still provide a standard libjpeg (which would basically use the libmozjpeg functions but with standard jpeg_ names) on the side.

This would also allow distributions to actually ship mozjpeg without having to decide to drop libjpeg-turbo (they could then only ship the libmozjpeg), and applications to load both mozjpeg (with namespaced mozjpeg_ functions) and another jpeg lib.

MozJPEG has a "revert" option that makes it behave exactly like libjpeg-turbo. So I suggest taking advantage of that drop-in ability.

  • Check if mozjpeg is available, and use it as the only libjpeg back-end if it is.
  • Where you want to keep the libjpeg-turbo behavior, #ifdef it to behave like libjpeg-turbo:
jpeg_c_set_int_param(cinfo, JINT_COMPRESS_PROFILE, JCP_FASTEST);
jpeg_set_defaults(cinfo);
commented

Check if mozjpeg is available, and use it as the only libjpeg back-end if it is.

How do you do this since everything has the same name? Is there a function to know at runtime if mozjpeg is the actual backend? (runtime check is nice too as it allows such a feature to appear if you switch the backend lib without recompilation)

Or is there a build-time check? For instance, I don't see any pkg-config variable or anything of the sort to distinguish between a mozjpeg libjpeg or another.

Where you want to keep the libjpeg-turbo behavior, #ifdef it to behave like libjpeg-turbo:

Are these 2 calls to make it behave like libjpeg-turbo or mozjpeg? (I ask because the first part of the sentence seems to imply libjpeg-turbo, but the #ifdef part seems to imply mozjpeg)
Edit: Ah I just grep-ed the headers for the turbo and mozjpeg implementations. Actually JINT_COMPRESS_PROFILE/JCP_FASTEST don't exist in the jpeg-turbo headers. So that answers my question (you were actually proposing a #ifdef HAVE_MOZJPEG), and that also means that there is no runtime check possible (as you couldn't build this code with jpeg-turbo, so it has to be chosen at build time). 😕

This all being said, this solution is useful for projects which want to embed mozjpeg for instance. But typically if mozjpeg is nice, it's sad that it is not available in Linux distributions. But I don't think it will ever be unless you "kill" libjpeg-turbo (which I assume you don't want to do as it's your upstream too). Basically mozjpeg places itself as competition to its upstream over the standard jpeg API so only one of them can be packaged. It would be much better if mozjpeg placed itself as a complementary API IMO.

For instance, I personally believe that APNG failed as a format because they modified libpng. Hence the only way to package lib(a)png in distributions would have been to kill libpng out of said distributions and replace it by the APNG fork (same lib libpng, same API, same thing as you do with mozjpeg), which obviously never happened and likely never will. As a consequence, APNG was never available as optional format on most Free Software (this is the reason we never implemented it in GIMP).

I don't know what check to recommend, because I don't know what setup you have.

You could vendor mozjpeg, and then you'll know for sure it's there. I actually think that's the best option, but distro-unbundlers may disagree :)

You could try pkg-config with names under which some distros have shipped mozjpeg.

If you only have some libjpeg.so to work with and you want to guess whether it's mozjpeg, that's tougher. Currently you can check if it has jpeg_c_set_int_param symbol, although libjpeg-turbo planned to have it too. But libjpeg has different ABIs, so you'd probably want to peek into headers anyway.
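The symbol check described above could be sketched roughly like this, using POSIX dlopen()/dlsym(). `library_exports_symbol` is a hypothetical helper name, and the library file name you pass in (for example `libjpeg.so.62`) is an assumption that depends on the system. As noted, a positive result is only a hint, since libjpeg-turbo may grow the same symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `library` can be loaded and exports
 * `symbol`, 0 otherwise (including when the library cannot be found). */
int library_exports_symbol(const char *library, const char *symbol)
{
    void *handle;
    int found;

    if (library == NULL || symbol == NULL)
        return 0;

    /* RTLD_LOCAL keeps the library's jpeg_* names out of the global
     * symbol namespace, so they do not clash with another libjpeg. */
    handle = dlopen(library, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return 0;

    found = dlsym(handle, symbol) != NULL;
    dlclose(handle);
    return found;
}
```

A caller would do something like `library_exports_symbol("libjpeg.so.62", "jpeg_c_set_int_param")`. On older glibc versions this needs linking with `-ldl`.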

If you can do a compile check, then checking whether JBOOLEAN_TRELLIS_QUANT works is probably the closest you can get to true feature detection.
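A minimal sketch of a build-time check, assuming (worth verifying against your installed header) that mozjpeg's jpeglib.h defines a `JPEG_C_PARAM_SUPPORTED` macro alongside its parameter API. JBOOLEAN_TRELLIS_QUANT itself appears to be an enum constant, which `#ifdef` cannot see, so probing it directly would need a configure-style test compile instead. `jpeg_header_has_mozjpeg_params` is a hypothetical helper name, and `__has_include` is used only so that the sketch still compiles where no jpeglib.h is installed.

```c
#include <stddef.h>
#include <stdio.h>   /* jpeglib.h expects FILE and size_t to be declared */

#ifdef __has_include
#  if __has_include(<jpeglib.h>)
#    include <jpeglib.h>
#  endif
#endif

/* Hypothetical helper: returns 1 if the jpeglib.h seen at build time
 * advertises the mozjpeg parameter API, 0 otherwise. The answer is fixed
 * at compile time; it says nothing about the library loaded at runtime. */
int jpeg_header_has_mozjpeg_params(void)
{
#ifdef JPEG_C_PARAM_SUPPORTED   /* assumption: only mozjpeg defines this */
    return 1;
#else
    return 0;
#endif
}
```

The same `#ifdef` is where the JCP_FASTEST call would be guarded in an application that wants libjpeg-turbo behavior.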

commented

I don't know what check to recommend, because I don't know what setup you have.

Personally I run Fedora, and our reference distribution for GIMP development is Debian. But this is irrelevant here. It's about sane library policy. Basically if your library clashes with standard libjpeg, then it can never be in common Linux or BSD distributions; all the more since you target a specific web workflow use case, so it is not a full replacement for the more generic libjpeg as far as I understand (even though you can revert to generic behavior, as you note, but then devs have to know they are on mozjpeg! This alone means that the drop-in concept is actually wrong: mozjpeg is not really a drop-in replacement, it only looks like one 😛). I can see for instance that Debian wanted to package mozjpeg at some point, then dropped the idea for this reason:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741487

So the question is really not about my setup. I don't want to use mozjpeg for a personal project. I am interested in it being usable more widely. 🙂

You could vendor mozjpeg, and then you'll know for sure it's there. I actually think that's the best option, but distro-unbundlers may disagree :)

"vendor" == bundle, right? Then no, we don't want to bundle libs. This is bad practice. A library has to be properly named, detectable (mozjpeg.pc for pkg-config), and have its own namespace in order not to clash with other libraries. That's good practice! 😛

You could try pkg-config with names under which some distros have shipped mozjpeg.

Which distros have packaged it? Anyway, we cannot rely on third-party packager renaming (we are not going to special-case for each distribution. Again… bad practice!). The renaming should come from the upstream project.

Say you had contributions to add a proper namespace to mozjpeg (i.e. mozjpeg_set_defaults() and so on) while still also allowing a standard libjpeg version (i.e. that people could build the libjpeg, the libmozjpeg or both). Would you accept it?

I'm not saying I'm going to do this, I already have a lot to do on my own, but at least knowing if it were ok would be useful. Then who knows what happens in the future!

That default-on behavior is supposed to be a feature, so that software doesn't have to add MozJPEG support. You can just make it use MozJPEG instead of standard libjpeg and get better compression automagically.

The readme is technically-correct that it's focused on Web workflows, but in practice I see no reason not to use it for everything. Why would you want to have larger/lower-quality images if you don't have to? :) There is increased encoding cost relative to vanilla libjpeg, but encoding is still super fast in absolute terms (e.g. encoding may take 80 milliseconds instead of 40ms). It's not like AVIF where you need a progress bar and an apology for the time it takes.

libjpeg has a ton of symbols, and doesn't have a symbol-prefixing feature. Do you have some clever idea how to add prefixing? I wouldn't want something that litters the code with macros, which would create merge conflicts when pulling changes from libjpeg-turbo.

I don't get what distros want from mozjpeg. I'd expect them to ship it either as an optional package that directly replaces libjpeg globally (with appropriate conflicts/provides metadata in the package manager), or put mozjpeg as libmozjpeg.so/mozjpeg.pc so that applications can detect it and pull in if they want to.

If distros wanted to use MozJPEG as the libjpeg shipped by default, I can add a compile-time option to have enhancements disabled by default for careful backwards-compatibility. But I don't expect distros to trust me with such a key package :)

Someone packaged mozjpeg for RedHat/Centos 8. He added a moz prefix on all utilities.

commented

@zeroheure

Someone packaged mozjpeg for RedHat/Centos 8. He added a moz prefix on all utilities.

Are you talking about this package? I saw this one too but it's third-party. It doesn't count, or rather it actually shows the issue. Since distributions can't package mozjpeg, third parties end up making custom packages with weird tricks. That's not good. In GIMP for instance, we cannot depend on packages not in distributions (we did it several times actually, but only because we worked with upstream to actually make their library package-able! We work on making things generic and standard).

The readme is technically-correct that it's focused on Web workflows, but in practice I see no reason not to use it for everything. Why would you want to have larger/lower-quality images if you don't have to? :)

There is also image quality (even though for quality, people should maybe choose something better than JPEG altogether, still…). I am not saying that mozjpeg produces worse JPEGs (for the same size, I mean), I just have no idea because all the posts I read just focus on file size and compression speed. Is the image supposed to be exactly the same? Is it only the compression algorithm that changes?
Also, if we really get better compression (i.e. smaller size) for the same image quality, does it apply to all kinds of images? Detailed images as well as abstract ones, drawings, paintings, photographs, designs, etc. Does it always perform better? This is a real question: there are a lot of algorithms which perform better in specific cases (this doesn't diminish the interest of the algorithm, you only have to be clear about it or test). That's why we provide options for such things (for instance, a list of interpolation algorithms for each transform tool, because depending on what you transform, some interpolation algorithms are a better fit).

In any case, as someone outside the project, I am not jumping on a fancy new train just because it's new. libjpeg-turbo has also proved its worth for years. But I'm more than happy to provide it as an option. 😛

There is increased encoding cost relative to vanilla libjpeg, but encoding is still super fast in absolute terms (e.g. encoding may take 80 milliseconds instead of 40ms).

Ahah you talk like a developer who is only used to test cases! 😛 (just a joke, we all do this, I do sometimes!)

I just tested on an 8192×8192 image (not an uncommon size at all): it took 3.5 seconds with jpeg-turbo (not including all the pre-processing we do, just the export part) on my reasonably powerful laptop. Then I tested with mozjpeg (exactly the same code, just swapped the libjpeg lib): 13.5 seconds! So about 4 times slower (and I guess again it may depend on the files; the libjpeg-turbo devs say it can apparently go up to 50× slower).

Much larger sizes are possible too. One of our long-time contributors regularly works on 40k×40k scanned images. Some people come to us with 60,000×60,000 images.

And by the way, slow import or export is a type of issue regularly reported to us (not too often for JPEG, so we'd prefer to keep it this way 😛). Here for instance, one we got recently: someone who exports a huge image to PNG in 45 minutes! Not even going to such extremes, when people have to wait 10 seconds for an action, they think it's slow. Of course when there is no choice, there is no choice and all we can say is "too bad, your image is huge, what can we do?" But here there is a choice (-turbo). Not everything is about file size. 😉

I don't get what distros want from mozjpeg. I'd expect them to ship it either as an optional package that directly replaces libjpeg globally

As you say yourself, libjpeg is quite a key package. If something goes wrong, it's not just one software which is broken. You may imagine how software and distributions are not going to just jump ship without a reason (i.e. unless libjpeg-turbo suddenly goes unmaintained or there is a political issue around the project governance, or really suddenly mozjpeg outperforms libjpeg-turbo on all points and gets all the contributions, or whatnot).

Creating conflicting packages is something distributions sometimes do, but I think they prefer to abstain from it, and I agree with it.

Last reason to me, and I think that's a big one: if mozjpeg were to get used enough that it actually starts to replace libjpeg-turbo, then, as said before, it may kill libjpeg-turbo development, which is all the more annoying given that it's your own upstream. I don't think you want it to go unmaintained either. In such a context, it would really be worth making mozjpeg a complementary project rather than a replacement.

Anyway, let's move on to constructive ideas! See below. 😃

libjpeg has a ton of symbols, and doesn't have a symbol-prefixing feature. Do you have some clever idea how to add prefixing? I wouldn't want something that litters the code with macros, which would create merge conflicts when pulling changes from libjpeg-turbo.

Yeah ok so that was something I wondered (how exactly the turbo-mozjpeg relationship works). So from what you say, you implemented mozjpeg as a proper fork of the whole code, hence indeed you don't want to diverge too much all over to be able to regularly rebase from upstream.

So I see 2 ways to do this:

  1. Either you go with a wrapper mozjpeg.[ch] which just wraps the public functions of your custom libjpeg (which would be linked statically so that packagers are able to just use the libmozjpeg without removing libjpeg-turbo) and make this wrapper into a libmozjpeg.so. So all the code changes are in the wrapper (files which don't exist in libjpeg-turbo), which means it won't ever create any merge conflicts. The advantage of this approach is that developers will be able to load both libjpeg-turbo and mozjpeg (even though you find it redundant, I personally feel safer with the mainstream library, because it means more eyes looked at it… at least until yours becomes the new mainstream, maybe someday).
    Seriously, this could even be implemented by automatic generation of the wrappers (both header and implementation, as we are just mapping one-on-one every jpeg_ function into exactly the same mozjpeg_ function). Then apart from the generation code done once, it's very minimal maintenance later on.
  2. Do what third party packagers apparently already do and just rename the external library, yet keep the jpeg_ namespacing. I.e. have a libmozjpeg.so and libmozjpeg.pc and so on (same as option 1. but only externally, the symbols stay the same and no wrapper code). This would avoid name clash at system level at least so it can be installed side by side with the turbo one. Now the disadvantage is that it won't be possible to link to both libjpeg because of symbol clashes (well you could do dynamic loading if you really want, a bit less practical). At least it would be available and software will be welcome to search for this alternative jpeg API implementation if they want.
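To make the automatic generation idea in option 1 a bit more concrete, here is a sketch of the core naming step only: mapping a public `jpeg_` name to its `mozjpeg_` counterpart. `prefixed_name` is a hypothetical helper, not anything that exists in mozjpeg. A real generator would also have to copy each function's signature from jpeglib.h, which this sketch does not attempt.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the "mozjpeg_" variant of a "jpeg_" name
 * into `out`. Returns 0 on success, -1 if `name` lacks the "jpeg_"
 * prefix or `out` is too small to hold the result. */
int prefixed_name(const char *name, char *out, size_t out_size)
{
    static const char old_prefix[] = "jpeg_";
    static const char new_prefix[] = "mozjpeg_";
    const size_t old_len = sizeof old_prefix - 1;
    int written;

    if (name == NULL || out == NULL ||
        strncmp(name, old_prefix, old_len) != 0)
        return -1;

    /* snprintf reports the length it wanted to write, so a result
     * >= out_size means the name was truncated. */
    written = snprintf(out, out_size, "%s%s", new_prefix, name + old_len);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}
```

For example, `jpeg_set_defaults` maps to `mozjpeg_set_defaults`, which matches the naming suggested earlier in this thread.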

Since distributions can't package mozjpeg

Why can't they? I see no problem with providing it as libjpeg.so with "provides: libjpeg, conflicts: libjpeg-turbo" type of metadata.

It should also be possible to either rename .so/.h or install them in some non-default location where applications would specifically look for it.

exports a huge image to PNG in 45 minutes

This is not a concern in Mozjpeg's case. The overhead is relatively constant. Deflate cost can vary by orders of magnitude, but MozJPEG's cost is mostly linear (comparable to encoding the image twice).
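As a rough illustration of what "mostly linear" would imply, here is a sketch that scales a measured encode time by pixel count. The strict proportionality is an assumption made for the sake of the example, and `estimate_encode_seconds` is a hypothetical helper; real timings will also depend on image content and hardware.

```c
/* Hypothetical helper: extrapolates an encode time to another image size,
 * assuming cost is proportional to the number of pixels.
 * Returns -1.0 if the reference size is not positive. */
double estimate_encode_seconds(double measured_seconds,
                               double measured_width, double measured_height,
                               double width, double height)
{
    double measured_pixels = measured_width * measured_height;

    if (measured_pixels <= 0.0)
        return -1.0;
    return measured_seconds * (width * height) / measured_pixels;
}
```

Plugging in the figures reported earlier in this thread (3.5 s and 13.5 s for 8192×8192), a 40000×40000 image would come out at roughly 83 s versus roughly 322 s under this assumption. These are extrapolations, not measurements.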

Also note that early comparisons between libjpeg-turbo and MozJPEG were focused on the fact that MozJPEG uses progressive JPEG, and libjpeg-turbo didn't have full SIMD optimizations for the progressive variant. So it wasn't just turbo vs MozJPEG, but turbo's fast path vs turbo's slow path. Turbo's poor handling of progressive JPEG was fixed a while ago, and MozJPEG benefited from these perf improvements too.

There is also image quality

I think you misunderstood. The point of MozJPEG is to improve quality/filesize ratio. It's a win-win: you get better quality for the same file size, or better file size for the same quality, or both. There is no downside in either quality or file size. MozJPEG tunes for these two aspects over speed. libjpeg-turbo's maintainer values speed over the other two variables.

MozJPEG has a few techniques. Improved splitting of progressive scans gives smaller file size while being 100% visually identical with libjpeg-turbo.

But MozJPEG also has trellis quantization and tuned quantization tables that give better visual quality, but on a microscopic scale they make different choices than libjpeg-turbo, so some pixels differ. The differences are relatively small and predictable, so there's no risk of unexpectedly ruining an image (especially since, on average, you get better quality).

It's on par with differences in JPEG encoding you get between different camera brands, or Photoshop vs libjpeg, or libjpeg v6 vs libjpeg v9.

it may kill libjpeg-turbo development

I don't see that threat from MozJPEG at all. This is not a competing product, this is a patch. 2/3rds of MozJPEG's improvements are just in improved default configuration of the exact same library. It used to be a perl script that passed tuned parameters to vanilla jpegtran. libjpeg-turbo is free to merge these changes if they ever become must-haves for the project.

It would be really silly to have both libjpeg-turbo and mozjpeg in the same binary, because they're 99% identical. If you configure mozjpeg with fastest defaults, it differs from libjpeg-turbo by a single if statement.

Mozjpeg passes libjpeg-turbo's test suite, including file checksums. In turbo mode it creates bit-identical files. It is the turbo library.

commented

libjpeg-turbo is free to merge these changes if they ever become must-haves for the project.

It would be ideal of course. 🙂

install them in some non-default location where applications would specifically look for it.

This is a bad idea. Ok for a personal script not for a proper lib. Distribution maintainers shouldn't (wouldn't) do this.

It should also be possible to either rename .so/.h

If the distrib does it, it's non-generic. Every distrib will do its own thing and software won't know what to search for (except for special-casing depending on distribution, which is not ok). If you could just do this renaming upstream (my proposition 2.), then no distribution has to do the renaming, and we know the one and only name to search for, whatever your distribution.

So bottom line: forget all the more complicated wrapper or namespacing proposals I made. Just renaming the mozjpeg lib would already be good (not the distrib, you, the upstream). 🙂

s/libjpeg.so/libmozjpeg.so/
s/libjpeg.pc/mozjpeg.pc/

Don't rename the header files though (otherwise it's not drop-in), these can go under a subdirectory, like $prefix/mozjpeg/jpeglib.h. This is not a problem because the .pc file will tell where the headers are (that's their job).

You do this, and I predict mozjpeg will start being packaged in distributions and used by GIMP in the next few months. 👍

@Jehan one point you might have missed is that mozjpeg also competes with WebP for still images without transparency. After testing on some of our shop photos, mozjpeg is often better on both size and quality.

commented

@zeroheure This is a bit irrelevant to the discussion, which is about allowing mozjpeg to be packaged (with a standard name) in distributions and used by programs such as GIMP.

I am not here to compare the formats (not even here to compare mozjpeg and jpeg-turbo). 🙂

@Jehan I know, but this point helps to understand why it is a drop-in replacement: update the web toolchain, and that's all.

commented

@zeroheure Still irrelevant. 😛 My patch #383 still allows the lib to be "dropped in". Anyway when you bundle applications with their dependencies (which is the only possibility so far without actually patching in each distribution, hence incompatibly, removing the whole point of packaging), you can do whatever you want. Rename or whatever, nobody cares, because the dep is for you only.

But when you want to package properly, you need to follow some kind of standards.

Basically with #383, we can still do everything we could do before, but now distributions can finally also package mozjpeg, and now we have a proper standardized way to detect and use it in GIMP. Without this, we won't have the feature in GIMP (maybe as a third-party plug-in which bundles the lib, but never upstream). On the other hand, if this is merged, within a week, it will end up in GIMP as an option. 🙂

@Jehan. Ok. I was trying to understand the drop-in choice for big web companies. Obviously, mozjpeg adoption by small teams and individuals is the holy grail, and it can't be achieved without distros packaging it, so it should not be a drop-in.