AcademySoftwareFoundation / OpenImageIO

Reading, writing, and processing images in a wide variety of file formats, using a format-agnostic API, aimed at VFX applications.

Home Page: https://openimageio.readthedocs.org

[BUG] OIIO::ImageBuf::nsubimages returns zero for existing image

zavinator opened this issue

Describe the bug

OIIO::ImageBuf::nsubimages returns zero for an existing image:

OIIO::ImageBuf buf("existing_file.jpg"); // some file (tested JPG and TIFF)
int nsubimages = buf.nsubimages(); 
// -> 0 (incorrect) for OIIO 2.5.10.1
// -> 1 (correct) for OIIO 2.4.17.0, 2.3.15.0

OpenImageIO version and dependencies

Incorrect:

OIIO 2.5.10.1 | unknown arch?
Build compiler: MSVS 1939 | C++14/199711
HW features enabled at build: sse2
Dependencies: , Boost 1.84.0, BZip2 1.0.8, FFmpeg NONE, Freetype 2.13.2, GIF 5.2.2, JPEG 62, Libheif NONE,
libjpeg-turbo 3.0.2, LibRaw NONE, OpenColorIO NONE, OpenCV NONE, OpenEXR 3.2.3, OpenJPEG 2.5, PNG 1.6.43, Ptex
NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.6.0, WebP NONE, ZLIB 1.3.1

Correct:

OIIO 2.4.17.0, same dependencies

We recently changed the meaning of this so that 0 means "we can't tell without trying to read the whole file." Some file formats know the count as soon as you open the file, while others need to read subimage by subimage, at considerable expense, to determine the total. We wanted to avoid that cost by having a "don't know, you'll just have to try" answer.

But this is flawed in two ways:

  1. Although we documented this in the "oiio:subimages" attribute retrieval in the standard metadata chapter of the docs, I neglected to also explain it for ImageBuf::nsubimages(), which is basically retrieving the same data and so will have the same property.
  2. Of course, formats like JPEG that are not capable of storing multiple images ought to return 1, because there is no uncertainty about how many there are.

I will address these with a patch. Sorry for the inconvenience.

I've also tested a TIFF with multiple images.
So how can I get the number of images? Is that possible via ImageBuf in the new version?
Or is this behaviour going to change in a future release?

TIFF is one of the formats where it's not going to be possible. There isn't a way to know when you open a tiff file; you have to read image by image in the file and see when you get to the end.

The question is, what do we want to happen when you call IB::nsubimages() on such a file? Should it return 0, meaning "no inexpensive way to know, you'll have to seek one by one and see where it stops succeeding"? Or should the call itself be considered a request to do the expensive thing?

Here's the history of how we got here:

We used to have every ImageBuf use ImageCache underneath, at least to get the header info, and ImageCache did the expensive thing of fully inventorying the header/spec of every image in the file up front (reading the pixel data was lazier). But this use of the IC came at a cost, and made IBs more expensive than necessary to read an image. You want IC when your IBs are so big that you can't hold all the images in memory, but when things fit in memory (which is usually the case), it's just slower than if you'd read directly.

So the change we made is that now IB only uses an IC underneath when specifically requested. Otherwise, it just does a direct read instead of going through the cache. I think that matches people's expectations (for both behavior and performance) better, and they can still be IC-backed for advanced users who request it.

But here's what we lost along the way -- we no longer have that up-front read that inventoried all the subimages (including knowing exactly how many there were). For some files, like OpenEXR, we can know as soon as we open the file. For other files, like TIFF, you just have to read them one by one, even to find out the count.

So here is my plan:

  • For image formats that don't support subimages, we will report 1 (the only possible value) instead of 0 (don't know). That will fix something like a jpeg file stupidly reporting 0.
  • For image formats that can know up front easily (like OpenEXR, I think), we can (and maybe already do?) report the correct number without extra expense.
  • For image formats that support multiple subimages but can't tell how many without reading the whole file (like TIFF), we have three choices:
    1. Continue to report 0 "we don't know" if it's expensive to find out the total.
    2. Interpret a call to nsubimages() as a request that should be fulfilled, even if it's expensive.
    3. Make a second call -- so we have one method that returns the value if inexpensive or 0/"don't know" if it's hard, and a second method that always returns the right result even if it's expensive.
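The options in this plan can be modeled without OpenImageIO at all. The following is only a sketch under stated assumptions: `MockFile`, `nsubimages_cheap`, and `nsubimages_exact` are hypothetical names invented for illustration and are not part of the OIIO API. It shows a cheap query that returns 0 for "don't know" next to a second query that always returns the true count, paying for header probes only when needed.

```cpp
#include <cassert>

// Hypothetical stand-in for an opened image file; NOT an OpenImageIO type.
struct MockFile {
    bool supports_subimages;  // false for single-image formats such as JPEG
    bool count_in_header;     // true when the count is known on open (e.g. OpenEXR)
    int  actual_subimages;    // ground truth, otherwise only found by seeking
    int  seeks;               // number of header probes performed so far

    // Pretend to seek to subimage i; succeeds only while i is in range.
    bool seek(int i) { ++seeks; return i >= 0 && i < actual_subimages; }
};

// Cheap query: never walks the file. 0 means "unknown without probing".
int nsubimages_cheap(const MockFile& f) {
    if (!f.supports_subimages) return 1;              // only possible value
    if (f.count_in_header) return f.actual_subimages; // known up front
    return 0;                                         // don't know
}

// Exact query: falls back to probing subimage by subimage when needed.
int nsubimages_exact(MockFile& f) {
    int n = nsubimages_cheap(f);
    if (n > 0) return n;
    while (f.seek(n)) ++n;  // stops at the first index that fails
    return n;
}
```

In this model, choice 1 corresponds to exposing only `nsubimages_cheap`, choice 2 to exposing only `nsubimages_exact`, and choice 3 to exposing both. For a TIFF-like file with N subimages, the exact query costs N + 1 probes.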

Everybody, please let me know your preferences.

Perhaps I can utilize this approach instead of calling the nsubimages function.

int i = 0;
while(img.read(i))
{
  // do something with img
  ++i;
}

I'm reading all the layers anyway, so I gain no performance from this myself, but maybe someone else can benefit from it...
By the way, this is exactly why I dislike updating libraries; there's always a chance something might break. :)

Yes, that is precisely what you have to do. Well, almost! Instead of read(), you can call init_spec(), which will only read the header and not the pixels, and so is less expensive than read().

That's what we used to do up front, but it's not an extra expense most people usually want.

I've just tested this approach (OIIO 2.4), but it seems that both init_spec() and read() still return true for an image index greater than nsubimages, so the loop never terminates.

OIIO::ImageBuf buf(filename);
int i = 0;
while (buf.init_spec(buf.name(), i, 0))
{
	if (!buf.read(i, 0)) break;
	assert(i < buf.nsubimages()); // OIIO 2.4 test
	++i;
}
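Until the underlying behavior is fixed, a probing loop like the one above can be guarded so that it cannot spin forever. This is a minimal sketch, not OIIO code: `count_by_probing`, `probe`, and `max_subimages` are hypothetical names, and the default cap of 4096 is an arbitrary assumption. The probe is any callable that reports whether subimage `i` exists. Wiring it to a header-only call such as init_spec() is left to the reader and is untested here.

```cpp
#include <cassert>
#include <functional>

// Count consecutive indices, starting at 0, for which `probe` succeeds.
// Returns -1 if the cap is reached, which signals that the probe never
// failed and the count cannot be trusted.
int count_by_probing(const std::function<bool(int)>& probe,
                     int max_subimages = 4096) {
    for (int i = 0; i < max_subimages; ++i) {
        if (!probe(i)) return i;  // first failing index equals the count
    }
    return -1;  // probe kept succeeding; treat as an error
}
```

A probe that succeeds for indices 0 through 2 yields a count of 3, while a probe that always succeeds yields -1 once the cap is hit instead of hanging.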

OK, I've got something to fix, I guess!

Improvements proposed in #4228