intel / bmap-tools

BMAP Tools

Support creating bmap files for compressed image files

andrewshadura opened this issue · comments

At the moment, when the image is compressed (e.g. with gzip), bmaptool generates a bmap file for the compressed file, not for the original image. Please add a (possibly optional) way to find holes in the original data of a compressed image.

Thanks.

Hi, if only this were possible. See, the idea is to create the bmap file right after the image is generated, while the holes are still there. As soon as you copy your file with a traditional "dumb" tool, or gzip it, the holes (unmapped areas) turn into zeroes (mapped areas). So you have lost them.

So the assumed usage model for this tool is:

  1. You generate your image in a smart way, starting with a "fully unmapped" file.
  2. Once all the data is put to the image, you have many holes there (very often).
  3. You use bmaptool to save the information about the holes into a separate bmap file.
  4. You do the rest of the stuff with your image - compress, publish at an http server, etc.
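To make steps 1 to 3 concrete, here is a minimal Python sketch of building a sparse file and then asking the filesystem which byte ranges are actually mapped. It is an illustration only, not bmaptool's implementation: the helper names `make_sparse_image` and `mapped_ranges` are made up for this example, and it assumes a Linux system where `os.SEEK_DATA` and `os.SEEK_HOLE` are available. On a filesystem without hole support the whole file is simply reported as one mapped range.

```python
import errno
import os
import tempfile


def make_sparse_image(path, size, chunks):
    """Create a fully unmapped file of `size` bytes, then write `chunks`
    (a dict of offset -> bytes) into it, leaving everything else as holes."""
    with open(path, "wb") as f:
        f.truncate(size)  # sets the length without writing any data
        for offset, data in chunks.items():
            f.seek(offset)
            f.write(data)


def mapped_ranges(path):
    """Return [(start, end), ...] byte ranges the filesystem reports as data."""
    ranges = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as err:
                if err.errno == errno.ENXIO:  # nothing but a hole up to EOF
                    break
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)  # EOF counts as a hole
            ranges.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return ranges


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image = os.path.join(tmp, "image.raw")
        # 1 MiB image with a single 4 KiB chunk of real data in the middle.
        make_sparse_image(image, 1 << 20, {512 * 1024: b"x" * 4096})
        print(mapped_ranges(image))
```

The exact ranges printed depend on the filesystem's allocation granularity, so the list is information to record while the holes still exist rather than something that can be recomputed later from the file contents.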

I think it should be possible to find some holes the same way cp does it (maybe just pipe through cp --sparse=always?). There should be a guarantee, however, that the holes give zeros when read after writing the image… achievable on sparse files only, I understand, but unlikely on real block devices.

Does gzip not store any metainformation to help distinguish holes from legitimate zeroes?

For an archiver like gzip, or any other, the zeros produced by the kernel for unmapped blocks and the zeros stored in mapped blocks are the same. None of the archivers we know about at the moment can preserve the sparseness of an image. Finding holes the way some applications do, just by checking whether a block of data is filled with zeros, is not possible in the scenarios where bmaptool is supposed to be used: a legitimately mapped block of zeros must be written to disk as zeros. If there is no information about which blocks are really mapped and which are not, all the data must be written.
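A small Python sketch of the point about archivers: a compressor only sees the bytes that `read()` returns, so a file that is one big hole and a file full of explicitly written zeros compress to the same stream. The helper names are invented for this example, and `mtime=0` is passed only so that the two gzip headers are comparable byte for byte.

```python
import gzip
import os
import tempfile

SIZE = 1 << 20  # 1 MiB


def write_with_hole(path):
    """Create a file that is entirely a hole (on sparse-capable filesystems)."""
    with open(path, "wb") as f:
        f.truncate(SIZE)


def write_with_zeros(path):
    """Create a file of the same size whose zeros are really written."""
    with open(path, "wb") as f:
        f.write(b"\0" * SIZE)


def gzip_of(path):
    """Compress the file contents the way any stream compressor sees them."""
    with open(path, "rb") as f:
        return gzip.compress(f.read(), mtime=0)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        holey = os.path.join(tmp, "holey.img")
        zeroed = os.path.join(tmp, "zeroed.img")
        write_with_hole(holey)
        write_with_zeros(zeroed)
        # Same input bytes, so the compressed streams cannot differ.
        print(gzip_of(holey) == gzip_of(zeroed))
```

Since the compressed streams are identical, nothing in the `.gz` file can tell a later tool which zeros were holes.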

OK, so the discussion forked to the cp area, which is worth discussing too. But regarding gzip: no, I believe it does not. I have not deliberately verified this, but I am quite sure.

Regarding cp... Modern "cp" on Linux may preserve holes. I have not tried lately, and it depends on the file-system. E.g., VFAT does not have this concept. The --sparse=always option is harmful (in our case, not generally). See, we are not talking about just a file, this is a disk image. It is supposed to be deployed to a real device. The --sparse=always option is about disk space optimization: it turns long blocks of zeroes into holes. Holes take virtually no disk space.

Now, can you use '--sparse=always' for an image? No. Imagine that inside the image there is a file which contains zeroes. When we flash the image to the target device and run the device, we expect to see zeroes in that file. If bmaptool treats those blocks of zeroes in the image file as a hole, it'll skip them at the time of writing the image to the target device. And the file will end up containing whatever garbage was in those disk/flash blocks, but not zeroes.
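The failure described above can be simulated entirely in memory. In this sketch the "device" is a bytearray pre-filled with 0xFF to stand in for stale content, block 1 of the image is a legitimately mapped block of zeroes (the file containing zeroes), and block 2 is a real hole. All names and the 4096-byte block size are assumptions made for the illustration, not anything taken from bmaptool.

```python
BLOCK = 4096
GARBAGE = b"\xff" * BLOCK  # stand-in for whatever was on the device before


def flash(image, device, should_skip):
    """Copy `image` onto `device` block by block, skipping any block for
    which should_skip(index, block) returns True."""
    for index in range(len(image) // BLOCK):
        start = index * BLOCK
        block = image[start:start + BLOCK]
        if should_skip(index, block):
            continue
        device[start:start + BLOCK] = block


def build_example():
    """Blocks 0 and 3 hold data, block 1 is mapped zeroes, block 2 is a hole
    (which also reads back as zeroes)."""
    image = b"A" * BLOCK + bytes(BLOCK) + bytes(BLOCK) + b"B" * BLOCK
    mapped = {0, 1, 3}  # what a block map recorded at image creation would say
    return image, mapped


def flash_guessing_holes(image):
    """Treat every all-zero block as a hole, as --sparse=always would."""
    device = bytearray(GARBAGE * (len(image) // BLOCK))
    flash(image, device, lambda index, block: not any(block))
    return device


def flash_with_block_map(image, mapped):
    """Skip only the blocks that the recorded block map says are unmapped."""
    device = bytearray(GARBAGE * (len(image) // BLOCK))
    flash(image, device, lambda index, block: index not in mapped)
    return device


if __name__ == "__main__":
    image, mapped = build_example()
    guessed = flash_guessing_holes(image)
    exact = flash_with_block_map(image, mapped)
    print("guessing: block 1 is garbage:", bytes(guessed[BLOCK:2 * BLOCK]) == GARBAGE)
    print("block map: block 1 is zeroes:", bytes(exact[BLOCK:2 * BLOCK]) == bytes(BLOCK))
```

With guessing, block 1 keeps the old device content instead of zeroes. With the recorded map, only block 2 is left untouched, which is harmless because nothing in the image refers to it.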

Let me know if I failed to clearly explain this and I will try again.

Sounds to me like this is possibly outside the scope of bmaptool?

I think in general this kind of feature may be interesting. It is just a matter of proper implementation, good integration, and maybe documenting it too.

Let's keep this feature request open for some time. If there are no takers, we'll close it and move it to a "WishList.txt" file in the project itself.