Fix for BC1 encode errors with random colors in constant blocks

Question

Fix for BC1 encode errors with random colors in constant blocks

alecazam opened this issue 4 years ago · comments

In kramv, I kept seeing BC1 errors with a texture that I've linked to. It's a color texture with mostly grayscale in it. And green, yellow, and purple blocks would appear in the image where there were none originally. With the block view, I could tell these were erroneous constant blocks.

I think I might have found the issue. By decrementing the min16--, if the last color value is 0, then the entire color rolls over and can set high red or blue colors to high values. That's fine provided that color isn't the one picked from the selectors, but the selectors look like they're reversed and picking the modified color (or the non-zero color). I also added some 'else' statements, and consolidated the flow is a little clearer.

I didn't yet check the other encoders to see if this same logic was repeated.

https://github.com/alecazam/kram/blob/main/tests/src/Toof-a.png
kram encode -v -f bc1 -srgb -i ~/kram/tests/src/Toof-a.png -o ~/kram/tests/out/mac/Toof-fixbc1-a.ktx

void encode_bc1_solid_block(void* pDst, uint32_t fr, uint32_t fg, uint32_t fb, bool allow_3color) 
{
		... I have allow_3color disabled, so it goes directly into this code

if (max16 == -1)
{
    // 565 endpoints, these should be same color
    max16 = (g_bc1_match5_equals_1[fr].m_hi << 11) | (g_bc1_match6_equals_1[fg].m_hi << 5) | g_bc1_match5_equals_1[fb].m_hi;
    min16 = (g_bc1_match5_equals_1[fr].m_lo << 11) | (g_bc1_match6_equals_1[fg].m_lo << 5) | g_bc1_match5_equals_1[fb].m_lo;

    if (min16 == max16)
    {
        // Always forbid 3 color blocks
        // This is to guarantee that BC3 blocks never use punchthrough alpha (3 color) mode, which isn't supported on some (all?) GPU's.
    
        // Make l > h
        if (min16 > 0)
        {
            mask = 0x55; // b01 x 4, choose max = color, <- was b00 x 4 which picks modified min 
            
            min16--; // this can raise channels if b is 0, but mask ignores min color
        }
        else
        {
            mask = 0; // b00 x 4, choose min = 0,  <- was 0x55 b01 x 4 which picks modified max16 = 1 which isn't original min16=max16=0 
            
            // l = h = 0
            assert(min16 == max16 && max16 == 0);

            // here we know both colors are 0, so make max16 bigger
            max16 = 1;
            min16 = 0;
        }

        assert(max16 > min16);
    }
    else if (min16 > max16)
    {
        std::swap(max16, min16);
        mask ^= 0x55; // b01 x 4, choose max
    }

....
}

Alec Miller · Answer 1 · Mon Jan 25 2021 05:21:18 GMT+0800 (China Standard Time)

Note, this doesn't quite satisfy me as the solution, since the blocks shift around the image, and the blocks were on a grayscale image, so that bottom channel is never zero. I still think the selectors are reversed, but I don't quite understand the logic above on a constant block for why min16 and max aren't always the same color. There's a mask in there of 0xAA interpolation b10 x4, then 0x55 and 0x00 are the min/max. Might be better for RDO to always use the 0 selector, and munge the max16 value.

Rich Geldreich · Answer 2 · Mon Jan 25 2021 05:46:08 GMT+0800 (China Standard Time)

Hi,
Thanks for the report. I'm trying to reproduce this with bc7enc.exe, but I've been unable so far. I tried a bunch of options. Can you show me how to repro it using just the command line tool? Thanks!

Example:
bc7enc -c -1 J:\dev\test_images\Toof-a.png -o

J:\dev\bc7enc-master\Release>bc7enc -c -1 J:\dev\test_images\Toof-a.png -o
Compressing to BC1
Source image: J:\dev\test_images\Toof-a.png 535x609
Level: 2, use 3-color mode: 0, use 3-color mode for black: 0, bc1_mode: 0
..
Total time: 0.002000 secs
Wrote DDS file Toof-a.dds
Luma Max error: 44 RMSE: 2.942577 PSNR 38.76 dB
RGB Max error: 48 RMSE: 3.065503 PSNR 38.40 dB
RGBA Max error: 48 RMSE: 2.654803 PSNR 39.65 dB
Red Max error: 48 RMSE: 3.116241 PSNR 38.26 dB
Green Max error: 44 RMSE: 2.957593 PSNR 38.71 dB
Blue Max error: 48 RMSE: 3.119874 PSNR 38.25 dB
Alpha Max error: 0 RMSE: 0.000000 PSNR 100.00 dB
Wrote PNG file Toof-a_unpacked.png
Wrote PNG file Toof-a_unpacked_alpha.png

Rich Geldreich · Answer 3 · Mon Jan 25 2021 05:46:25 GMT+0800 (China Standard Time)

This was compiled with VS 2019, x64 build.

Alec Miller · Answer 4 · Mon Jan 25 2021 05:59:31 GMT+0800 (China Standard Time)

I have only seen it out of kram on macOS-Intel. I haven't tried with the bc7enc command line tool. Just see if the logic above seems better I guess for now. Again, I have "3 color mode" disabled, but your output indicates it's enabled.

Rich Geldreich · Answer 5 · Mon Jan 25 2021 06:01:16 GMT+0800 (China Standard Time)

I disabled 3 color mode in that test, using -c:
"Level: 2, use 3-color mode: 0, use 3-color mode for black: 0, bc1_mode: 0"

I tried with and without -c.

AMD Compressonator also decompresses the generated .DDS file OK. I also single stepped through the code, and it appears to be working as designed. This is a tricky code path though, so perhaps there is a bug - how are you decompressing the block's data to 32-bit RGBA texels?

When min16==max16:

if min16>0 then min16 will be 1 less than max16, and the mask will be all 0's selecting the first BC1/DXT1 color (which is max16, or "color_0" in the MS docs).
https://docs.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-programming-guide-resources-block-compression#bc1
if min16==0, wet set the mask bytes to 0x55 (0b1010101 - or 1 for each texel), set max16 to 1 and min16 to 0. This selects min16, which is written to the second color in the block (color_1).

The check for "if (max16 < min16)" should never occur, so the mask will always be either 0 or 0x55.

Rich Geldreich · Answer 6 · Mon Jan 25 2021 06:02:59 GMT+0800 (China Standard Time)

Here are the generated files, including the .DDS file:

images.zip

Rich Geldreich · Answer 7 · Mon Jan 25 2021 06:04:40 GMT+0800 (China Standard Time)

PVRTexTool also loads and shows the generated .DDS file okay:

Alec Miller · Answer 8 · Mon Jan 25 2021 06:06:34 GMT+0800 (China Standard Time)

I decompress with BC7enc's decoder as well. So they should both line up. For some reason, with and without the change, I'm not seeing the artifacts right now, or I'd send you them. But I though selector b00 was min, and b01 was max, and those values looked flipped. Sorry, not trying to waste your time, but this was more obvious in my earlier tests.

Ugh, okay I assumed color0 was min, and color1 was max. That makes sense then. I don't actually decode BC format on macOS-Intel, since it's a supported format. So in this case, it would have all been in the encode/mip logic of kram, but I didn't see the incorrect blocks with my other BC encoders and same mip logic (squish and ate).

These were purple and green blocks that appeared in random spots on that mostly grayscale image. I should have screen captured it at the time. And they continued in long horizontal stripes, so the next block over would be less pink, until it went back to the grayscale like a gradient ramp.

Rich Geldreich · Answer 9 · Mon Jan 25 2021 06:09:26 GMT+0800 (China Standard Time)

Just to be sure, I compiled & ran under ubuntu 18 using clang, and the generated file is OK. If you can repo the bug with the bc7enc tool, then I can reproduce it here.

(Don't worry about the report, even if it's a false alarm - I appreciate it!)

Alec Miller · Answer 10 · Mon Jan 25 2021 06:14:32 GMT+0800 (China Standard Time)

Also don't use that PVRTexToolGUI version, it doesn't display or import srgb and colors correctly. Even the new one doesn't, but it has a nicer UI. The one on macOS doesn't decode BC4-7, so it's not as useful to me.

Alec Miller · Answer 11 · Sat May 22 2021 13:51:25 GMT+0800 (China Standard Time)

So now I'm seeing this with the bc7 encode. I have that same image that is all opaque, but many of the blocks are turned into mode 6 blocks. These blocks are wasting 2x7 bits encoding alpha, since my version of bc7enc doesn't support mode 3 or rgb-only blocks I guess.

When this occurs, the blocks decode as 254 alpha in my compute sampler that just point samples from the bc7 texture. I decoded the A0/A1 chunks and they're 127, but not sure how bc7 converts that to a 255 alpha or 254 alpha. I know all the incoming source data is 255 since I can see it in the debugger. It's the same image as above (Toof-a.png).

I just got KTX2 into kram (no UASTC yet), and the debug modes are flagging all blocks as transparent (a = 254) on opaque images. My shaders take slower paths on non-opaques, so I want to make sure the images stay fully opaque. astc-enc and etc2comp are preserving the alpha, but I need to look into a gray -> color change in etc2comp blocks.

It's block 8,1 in the image (counting from 0). These are the gray values for the 4x4 block. The alpha are all 255. I see it take the opaque block encode path in the bc7 path. Just sharing this, in case you have time to track it down faster than I. I don't have any bc7 block printing calls, so I'll have to build those up. This might also be a good test case for the BC1 issue above, but I'll need to test that again.

251, 251, 251, 252,
249, 248, 247, 247,
244, 245, 243, 244,
241, 240, 240, 240

This is in bc7decomp
mode 6
R0=252 R1=240
G0=252, G1=240
B0=252, B1=240
A0=254, A1=254
P0=0, P1=0

I get correct bc7 alpha from ATE (apple tex encoder), but it may use the mode 3 blocks.

Alec Miller · Answer 12 · Sat May 22 2021 14:23:08 GMT+0800 (China Standard Time)

If it's at all helpful, here are the ATE version and the bc7enc encoded version. I can't see the block modes across the blocks in my tools yet. The ATE bc7 is fully opaque across all mips. The bc7enc version is non-opaque when I look at all mip levels and it gets worse even one mip level down, with alpha going to 248 and lower.

Toof-Archive.zip

Alec Miller · Answer 13 · Sun May 23 2021 02:04:47 GMT+0800 (China Standard Time)

So even when I run bc7decomp on that block immediately afterwards, the Alpha is FE (254). So there's seems to be something amiss with using Alpha blocks in BC7 to encode opaque pixels.

[0] = FEFBFBFB FEFBFBFB FEFBFBFB FEFCFCFC
[1] = FEF9F9F9 FEF8F8F8 FEF7F7F7 FEF7F7F7
[2] = FEF4F4F4 FEF5F5F5 FEF3F3F3 FEF4F4F4
[3] = FEF1F1F1 FEF0F0F0 FEF0F0F0 FEF0F0F0

I could see where alpha = 255 if the p0 and p1 bit was set, but they are both 0, but I assume that's limiting.
const uint32_t a1 = (uint32_t)((block.m_lo.m_a1 << 1) | block.m_hi.m_p1);

It looks like ATE is also using a mode6 block, but setting both p0/p1 to 1, so then the reconstruct generates 255.

Alec Miller · Answer 14 · Sun May 23 2021 04:22:49 GMT+0800 (China Standard Time)

Just to follow up, I was able to fix the alpha pulldown across the mips. It was a problem in my non-pow2 mipgen where weights needed to be renormalized and I needed some rounding/snapping to 8-bits. So now it's just constant blocks across the mip chain that are 254 when those pbits are 0 on the mode 6 blocks, but that's at the encode level.

Alec Miller · Answer 15 · Sun May 23 2021 06:54:06 GMT+0800 (China Standard Time)

This is one hacky fix inside find_optimal_solution() for this specific case. This always picks pbits of 1 if no alpha. May need to be specific to mode6 if rgb modes added later.

static uint64_t find_optimal_solution(uint32_t mode, vec4F xl, vec4F xh, const color_cell_compressor_params *pParams, color_cell_compressor_results *pResults)
{
....
	  for (int p = pParams->m_has_alpha ? 0 : 1; p < 2; p++)

Rich Geldreich · Answer 16 · Thu May 27 2021 15:20:34 GMT+0800 (China Standard Time)

Ok- I don't view this as a bug exactly, but I understand why you do. The encoder is trying to optimize for lowest overall error, and it chooses the p-bits that do this, independent of alpha. (Because to the encoder, alpha is just another channel - I can't assume it means transparency.) However, I can add a mode or feature that does what you want, because it's clearly important. Thanks for bringing this up!

Also the latest version of the encoder lives here:
https://github.com/richgel999/bc7enc_rdo

Alec Miller · Answer 17 · Fri May 28 2021 03:56:24 GMT+0800 (China Standard Time)

In bc7enc a flag to distinguish whether the mode supports alpha, and the existing one for whether the block does might help here. RGB1 style mode 3 would have full p-bits choice. But mode 6 using a transparent mode for a fully opaque texture would limit the p-bits choice to 11

And yeah, I'm a bit far back on bc7enc and astcencoder. Will make an update when I get a chance.