XMLoadUNibble4, XMLoadU555 can cause crash at memory boundary

Question

gegogi opened this issue 4 years ago

The SSE implementations of these functions load the source with _mm_load_ps1(const float*), which reads four bytes.
But XMUNIBBLE4 and XMU555 are 16-bit packed types, so at the end of a memory block the load can read up to two bytes past the end of the allocation.

I hit a crash while converting a tightly packed RGBA4444 image to an RGBA8888 image with DirectXTex. It appears to happen while loading the final scanline of the source image, and I traced it to this SSE code:

inline XMVECTOR XM_CALLCONV XMLoadUNibble4
(
     const XMUNIBBLE4* pSource
)
{
    assert(pSource);
    static const XMVECTORI32 UNibble4And = { { { 0xF, 0xF0, 0xF00, 0xF000 } } };
    static const XMVECTORF32 UNibble4Mul = { { { 1.0f, 1.0f / 16.f, 1.0f / 256.f, 1.0f / 4096.f } } };
    // Get the 32 bit value and splat it
    XMVECTOR vResult = _mm_load_ps1(reinterpret_cast<const float *>(&pSource->v));
    // Mask off x, y and z
    vResult = _mm_and_ps(vResult,UNibble4And);
    // Convert to float
    vResult = _mm_cvtepi32_ps(_mm_castps_si128(vResult));
    // Normalize x, y, and z
    vResult = _mm_mul_ps(vResult,UNibble4Mul);
    return vResult;
}

Kyung-Kook Park · Answer 1 · Mon Apr 05 2021 15:36:07 GMT+0800 (China Standard Time)

I am just adding a comment to check whether this is a bug or not. Is Microsoft still maintaining the code?

Chuck Walbourn · Answer 2 · Tue Apr 06 2021 06:47:11 GMT+0800 (China Standard Time)

Sorry, I missed this bug report. I'll take a look at it for a future release.

Chuck Walbourn · Answer 3 · Thu Sep 09 2021 12:04:39 GMT+0800 (China Standard Time)

The same problem exists in the load functions for XMUNIBBLE4, XMU555, XMU565, XMBYTEN2, XMBYTE2, XMUBYTEN2, and XMUBYTE2.

Chuck Walbourn · Answer 4 · Sat Sep 11 2021 12:03:16 GMT+0800 (China Standard Time)

Note that the _mm_loadu_si16 intrinsic is the right choice here, but it's only defined in VS 2017, clang v8, and GNUC 11 or later. This will cause problems with GNUC 9/10 scenarios on WSL.
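
For illustration only, here is a minimal sketch of one way to read exactly two bytes without depending on _mm_loadu_si16, so it should also build on older compilers. This is not the code from the library or from the fix. The helper names LoadU16Splat and LoadUNibble4Safe are hypothetical, and a plain uint16_t stands in for XMUNIBBLE4::v.

```cpp
#include <cstdint>
#include <cstring>
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper: read exactly 16 bits from p and splat the value
// (zero-extended to 32 bits) across all four lanes.
inline __m128 LoadU16Splat(const void* p)
{
    std::uint16_t bits;
    std::memcpy(&bits, p, sizeof(bits));                  // touches only 2 bytes
    __m128i v = _mm_cvtsi32_si128(static_cast<int>(bits)); // lane 0 = value, rest 0
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 0, 0, 0));     // copy lane 0 to all lanes
    return _mm_castsi128_ps(v);
}

// Hypothetical stand-in for XMLoadUNibble4 that uses the 2-byte load above.
// The mask and scale constants mirror the ones in the quoted function.
inline __m128 LoadUNibble4Safe(const std::uint16_t* pSource)
{
    const __m128i mask = _mm_set_epi32(0xF000, 0xF00, 0xF0, 0xF);
    const __m128 scale = _mm_set_ps(1.0f / 4096.f, 1.0f / 256.f, 1.0f / 16.f, 1.0f);

    __m128 v = LoadU16Splat(pSource);
    v = _mm_and_ps(v, _mm_castsi128_ps(mask));  // isolate one nibble per lane
    v = _mm_cvtepi32_ps(_mm_castps_si128(v));   // integer lanes to float
    return _mm_mul_ps(v, scale);                // shift each nibble down to 0..15
}
```

The memcpy is typically compiled to a single 16-bit load, so the only behavioral difference from the original is that the two bytes after the source element are never touched.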

Chuck Walbourn · Answer 5 · Sat Sep 11 2021 13:20:04 GMT+0800 (China Standard Time)

Addressed the issue with GNUC 9, 10 in this commit