pow of literal 0.0f not optimized away in some cases

Question

tex3d opened this issue 2 months ago

Description
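When using the float pow(float x, float y) intrinsic, supplying a literal 0.0f for x (with a positive y) should fold to a constant 0.0f. In some cases the compiler does not perform this fold and leaves the log+mul+exp expansion in the output.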

Steps to Reproduce

// dxc -T ps_6_0 -E main

#define TEST 0
// TEST = 0,3 - fails to optimize away log+mul+exp
// TEST = 1,2,4 - optimizes to return 0.0f

float main() : SV_Target0
{
#if TEST == 0
    float x = 0.0f;
    return pow(x, 2.2f);
#elif TEST == 1
    const float x = 0.0f;
    return pow(x, 2.2f);
#elif TEST == 2
    float x = 0.0f;
    return pow(x, 2.0f);
#elif TEST == 3
    return pow(saturate(0.0f), 2.2f);
#else
    return pow(0.0f, 2.2f);
#endif
}

See repro in Compiler Explorer.

Actual Behavior
With TEST = 0 or 3, the compiler fails to optimize away the log+mul+exp sequence.
With TEST = 1, 2, or 4, it optimizes to return 0.0f (the expected behavior for all cases).

Environment

DXC version: 1.8.2403 and latest main
Host Operating System: any

Tex Riddell · Answer 1 · Tue Apr 23 2024 06:57:45 GMT+0800 (China Standard Time)

I think the root cause is that we generate the DXIL op expansion log2+mul+exp during CodeGen, because the HL operation is not constant-evaluated at that stage. When it later comes time to constant-evaluate the DXIL intrinsics, we bail if any FP exception occurs (here log2(0) = -INF), so the operations are not eliminated. In fact, I'm a bit surprised it succeeds for some of the cases here.
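To make the failure mode concrete, here is a minimal C++ sketch of a conservative folder walking the exp2(y * log2(x)) expansion one step at a time. It is not DXC's code: the names tryFoldPowExpansion and foldable are hypothetical, and "an FP exception occurred" is approximated by "an intermediate value is non-finite".

```cpp
#include <cmath>

// Hypothetical stand-in for "an FP exception occurred": treat any non-finite
// intermediate value as a reason to give up on folding.
static bool foldable(float v) { return std::isfinite(v); }

// Models folding pow(x, y) after it has been expanded to exp2(y * log2(x)).
// Returns true and writes `out` only if every step folds cleanly.
bool tryFoldPowExpansion(float x, float y, float &out) {
    float l = std::log2(x);           // log2(0.0f) is -inf
    if (!foldable(l)) return false;   // a conservative folder bails here
    float m = y * l;
    if (!foldable(m)) return false;
    float e = std::exp2(m);
    if (!foldable(e)) return false;
    out = e;
    return true;
}
```

Under this model, folding x = 0.0f stops at the very first step, so the multiply and exp2 that depend on it are left in place as well.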

First, I think the general issue needs to be solved so that the DXIL ops produce well-defined results instead of bailing in FP special cases.
Second, we should have evaluation of more HL operations on literals before lowering to DXIL expansions in the first place.
Third, for ops with reasonable special cases like this, we can detect these and replace them with the expected result during CodeGen.
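The first and third suggestions can be sketched in C++ as follows. This is an illustration only, not a proposed patch: foldPowExpansionIEEE and tryFoldPowOfZero are hypothetical names, and the special case is deliberately limited to x equal to +0.0f with y > 0.

```cpp
#include <cmath>

// First suggestion (sketch): evaluate the expansion with IEEE-754 semantics
// instead of bailing. log2(+0) is -inf, a positive y times -inf is -inf,
// and exp2(-inf) is +0.
float foldPowExpansionIEEE(float x, float y) {
    return std::exp2(y * std::log2(x));
}

// Third suggestion (sketch): recognize pow(+0, y) with y > 0 before any
// expansion is generated. Negative zero and y <= 0 are left to other paths.
bool tryFoldPowOfZero(float x, float y, float &out) {
    if (x == 0.0f && !std::signbit(x) && y > 0.0f) {
        out = 0.0f;
        return true;
    }
    return false;
}
```

Both routes give 0.0f for pow(0.0f, 2.2f). The special-case check avoids generating the expansion at all, while the IEEE-style evaluation would also cover other inputs that pass through infinities.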