microsoft / DirectXShaderCompiler

This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.

pow of literal 0.0f not optimized away in some cases

tex3d opened this issue

Description
When using the float pow(float x, float y) intrinsic, supplying a literal 0.0f for x (with a positive y) should return 0.0f, so the whole call should be optimized down to that constant. In some cases it is not.

Steps to Reproduce

// dxc -T ps_6_0 -E main

#define TEST 0
// TEST = 0,3 - fails to optimize away log+mul+exp
// TEST = 1,2,4 - optimizes to return 0.0f

float main() : SV_Target0
{
#if TEST == 0
    float x = 0.0f;
    return pow(x, 2.2f);
#elif TEST == 1
    const float x = 0.0f;
    return pow(x, 2.2f);
#elif TEST == 2
    float x = 0.0f;
    return pow(x, 2.0f);
#elif TEST == 3
    return pow(saturate(0.0f), 2.2f);
#else
    return pow(0.0f, 2.2f);
#endif
}

See repro in Compiler Explorer.

Actual Behavior
When TEST is 0 or 3, the compiler fails to optimize away the log+mul+exp sequence.
When TEST is 1, 2, or 4, it optimizes to return 0.0f (the expected behavior for all cases).

Environment

  • DXC version: 1.8.2403 and latest main
  • Host Operating System: any

I think the root cause of this issue is that we generate the DXIL op expansion log2+mul+exp, since we had no constant evaluation of the HL operation during CodeGen. When it comes time to constant-evaluate the DXIL intrinsics, we bail if any FP exception occurs (here, log2(0) = -INF), so the operations are not eliminated. In fact, I'm a bit surprised it succeeds for some of the cases here.
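To make the failure mode concrete, here is a minimal host-side sketch in C++ of what evaluating the expansion for x = 0 looks like. It is not DXC code: `ExpansionResult` and `EvalPowExpansion` are hypothetical names, and it assumes the host math library follows IEEE 754 / C Annex F behavior, where log2(0) returns -INF and raises the divide-by-zero flag. The arithmetic still lands on 0, but an exception flag is set along the way, which is the kind of signal a conservative constant evaluator would treat as a reason to bail.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not DXC code): the intermediate values of the
// expansion exp2(y * log2(x)), plus whether any step raised a
// divide-by-zero, invalid, or overflow FP exception on the host.
struct ExpansionResult {
    float log2_x;              // log2(x)
    float product;             // y * log2(x)
    float value;               // exp2(y * log2(x))
    bool  raised_fp_exception; // true if any of the flags above was set
};

inline ExpansionResult EvalPowExpansion(float x_in, float y_in) {
    // This sketch only models ordinary literal operands, not NaN inputs.
    assert(!std::isnan(x_in) && !std::isnan(y_in));

    // volatile keeps the host compiler from folding the math at compile
    // time, so the exception flags reflect what actually executed.
    volatile float x = x_in;
    volatile float y = y_in;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float l = std::log2(x);  // x == 0: -INF, raises FE_DIVBYZERO
    volatile float m = y * l;         // positive y times -INF stays -INF
    volatile float e = std::exp2(m);  // exp2(-INF) is exactly +0
    const bool raised =
        std::fetestexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW) != 0;

    return {l, m, e, raised};
}
```

Calling `EvalPowExpansion(0.0f, 2.2f)` should yield a `value` of 0.0f with `raised_fp_exception` set, while an input such as `(2.0f, 3.0f)` evaluates to about 8 with no flag raised.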

First, I think the general issue needs to be solved so that constant evaluation produces the well-defined results for the DXIL ops instead of bailing in FP special cases.
Second, we should evaluate more HL operations on literals before lowering to DXIL expansions in the first place.
Third, for ops with reasonable special cases like this one, we can detect those cases and replace them with the expected result during CodeGen.
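The three suggestions could be combined roughly as in the sketch below. This is a standalone C++ illustration rather than anything from the DXC source: `TryFoldPowLiteral` is a hypothetical name, the operands are assumed to already be known literal floats, and the choice to leave NaN results unfolded is an assumption of this sketch, not a stated DXC policy.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch (not DXC code): try to fold pow(x, y) on literal
// operands before it would be lowered to log2 + mul + exp.
// Returns true and writes the folded constant to *out on success.
inline bool TryFoldPowLiteral(float x, float y, float* out) {
    assert(out != nullptr);

    // Special case from this issue: pow(+0, y) with y > 0 is +0, so the
    // expansion never needs to be emitted at all.
    if (x == 0.0f && !std::signbit(x) && y > 0.0f) {
        *out = 0.0f;
        return true;
    }

    // General case: evaluate the same expansion the lowering would emit
    // and accept IEEE special values such as +/-INF as intermediates,
    // instead of bailing as soon as one appears.
    const float folded = std::exp2(y * std::log2(x));

    // Assumption of this sketch: NaN results are left for later passes.
    if (std::isnan(folded)) {
        return false;
    }
    *out = folded;
    return true;
}
```

With this shape, `TryFoldPowLiteral(0.0f, 2.2f, &r)` succeeds through the special case, and a negative base such as `-1.0f` is left unfolded because log2 of a negative number is NaN.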