clang/LLVM returns bad results with specific optimization settings
shinjich opened this issue · comments
Building with clang/LLVM + /Ox + /fp:fast for SSE/SSE2 makes XMVectorRound generate bad results. XMVectorRound is used in trigonometric functions such as XMVectorSinCos.
Repro code:
https://github.com/shinjich/directxmathtest/tree/master/VerifyRounding
The reproduction code compares the results of the standard library, the original XMVectorSinCos, and a modified version of XMVectorSinCos.
Results when compiled and executed with:
/Ox /fp:fast (Bad results)
https://github.com/shinjich/directxmathtest/blob/master/VerifyRounding/_Result_fp_fast.txt
/Ox /fp:precise (Good results)
https://github.com/shinjich/directxmathtest/blob/master/VerifyRounding/_Result_fp_precise.txt
/Ox /fp:fast + SVML (Good results)
https://github.com/shinjich/directxmathtest/blob/master/VerifyRounding/_Result_fp_fast_SVML.txt
This issue was not seen with MSVC.
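For readers wondering how a rounding helper can be sensitive to /fp:fast at all, here is a minimal scalar sketch of the add/subtract "magic number" rounding idiom. It is an illustration only: the function name round_via_magic is hypothetical, this is not the library's code, and the thread does not confirm that this is the exact mechanism behind the bad results. The point is that the idiom relies on the intermediate sum being rounded to float, while fast-math modes permit a compiler to simplify (x + c) - c to x, which would remove the rounding entirely.

```cpp
#include <cmath>

// Hypothetical scalar analogue of a "magic number" round (not DirectXMath code).
// Assumes the default round-to-nearest-even mode and strict IEEE float arithmetic.
inline float round_via_magic(float x)
{
    const float magic = 8388608.0f; // 2^23: floats at or above this have no fraction bits

    // Values this large are already integral; return them unchanged.
    if (std::fabs(x) >= magic)
        return x;

    // Give the magic constant the same sign as x so the trick works for negatives.
    const float signedMagic = std::copysign(magic, x);

    // Adding pushes the fraction bits out of the mantissa, so the sum is rounded
    // to an integer. Subtracting then recovers that integer.
    // Under fast-math, a compiler may fold these two steps to just `x`.
    const float shifted = x + signedMagic;
    return shifted - signedMagic;
}
```

Built without fast-math, round_via_magic(2.3f) yields 2.0f and round_via_magic(-1.7f) yields -2.0f. If the same source is built with fast-math and the result comes back unrounded, that would be consistent with the behavior reported above.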
The issue here is that #pragma float_control was not in place for clang as it is for MSVC, but more importantly this pragma on clang doesn't work correctly for SSE intrinsics :(
Therefore the recommendation is going to be to NOT use /fp:fast for clang until this is resolved.
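One way a project could apply this recommendation on its own side is a compile-time check. The sketch below is an assumption-laden illustration, not something the library ships: the macro EXAMPLE_CLANG_FAST_MATH and the function clang_fast_math_enabled are hypothetical names, and it assumes clang defines __FAST_MATH__ when fast floating-point is in effect. That is the case for -ffast-math; whether /fp:fast under clang-cl sets it should be verified for your toolchain.

```cpp
// Hypothetical guard macro (illustrative name, not part of DirectXMath).
// Assumption: clang defines __FAST_MATH__ when fast-math is enabled.
#if defined(__clang__) && defined(__FAST_MATH__)
#define EXAMPLE_CLANG_FAST_MATH 1
#else
#define EXAMPLE_CLANG_FAST_MATH 0
#endif

// Lets ordinary code (or a static_assert) query the detected configuration.
constexpr bool clang_fast_math_enabled()
{
    return EXAMPLE_CLANG_FAST_MATH != 0;
}

// To turn the recommendation into a hard build error, a project could add:
//   static_assert(!clang_fast_math_enabled(),
//                 "Avoid /fp:fast with clang until the rounding issue is resolved");
```

Keeping the static_assert commented out leaves the check opt-in, so existing builds are not broken by including the header.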