5.6.0: test suite is failing in two units
kloczek opened this issue
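The test suite fails in two units, compile_atan2__gfx700 and compile_atan2pi__gfx700. In both, clang 16.0.6 aborts in the AMDGPU backend with "Cannot select" on an f16 fma node: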
+ cd ROCm-Device-Libs-rocm-5.6.0
+ /usr/bin/ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure --force-new-ctest-process -j48
Internal ctest changing into directory: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu
Test project /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu
Start 1: constant_fold_lgamma_r__gfx900
Start 2: constant_fold_lgamma_r__gfx1030
Start 3: compile_asin__gfx700
Start 4: compile_atan2__gfx700
Start 5: compile_atan2pi__gfx700
Start 6: compile_asin__gfx803
Start 7: compile_atan2__gfx803
Start 8: compile_atan2pi__gfx803
Start 9: compile_frexp__gfx600
Start 10: compile_frexp__gfx700
1/10 Test #3: compile_asin__gfx700 .............. Passed 0.14 sec
2/10 Test #10: compile_frexp__gfx700 ............. Passed 0.13 sec
3/10 Test #8: compile_atan2pi__gfx803 ........... Passed 0.14 sec
4/10 Test #6: compile_asin__gfx803 .............. Passed 0.15 sec
5/10 Test #7: compile_atan2__gfx803 ............. Passed 0.15 sec
6/10 Test #9: compile_frexp__gfx600 ............. Passed 0.15 sec
7/10 Test #1: constant_fold_lgamma_r__gfx900 .... Passed 0.17 sec
8/10 Test #2: constant_fold_lgamma_r__gfx1030 ... Passed 0.17 sec
9/10 Test #4: compile_atan2__gfx700 .............***Failed 0.27 sec
CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/RunCompileTest.cmake:24 (message):
Error compiling test: fatal error: error in backend: Cannot select:
0x5626e75ceee0: f16 = fma # D:1 0x5626e70ab360, 0x5626e75cee70,
0x5626e70ab360
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e75cee70: f16 = fmul # D:1 0x5626e75cebd0, 0x5626e75cee00
0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e75cee00: f16 = fma # D:1 0x5626e75cebd0, 0x5626e75ced20, ConstantFP:f16<APFloat(46397)>
0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e75ced20: f16 = fma # D:1 0x5626e75cebd0, ConstantFP:f16<APFloat(43490)>, ConstantFP:f16<APFloat(12563)>
0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75cecb0: f16 = ConstantFP<APFloat(43490)>
0x5626e75cec40: f16 = ConstantFP<APFloat(12563)>
0x5626e75ced90: f16 = ConstantFP<APFloat(46397)>
0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440
0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0>
0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90
0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80
0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e70ab280: i32 = TargetConstant<0>
0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0
0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0
0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930
0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1
0x5626e75ce8c0: f16 = Register %1
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10
0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850
0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0
0x5626e75ce7e0: f16 = Register %0
In function: __ocml_atan2_f16
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/
and include the crash backtrace, preprocessed source, and associated run
script.
Stack dump:
0. Program arguments: /usr/bin/clang-16 -O3 -S -cl-std=CL2.0 -target
amdgcn-amd-amdhsa -mcpu=gfx700 -Xclang -finclude-default-header
--rocm-path=/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu
-mllvm -amdgpu-simplify-libcall=0 -o output.atan2.gfx700.s
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2.cl
1. <eof> parser at end of file
2. Code generation
3. Running pass 'CallGraph Pass Manager' on module
'/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2.cl'.
4. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function
'@__ocml_atan2_f16'
#0 0x00007ff6399e3911 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/lib64/libLLVM-16.so.0+0xde3911)
#1 0x00007ff6399e161a llvm::sys::RunSignalHandlers() (/lib64/libLLVM-16.so.0+0xde161a)
#2 0x00007ff6398ff72a llvm::CrashRecoveryContext::HandleExit(int) (/lib64/libLLVM-16.so.0+0xcff72a)
#3 0x00007ff6399db6c4 llvm::sys::Process::Exit(int, bool) (/lib64/libLLVM-16.so.0+0xddb6c4)
#4 0x00005626e5890fa6 (/usr/bin/clang-16+0x11fa6)
#5 0x00007ff6399116c5 llvm::report_fatal_error(llvm::Twine const&, bool) (/lib64/libLLVM-16.so.0+0xd116c5)
#6 0x00007ff63a325a42 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/lib64/libLLVM-16.so.0+0x1725a42)
#7 0x00007ff63a32a893 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/lib64/libLLVM-16.so.0+0x172a893)
#8 0x00007ff63a322ddc llvm::SelectionDAGISel::DoInstructionSelection() (/lib64/libLLVM-16.so.0+0x1722ddc)
#9 0x00007ff63a32d179 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/lib64/libLLVM-16.so.0+0x172d179)
#10 0x00007ff63a330461
llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&)
(/lib64/libLLVM-16.so.0+0x1730461)
#11 0x00007ff63a3321bc (/lib64/libLLVM-16.so.0+0x17321bc)
#12 0x00007ff639df9897 (/lib64/libLLVM-16.so.0+0x11f9897)
#13 0x00007ff639b4466b llvm::FPPassManager::runOnFunction(llvm::Function&)
(/lib64/libLLVM-16.so.0+0xf4466b)
#14 0x00007ff63afffd73 (/lib64/libLLVM-16.so.0+0x23ffd73)
#15 0x00007ff639b44ecc llvm::legacy::PassManagerImpl::run(llvm::Module&)
(/lib64/libLLVM-16.so.0+0xf44ecc)
#16 0x00007ff641ac1ead clang::EmitBackendOutput(clang::DiagnosticsEngine&,
clang::HeaderSearchOptions const&, clang::CodeGenOptions const&,
clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef,
llvm::Module*, clang::BackendAction,
std::unique_ptr<llvm::raw_pwrite_stream,
std::default_delete<llvm::raw_pwrite_stream>>)
(/lib64/libclang-cpp.so.16.0+0x14c1ead)
#17 0x00007ff641ac47fb (/lib64/libclang-cpp.so.16.0+0x14c47fb)
#18 0x00007ff642d60ba2 clang::ParseAST(clang::Sema&, bool, bool)
(/lib64/libclang-cpp.so.16.0+0x2760ba2)
#19 0x00007ff6424acf79 clang::FrontendAction::Execute()
(/lib64/libclang-cpp.so.16.0+0x1eacf79)
#20 0x00007ff642d6acfb
clang::CompilerInstance::ExecuteAction(clang::FrontendAction&)
(/lib64/libclang-cpp.so.16.0+0x276acfb)
#21 0x00007ff642d6cbea
clang::ExecuteCompilerInvocation(clang::CompilerInstance*)
(/lib64/libclang-cpp.so.16.0+0x276cbea)
#22 0x00005626e5897e68 cc1_main(llvm::ArrayRef<char const*>, char const*,
void*) (/usr/bin/clang-16+0x18e68)
#23 0x00005626e589ddda (/usr/bin/clang-16+0x1edda)
#24 0x00007ff640de31f1 (/lib64/libclang-cpp.so.16.0+0x7e31f1)
#25 0x00007ff6398ff60a
llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>)
(/lib64/libLLVM-16.so.0+0xcff60a)
#26 0x00007ff640f7440f (/lib64/libclang-cpp.so.16.0+0x97440f)
#27 0x00007ff641a1b826
clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&,
clang::driver::Command const*&, bool) const
(/lib64/libclang-cpp.so.16.0+0x141b826)
#28 0x00007ff641a1bb6c
clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&,
llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&,
bool) const (/lib64/libclang-cpp.so.16.0+0x141bb6c)
#29 0x00007ff641a20aec
clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&,
llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&)
(/lib64/libclang-cpp.so.16.0+0x1420aec)
#30 0x00005626e58a0518 clang_main(int, char**) (/usr/bin/clang-16+0x21518)
#31 0x00007ff6384280ca __libc_start_call_main (/lib64/libc.so.6+0x280ca)
#32 0x00007ff63842818b __libc_start_main@GLIBC_2.2.5
(/lib64/libc.so.6+0x2818b)
#33 0x00005626e588f175 _start (/usr/bin/clang-16+0x10175)
clang-16: error: clang frontend command failed with exit code 70 (use -v to
see invocation)
clang version 16.0.6 (G2V 16.0.6-2.fc35)
Target: amdgcn-amd-amdhsa
Thread model: posix
InstalledDir: /usr/bin
clang-16: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-16: note: diagnostic msg: /tmp/atan2-84026d.cl
clang-16: note: diagnostic msg: /tmp/atan2-84026d.sh
clang-16: note: diagnostic msg:
********************
10/10 Test #5: compile_atan2pi__gfx700 ...........***Failed 0.27 sec
CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/RunCompileTest.cmake:24 (message):
Error compiling test: fatal error: error in backend: Cannot select:
0x5584c1b533e0: f16 = fma # D:1 0x5584c1b530d0, 0x5584c1b53300,
ConstantFP:f16<APFloat(13592)>
0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c1b53300: f16 = fma # D:1 0x5584c1b530d0, 0x5584c1b53220, ConstantFP:f16<APFloat(44722)>
0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c1b53220: f16 = fma # D:1 0x5584c1b530d0, ConstantFP:f16<APFloat(41917)>, ConstantFP:f16<APFloat(10897)>
0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010
0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0>
0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60
0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750
0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c162ee50: i32 = TargetConstant<0>
0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0
0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0
0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30
0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1
0x5584c1b52dc0: f16 = Register %1
0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10
0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50
0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0
0x5584c1b52ce0: f16 = Register %0
0x5584c1b531b0: f16 = ConstantFP<APFloat(41917)>
0x5584c1b53140: f16 = ConstantFP<APFloat(10897)>
0x5584c1b53290: f16 = ConstantFP<APFloat(44722)>
0x5584c1b53370: f16 = ConstantFP<APFloat(13592)>
In function: __ocml_atan2pi_f16
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/
and include the crash backtrace, preprocessed source, and associated run
script.
Stack dump:
0. Program arguments: /usr/bin/clang-16 -O3 -S -cl-std=CL2.0 -target amdgcn-amd-amdhsa -mcpu=gfx700 -Xclang -finclude-default-header --rocm-path=/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu -mllvm -amdgpu-simplify-libcall=0 -o output.atan2pi.gfx700.s /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2pi.cl
1. <eof> parser at end of file
2. Code generation
3. Running pass 'CallGraph Pass Manager' on module '/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2pi.cl'.
4. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@__ocml_atan2pi_f16'
#0 0x00007fb1937e3911 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/lib64/libLLVM-16.so.0+0xde3911)
#1 0x00007fb1937e161a llvm::sys::RunSignalHandlers() (/lib64/libLLVM-16.so.0+0xde161a)
#2 0x00007fb1936ff72a llvm::CrashRecoveryContext::HandleExit(int) (/lib64/libLLVM-16.so.0+0xcff72a)
#3 0x00007fb1937db6c4 llvm::sys::Process::Exit(int, bool) (/lib64/libLLVM-16.so.0+0xddb6c4)
#4 0x00005584bf51bfa6 (/usr/bin/clang-16+0x11fa6)
#5 0x00007fb1937116c5 llvm::report_fatal_error(llvm::Twine const&, bool) (/lib64/libLLVM-16.so.0+0xd116c5)
#6 0x00007fb194125a42 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/lib64/libLLVM-16.so.0+0x1725a42)
#7 0x00007fb19412a893 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/lib64/libLLVM-16.so.0+0x172a893)
#8 0x00007fb194122ddc llvm::SelectionDAGISel::DoInstructionSelection() (/lib64/libLLVM-16.so.0+0x1722ddc)
#9 0x00007fb19412d179 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/lib64/libLLVM-16.so.0+0x172d179)
#10 0x00007fb194130461 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/lib64/libLLVM-16.so.0+0x1730461)
#11 0x00007fb1941321bc (/lib64/libLLVM-16.so.0+0x17321bc)
#12 0x00007fb193bf9897 (/lib64/libLLVM-16.so.0+0x11f9897)
#13 0x00007fb19394466b llvm::FPPassManager::runOnFunction(llvm::Function&) (/lib64/libLLVM-16.so.0+0xf4466b)
#14 0x00007fb194dffd73 (/lib64/libLLVM-16.so.0+0x23ffd73)
#15 0x00007fb193944ecc llvm::legacy::PassManagerImpl::run(llvm::Module&) (/lib64/libLLVM-16.so.0+0xf44ecc)
#16 0x00007fb19b8c1ead clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/lib64/libclang-cpp.so.16.0+0x14c1ead)
#17 0x00007fb19b8c47fb (/lib64/libclang-cpp.so.16.0+0x14c47fb)
#18 0x00007fb19cb60ba2 clang::ParseAST(clang::Sema&, bool, bool) (/lib64/libclang-cpp.so.16.0+0x2760ba2)
#19 0x00007fb19c2acf79 clang::FrontendAction::Execute() (/lib64/libclang-cpp.so.16.0+0x1eacf79)
#20 0x00007fb19cb6acfb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/lib64/libclang-cpp.so.16.0+0x276acfb)
#21 0x00007fb19cb6cbea clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/lib64/libclang-cpp.so.16.0+0x276cbea)
#22 0x00005584bf522e68 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/bin/clang-16+0x18e68)
#23 0x00005584bf528dda (/usr/bin/clang-16+0x1edda)
#24 0x00007fb19abe31f1 (/lib64/libclang-cpp.so.16.0+0x7e31f1)
#25 0x00007fb1936ff60a llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/lib64/libLLVM-16.so.0+0xcff60a)
#26 0x00007fb19ad7440f (/lib64/libclang-cpp.so.16.0+0x97440f)
#27 0x00007fb19b81b826 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/lib64/libclang-cpp.so.16.0+0x141b826)
#28 0x00007fb19b81bb6c clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/lib64/libclang-cpp.so.16.0+0x141bb6c)
#29 0x00007fb19b820aec clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/lib64/libclang-cpp.so.16.0+0x1420aec)
#30 0x00005584bf52b518 clang_main(int, char**) (/usr/bin/clang-16+0x21518)
#31 0x00007fb1922280ca __libc_start_call_main (/lib64/libc.so.6+0x280ca)
#32 0x00007fb19222818b __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2818b)
#33 0x00005584bf51a175 _start (/usr/bin/clang-16+0x10175)
clang-16: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 16.0.6 (G2V 16.0.6-2.fc35)
Target: amdgcn-amd-amdhsa
Thread model: posix
InstalledDir: /usr/bin
clang-16: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-16: note: diagnostic msg: /tmp/atan2pi-4fa3b0.cl
clang-16: note: diagnostic msg: /tmp/atan2pi-4fa3b0.sh
clang-16: note: diagnostic msg:
********************
80% tests passed, 2 tests failed out of 10
Total Test time (real) = 0.28 sec
The following tests FAILED:
4 - compile_atan2__gfx700 (Failed)
5 - compile_atan2pi__gfx700 (Failed)
Errors while running CTest
I'm using LLVM 16.0.6.
I'm not sure what more I should provide as details about my build env.
gentle ping .. 🤔
Sorry I missed this earlier. We'll take a look.
This could happen with mismatched compiler and device libs versions. The 16-bit target feature was applied to all the half functions, which would break any target where half is not a legal type. The current device libs are free of the 16-bit features and should work on all subtargets, so this should work if you use the latest device libs.
OK, so what exactly does that mean? Did cmake mess something up? 🤔
This was fixed by 439adec (which depended on quite a lot of patches in the library and compiler to avoid dependence on the 16-bit feature)
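For anyone building from a git checkout rather than a release tarball, one way to see whether a tree already contains that fix is to ask git whether 439adec is an ancestor of HEAD. This is only a sketch: the `contains_commit` helper and the `DEVICE_LIBS_SRC` variable are names made up for this example, and the check cannot work on an unpacked tarball because it has no git history.

```shell
#!/bin/sh
# Succeed when commit $1 is reachable from HEAD in the repository at $2
# (default: current directory). Hypothetical helper, not part of the project.
contains_commit() {
  git -C "${2:-.}" merge-base --is-ancestor "$1" HEAD 2>/dev/null
}

# Assumption: DEVICE_LIBS_SRC points at a git checkout of the device libs.
DEVICE_LIBS_SRC="${DEVICE_LIBS_SRC:-.}"
if contains_commit 439adec "$DEVICE_LIBS_SRC"; then
  echo "checkout includes 439adec"
else
  echo "439adec not found in this checkout (or not a git checkout)"
fi
```

`git merge-base --is-ancestor` exits non-zero both when the commit is missing and when the directory is not a repository, so the script reports those two cases together.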
Do you have any plans to make a new release, as it looks like the last one has some issues?
BTW, is it possible to change the tagging convention from rocm-<version> to just <version>? 🤔
Tar ball automatically generated from git tag base directory is assembled from -`.
Eventually there will be a new release. You can safely ignore these tests; they're incomplete and currently not run as part of any CI. I've only recently been trying to get the infrastructure for this type of testing going. The point of this test was to show that the case you see does fail, so really it should have been XFAIL to begin with.
Not sure; I would hope this would be consistent across all ROCm projects.
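For packagers who want to act on the "safely ignore these tests" advice without dropping the whole suite, a minimal sketch is to exclude just the two failing gfx700 tests with ctest's `-E` (exclude regex) option. The regex and the `BUILD_DIR` default below are assumptions taken from the test names and build directory in the 5.6.0 log; adjust them for other versions.

```shell
#!/bin/sh
# Regex for the two tests reported as failing with 5.6.0 in this thread.
EXCLUDE_RE='^compile_atan2(pi)?__gfx700$'

# Sanity check: print which of the test names from the log the regex selects.
printf '%s\n' \
  compile_asin__gfx700 compile_atan2__gfx700 compile_atan2pi__gfx700 \
  compile_atan2__gfx803 compile_atan2pi__gfx803 compile_frexp__gfx700 \
  | grep -E "$EXCLUDE_RE"

# Assumption: BUILD_DIR is the CMake build tree used in the log above.
BUILD_DIR="${BUILD_DIR:-x86_64-redhat-linux-gnu}"
if command -v ctest >/dev/null 2>&1 && [ -d "$BUILD_DIR" ]; then
  # Run everything except the excluded tests.
  ctest --test-dir "$BUILD_DIR" --output-on-failure -E "$EXCLUDE_RE"
fi
```

Anchoring the pattern keeps the gfx803 variants of the same tests in the run, since those pass in the log.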
So this issue affects only the test suite and not the actual generated bytecode? 🤔
I understand.
It shows that half was broken on antique targets which don't have native half support. For OpenCL, clang wouldn't report the half extension as available, so ordinary use wouldn't run into this.
FWIW, every compiled device code uses at least some of the library byte code. The release can't happen without it working properly.
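To see what a given clang reports for the half extension on a particular subtarget, one option is to dump the predefined macros and look for `cl_khr_fp16`. This is a sketch under assumptions: `has_fp16_macro` and the `CLANG` variable are names invented for the example, and the result depends on the compiler version, so no expected output is given.

```shell
#!/bin/sh
# Read a macro dump (output of `clang -dM -E`) on stdin and succeed
# if cl_khr_fp16 is defined. Hypothetical helper for this example.
has_fp16_macro() {
  grep -Eq '^#define[[:space:]]+cl_khr_fp16([[:space:]]|$)'
}

# Assumption: CLANG names the compiler used to build the device libs.
CLANG="${CLANG:-clang}"
for cpu in gfx700 gfx803; do
  if command -v "$CLANG" >/dev/null 2>&1; then
    # Note: if preprocessing itself fails, this also prints "not defined".
    if "$CLANG" -x cl -cl-std=CL2.0 -target amdgcn-amd-amdhsa -mcpu="$cpu" \
         -dM -E - </dev/null 2>/dev/null | has_fp16_macro; then
      echo "$cpu: cl_khr_fp16 is defined"
    else
      echo "$cpu: cl_khr_fp16 is not defined"
    fi
  fi
done
```

Comparing the gfx700 and gfx803 lines gives a quick view of whether the compiler treats the two subtargets differently for half.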
Just tested 5.7.0, and it looks like the test suite is now failing in 3 units:
+ cd ROCm-Device-Libs-rocm-5.7.0
+ /usr/bin/ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure --force-new-ctest-process -j48 ' '
Internal ctest changing into directory: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/x86_64-redhat-linux-gnu
Test project /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/x86_64-redhat-linux-gnu
Start 1: constant_fold_lgamma_r__gfx900
Start 2: constant_fold_lgamma_r__gfx1030
Start 3: compile_asin__gfx700
Start 4: compile_atan2__gfx700
Start 5: compile_atan2pi__gfx700
Start 6: compile_asin__gfx803
Start 7: compile_atan2__gfx803
Start 8: compile_atan2pi__gfx803
Start 9: compile_frexp__gfx600
Start 10: compile_frexp__gfx700
1/10 Test #3: compile_asin__gfx700 .............. Passed 0.27 sec
2/10 Test #1: constant_fold_lgamma_r__gfx900 .... Passed 0.28 sec
3/10 Test #2: constant_fold_lgamma_r__gfx1030 ... Passed 0.28 sec
4/10 Test #4: compile_atan2__gfx700 .............***Failed 0.28 sec
CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2.cl:5:12:
error: GFX700: expected string not found in input
// GFX700: v_cvt_f32_f16
^
output.atan2.gfx700.s:7:16: note: scanning from here
test_atan2_f16: ; @test_atan2_f16
^
output.atan2.gfx700.s:15:2: note: possible intended match here
v_cvt_f16_f32_e32 v2, s0
^
Input file: output.atan2.gfx700.s
Check file:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx700"
3: .protected test_atan2_f16 ; -- Begin function test_atan2_f16
4: .globl test_atan2_f16
5: .p2align 8
6: .type test_atan2_f16,@function
7: test_atan2_f16: ; @test_atan2_f16
check:4 ^~~~~~~~~~~~~~~
check:5'0 X~~~~~~~~~~~~~~~~~~ error: no match found
8: ; %bb.0:
check:5'0 ~~~~~~~~~
9: s_load_dwordx2 s[0:1], s[4:5], 0x0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10: v_lshlrev_b32_e32 v0, 1, v0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11: s_add_i32 s6, s6, s9
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
12: s_mov_b32 flat_scratch_lo, s7
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13: s_lshr_b32 flat_scratch_hi, s6, 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
14: s_waitcnt lgkmcnt(0)
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
15: v_cvt_f16_f32_e32 v2, s0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
check:5'1 ? possible intended match
16: v_mov_b32_e32 v1, s1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
17: v_add_i32_e32 v0, vcc, s0, v0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18: v_addc_u32_e32 v1, vcc, 0, v1, vcc
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19: flat_store_short v[0:1], v2
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20: s_endpgm
check:5'0 ~~~~~~~~~~
21: .section .rodata,#alloc
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
22: .p2align 6, 0x0
check:5'0 ~~~~~~~~~~~~~~~~~
23: .amdhsa_kernel test_atan2_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24: .amdhsa_group_segment_fixed_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: .amdhsa_private_segment_fixed_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26: .amdhsa_kernarg_size 24
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
27: .amdhsa_user_sgpr_count 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
28: .amdhsa_user_sgpr_private_segment_buffer 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: .amdhsa_user_sgpr_dispatch_ptr 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: .amdhsa_user_sgpr_queue_ptr 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31: .amdhsa_user_sgpr_kernarg_segment_ptr 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: .amdhsa_user_sgpr_dispatch_id 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33: .amdhsa_user_sgpr_flat_scratch_init 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: .amdhsa_user_sgpr_private_segment_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
36: .amdhsa_system_sgpr_workgroup_id_x 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: .amdhsa_system_sgpr_workgroup_id_y 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: .amdhsa_system_sgpr_workgroup_id_z 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_system_sgpr_workgroup_info 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_system_vgpr_workitem_id 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_next_free_vgpr 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_next_free_sgpr 10
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_float_round_mode_32 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_float_round_mode_16_64 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_float_denorm_mode_32 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_float_denorm_mode_16_64 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_dx10_clamp 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_ieee_mode 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_exception_fp_ieee_invalid_op 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_exception_fp_denorm_src 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_exception_fp_ieee_div_zero 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_exception_fp_ieee_overflow 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_exception_fp_ieee_underflow 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_exception_fp_ieee_inexact 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_exception_int_div_zero 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .end_amdhsa_kernel
check:5'0 ~~~~~~~~~~~~~~~~~~~~
57: .text
check:5'0 ~~~~~~~
58: .Lfunc_end0:
check:5'0 ~~~~~~~~~~~~~
59: .size test_atan2_f16, .Lfunc_end0-test_atan2_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: ; -- End function
check:5'0 ~~~~~~~~~~~~~~~~~~~
61: .section .AMDGPU.csdata
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
62: ; Kernel info:
check:5'0 ~~~~~~~~~~~~~~~
63: ; codeLenInByte = 52
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
64: ; NumSgprs: 14
check:5'0 ~~~~~~~~~~~~~~~
65: ; NumVgprs: 3
check:5'0 ~~~~~~~~~~~~~~
66: ; ScratchSize: 0
check:5'0 ~~~~~~~~~~~~~~~~~
67: ; MemoryBound: 0
check:5'0 ~~~~~~~~~~~~~~~~~
68: ; FloatMode: 192
check:5'0 ~~~~~~~~~~~~~~~~~
69: ; IeeeMode: 1
check:5'0 ~~~~~~~~~~~~~~
70: ; LDSByteSize: 0 bytes/workgroup (compile time only)
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71: ; SGPRBlocks: 1
check:5'0 ~~~~~~~~~~~~~~~~
72: ; VGPRBlocks: 0
check:5'0 ~~~~~~~~~~~~~~~~
73: ; NumSGPRsForWavesPerEU: 14
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
74: ; NumVGPRsForWavesPerEU: 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
75: ; Occupancy: 10
check:5'0 ~~~~~~~~~~~~~~~~
76: ; WaveLimiterHint : 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
77: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
78: ; COMPUTE_PGM_RSRC2:USER_SGPR: 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
79: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
81: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
82: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
84: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)"
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
85: .section ".note.GNU-stack"
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
86: .addrsig
check:5'0 ~~~~~~~~~~
87: .amdgpu_metadata
check:5'0 ~~~~~~~~~~~~~~~~~~
88: ---
check:5'0 ~~~~
89: amdhsa.kernels:
check:5'0 ~~~~~~~~~~~~~~~~
90: - .args:
check:5'0 ~~~~~~~~~~
91: - .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
92: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
93: .offset: 0
check:5'0 ~~~~~~~~~~~~
94: .size: 8
check:5'0 ~~~~~~~~~~
95: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
96: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
97: - .access: read_only
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
98: .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~
99: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
100: .offset: 8
check:5'0 ~~~~~~~~~~~~
101: .size: 8
check:5'0 ~~~~~~~~~~
102: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
103: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: - .access: read_only
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
105: .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~
106: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
107: .offset: 16
check:5'0 ~~~~~~~~~~~~~
108: .size: 8
check:5'0 ~~~~~~~~~~
109: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
110: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
111: .group_segment_fixed_size: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
112: .kernarg_segment_align: 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
113: .kernarg_segment_size: 24
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
114: .language: OpenCL C
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
115: .language_version:
check:5'0 ~~~~~~~~~~~~~~~~~~~~
116: - 2
check:5'0 ~~~~~
117: - 0
check:5'0 ~~~~~
118: .max_flat_workgroup_size: 256
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119: .name: test_atan2_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~
120: .private_segment_fixed_size: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
121: .sgpr_count: 14
check:5'0 ~~~~~~~~~~~~~~~~~
122: .sgpr_spill_count: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
123: .symbol: test_atan2_f16.kd
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124: .vgpr_count: 3
check:5'0 ~~~~~~~~~~~~~~~~
125: .vgpr_spill_count: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
126: .wavefront_size: 64
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
127: amdhsa.target: amdgcn-amd-amdhsa--gfx700
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128: amdhsa.version:
check:5'0 ~~~~~~~~~~~~~~~~
129: - 1
check:5'0 ~~~~~
130: - 1
check:5'0 ~~~~~
131: ...
check:5'0 ~~~~
132:
check:5'0 ~
133: .end_amdgpu_metadata
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
5/10 Test #10: compile_frexp__gfx700 ............. Passed 0.27 sec
6/10 Test #6: compile_asin__gfx803 .............. Passed 0.29 sec
7/10 Test #7: compile_atan2__gfx803 ............. Passed 0.30 sec
8/10 Test #8: compile_atan2pi__gfx803 ........... Passed 0.31 sec
9/10 Test #5: compile_atan2pi__gfx700 ...........***Failed 0.32 sec
CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2pi.cl:5:12:
error: GFX700: expected string not found in input
// GFX700: v_cvt_f32_f16
^
output.atan2pi.gfx700.s:7:18: note: scanning from here
test_atan2pi_f16: ; @test_atan2pi_f16
^
output.atan2pi.gfx700.s:15:2: note: possible intended match here
v_cvt_f16_f32_e32 v2, s0
^
Input file: output.atan2pi.gfx700.s
Check file:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2pi.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx700"
3: .protected test_atan2pi_f16 ; -- Begin function test_atan2pi_f16
4: .globl test_atan2pi_f16
5: .p2align 8
6: .type test_atan2pi_f16,@function
7: test_atan2pi_f16: ; @test_atan2pi_f16
check:4 ^~~~~~~~~~~~~~~~~
check:5'0 X~~~~~~~~~~~~~~~~~~~~ error: no match found
8: ; %bb.0:
check:5'0 ~~~~~~~~~
9: s_load_dwordx2 s[0:1], s[4:5], 0x0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10: v_lshlrev_b32_e32 v0, 1, v0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11: s_add_i32 s6, s6, s9
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
12: s_mov_b32 flat_scratch_lo, s7
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13: s_lshr_b32 flat_scratch_hi, s6, 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
14: s_waitcnt lgkmcnt(0)
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
15: v_cvt_f16_f32_e32 v2, s0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
check:5'1 ? possible intended match
16: v_mov_b32_e32 v1, s1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
17: v_add_i32_e32 v0, vcc, s0, v0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18: v_addc_u32_e32 v1, vcc, 0, v1, vcc
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19: flat_store_short v[0:1], v2
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20: s_endpgm
check:5'0 ~~~~~~~~~~
21: .section .rodata,#alloc
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
22: .p2align 6, 0x0
check:5'0 ~~~~~~~~~~~~~~~~~
23: .amdhsa_kernel test_atan2pi_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24: .amdhsa_group_segment_fixed_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: .amdhsa_private_segment_fixed_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26: .amdhsa_kernarg_size 24
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
27: .amdhsa_user_sgpr_count 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
28: .amdhsa_user_sgpr_private_segment_buffer 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: .amdhsa_user_sgpr_dispatch_ptr 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: .amdhsa_user_sgpr_queue_ptr 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31: .amdhsa_user_sgpr_kernarg_segment_ptr 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: .amdhsa_user_sgpr_dispatch_id 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33: .amdhsa_user_sgpr_flat_scratch_init 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: .amdhsa_user_sgpr_private_segment_size 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
36: .amdhsa_system_sgpr_workgroup_id_x 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: .amdhsa_system_sgpr_workgroup_id_y 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: .amdhsa_system_sgpr_workgroup_id_z 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_system_sgpr_workgroup_info 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_system_vgpr_workitem_id 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_next_free_vgpr 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_next_free_sgpr 10
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_float_round_mode_32 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_float_round_mode_16_64 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_float_denorm_mode_32 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_float_denorm_mode_16_64 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_dx10_clamp 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_ieee_mode 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_exception_fp_ieee_invalid_op 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_exception_fp_denorm_src 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_exception_fp_ieee_div_zero 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_exception_fp_ieee_overflow 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_exception_fp_ieee_underflow 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_exception_fp_ieee_inexact 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_exception_int_div_zero 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .end_amdhsa_kernel
check:5'0 ~~~~~~~~~~~~~~~~~~~~
57: .text
check:5'0 ~~~~~~~
58: .Lfunc_end0:
check:5'0 ~~~~~~~~~~~~~
59: .size test_atan2pi_f16, .Lfunc_end0-test_atan2pi_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: ; -- End function
check:5'0 ~~~~~~~~~~~~~~~~~~~
61: .section .AMDGPU.csdata
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
62: ; Kernel info:
check:5'0 ~~~~~~~~~~~~~~~
63: ; codeLenInByte = 52
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
64: ; NumSgprs: 14
check:5'0 ~~~~~~~~~~~~~~~
65: ; NumVgprs: 3
check:5'0 ~~~~~~~~~~~~~~
66: ; ScratchSize: 0
check:5'0 ~~~~~~~~~~~~~~~~~
67: ; MemoryBound: 0
check:5'0 ~~~~~~~~~~~~~~~~~
68: ; FloatMode: 192
check:5'0 ~~~~~~~~~~~~~~~~~
69: ; IeeeMode: 1
check:5'0 ~~~~~~~~~~~~~~
70: ; LDSByteSize: 0 bytes/workgroup (compile time only)
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71: ; SGPRBlocks: 1
check:5'0 ~~~~~~~~~~~~~~~~
72: ; VGPRBlocks: 0
check:5'0 ~~~~~~~~~~~~~~~~
73: ; NumSGPRsForWavesPerEU: 14
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
74: ; NumVGPRsForWavesPerEU: 3
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
75: ; Occupancy: 10
check:5'0 ~~~~~~~~~~~~~~~~
76: ; WaveLimiterHint : 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
77: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
78: ; COMPUTE_PGM_RSRC2:USER_SGPR: 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
79: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
81: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
82: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
84: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)"
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
85: .section ".note.GNU-stack"
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
86: .addrsig
check:5'0 ~~~~~~~~~~
87: .amdgpu_metadata
check:5'0 ~~~~~~~~~~~~~~~~~~
88: ---
check:5'0 ~~~~
89: amdhsa.kernels:
check:5'0 ~~~~~~~~~~~~~~~~
90: - .args:
check:5'0 ~~~~~~~~~~
91: - .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
92: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
93: .offset: 0
check:5'0 ~~~~~~~~~~~~
94: .size: 8
check:5'0 ~~~~~~~~~~
95: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
96: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
97: - .access: read_only
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
98: .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~
99: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
100: .offset: 8
check:5'0 ~~~~~~~~~~~~
101: .size: 8
check:5'0 ~~~~~~~~~~
102: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
103: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: - .access: read_only
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
105: .address_space: global
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~
106: .is_restrict: true
check:5'0 ~~~~~~~~~~~~~~~~~~~~
107: .offset: 16
check:5'0 ~~~~~~~~~~~~~
108: .size: 8
check:5'0 ~~~~~~~~~~
109: .type_name: 'half*'
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
110: .value_kind: global_buffer
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
111: .group_segment_fixed_size: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
112: .kernarg_segment_align: 8
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
113: .kernarg_segment_size: 24
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
114: .language: OpenCL C
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
115: .language_version:
check:5'0 ~~~~~~~~~~~~~~~~~~~~
116: - 2
check:5'0 ~~~~~
117: - 0
check:5'0 ~~~~~
118: .max_flat_workgroup_size: 256
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119: .name: test_atan2pi_f16
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
120: .private_segment_fixed_size: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
121: .sgpr_count: 14
check:5'0 ~~~~~~~~~~~~~~~~~
122: .sgpr_spill_count: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
123: .symbol: test_atan2pi_f16.kd
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124: .vgpr_count: 3
check:5'0 ~~~~~~~~~~~~~~~~
125: .vgpr_spill_count: 0
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
126: .wavefront_size: 64
check:5'0 ~~~~~~~~~~~~~~~~~~~~~
127: amdhsa.target: amdgcn-amd-amdhsa--gfx700
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128: amdhsa.version:
check:5'0 ~~~~~~~~~~~~~~~~
129: - 1
check:5'0 ~~~~~
130: - 1
check:5'0 ~~~~~
131: ...
check:5'0 ~~~~
132:
check:5'0 ~
133: .end_amdgpu_metadata
check:5'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
10/10 Test #9: compile_frexp__gfx600 .............***Failed 0.33 sec
CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl:8:16:
error: GFX600-DAG: expected string not found in input
// GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8
^
output.frexp.gfx600.s:7:16: note: scanning from here
test_frexp_f32: ; @test_frexp_f32
^
output.frexp.gfx600.s:17:2: note: possible intended match here
v_mov_b32_e32 v3, 0x1f8
^
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl:34:16:
error: GFX600-DAG: expected string not found in input
// GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8
^
output.frexp.gfx600.s:98:16: note: scanning from here
test_frexp_f64: ; @test_frexp_f64
^
output.frexp.gfx600.s:108:2: note: possible intended match here
v_mov_b32_e32 v7, 0x1f8
^
Input file: output.frexp.gfx600.s
Check file:
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"
3: .protected test_frexp_f32 ; -- Begin function test_frexp_f32
4: .globl test_frexp_f32
5: .p2align 8
6: .type test_frexp_f32,@function
7: test_frexp_f32: ; @test_frexp_f32
label:7'0 ^~~~~~~~~~~~~~~
label:7'1 ^~~~~~~~~~~~~~~
dag:8'0 X~~~~~~~~~~~~~~~~~~ error: no match found
8: ; %bb.0:
dag:8'0 ~~~~~~~~~
9: s_load_dwordx2 s[0:1], s[4:5], 0x4
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10: s_mov_b32 s3, 0x100f000
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
11: s_mov_b32 s2, 0
dag:8'0 ~~~~~~~~~~~~~~~~~
12: v_lshlrev_b32_e32 v0, 2, v0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13: v_mov_b32_e32 v1, 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~
14: s_waitcnt lgkmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
15: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
16: s_load_dwordx4 s[4:7], s[4:5], 0x0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17: v_mov_b32_e32 v3, 0x1f8
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
dag:8'1 ? possible intended match
18: s_waitcnt lgkmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
19: s_mov_b64 s[0:1], s[4:5]
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
20: s_waitcnt vmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~
21: v_frexp_mant_f32_e32 v5, v2
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
22: v_cmp_class_f32_e32 vcc, v2, v3
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
23: v_frexp_exp_i32_f32_e32 v4, v2
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24: v_cndmask_b32_e32 v2, v2, v5, vcc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: v_cndmask_b32_e32 v3, 0, v4, vcc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27: s_mov_b64 s[0:1], s[6:7]
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
28: buffer_store_dword v3, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: s_endpgm
dag:8'0 ~~~~~~~~~~
30: .section .rodata,#alloc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
31: .p2align 6, 0x0
dag:8'0 ~~~~~~~~~~~~~~~~~
32: .amdhsa_kernel test_frexp_f32
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33: .amdhsa_group_segment_fixed_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: .amdhsa_private_segment_fixed_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35: .amdhsa_kernarg_size 24
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
36: .amdhsa_user_sgpr_count 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: .amdhsa_user_sgpr_private_segment_buffer 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: .amdhsa_user_sgpr_dispatch_ptr 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_user_sgpr_queue_ptr 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_user_sgpr_dispatch_id 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_user_sgpr_flat_scratch_init 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_user_sgpr_private_segment_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_system_sgpr_workgroup_id_x 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_system_sgpr_workgroup_id_y 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_system_sgpr_workgroup_id_z 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_system_sgpr_workgroup_info 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_system_vgpr_workitem_id 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_next_free_vgpr 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_next_free_sgpr 8
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_float_round_mode_32 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_float_round_mode_16_64 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_float_denorm_mode_32 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_float_denorm_mode_16_64 3
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .amdhsa_dx10_clamp 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
57: .amdhsa_ieee_mode 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~
58: .amdhsa_exception_fp_ieee_invalid_op 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59: .amdhsa_exception_fp_denorm_src 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: .amdhsa_exception_fp_ieee_div_zero 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
61: .amdhsa_exception_fp_ieee_overflow 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
62: .amdhsa_exception_fp_ieee_underflow 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
63: .amdhsa_exception_fp_ieee_inexact 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
64: .amdhsa_exception_int_div_zero 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65: .end_amdhsa_kernel
dag:8'0 ~~~~~~~~~~~~~~~~~~~~
66: .text
dag:8'0 ~~~~~~~
67: .Lfunc_end0:
dag:8'0 ~~~~~~~~~~~~~
68: .size test_frexp_f32, .Lfunc_end0-test_frexp_f32
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
69: ; -- End function
dag:8'0 ~~~~~~~~~~~~~~~~~~~
70: .section .AMDGPU.csdata
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
71: ; Kernel info:
dag:8'0 ~~~~~~~~~~~~~~~
72: ; codeLenInByte = 104
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
73: ; NumSgprs: 10
dag:8'0 ~~~~~~~~~~~~~~~
74: ; NumVgprs: 6
dag:8'0 ~~~~~~~~~~~~~~
75: ; ScratchSize: 0
dag:8'0 ~~~~~~~~~~~~~~~~~
76: ; MemoryBound: 0
dag:8'0 ~~~~~~~~~~~~~~~~~
77: ; FloatMode: 192
dag:8'0 ~~~~~~~~~~~~~~~~~
78: ; IeeeMode: 1
dag:8'0 ~~~~~~~~~~~~~~
79: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80: ; SGPRBlocks: 1
dag:8'0 ~~~~~~~~~~~~~~~~
81: ; VGPRBlocks: 1
dag:8'0 ~~~~~~~~~~~~~~~~
82: ; NumSGPRsForWavesPerEU: 10
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83: ; NumVGPRsForWavesPerEU: 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
84: ; Occupancy: 10
dag:8'0 ~~~~~~~~~~~~~~~~
85: ; WaveLimiterHint : 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
86: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
87: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
88: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
89: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
90: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
92: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
93: .text
dag:8'0 ~~~~~~~
94: .protected test_frexp_f64 ; -- Begin function test_frexp_f64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95: .globl test_frexp_f64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~
96: .p2align 8
dag:8'0 ~~~~~~~~~~~~
97: .type test_frexp_f64,@function
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
98: test_frexp_f64: ; @test_frexp_f64
label:32 ^~~~~~~~~~~~~~~
dag:8'0 ~~~~~~~~~~~~~~~
dag:34'0 X~~~~~~~~~~~~~~~~~~ error: no match found
99: ; %bb.0:
dag:34'0 ~~~~~~~~~
100: s_load_dwordx2 s[0:1], s[4:5], 0x4
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101: s_mov_b32 s3, 0x100f000
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
102: s_mov_b32 s2, 0
dag:34'0 ~~~~~~~~~~~~~~~~~
103: v_lshlrev_b32_e32 v1, 3, v0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: v_mov_b32_e32 v2, 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
105: s_waitcnt lgkmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
106: buffer_load_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107: s_load_dwordx4 s[4:7], s[4:5], 0x0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
108: v_mov_b32_e32 v7, 0x1f8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
dag:34'1 ? possible intended match
109: s_waitcnt lgkmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
110: s_mov_b64 s[0:1], s[4:5]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
111: s_waitcnt vmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
112: v_frexp_mant_f64_e32 v[5:6], v[3:4]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
113: v_cmp_class_f64_e32 vcc, v[3:4], v7
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
114: v_frexp_exp_i32_f64_e32 v8, v[3:4]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
115: v_cndmask_b32_e32 v4, v4, v6, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
116: v_cndmask_b32_e32 v3, v3, v5, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
117: v_cndmask_b32_e32 v7, 0, v8, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
118: buffer_store_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119: s_mov_b64 s[0:1], s[6:7]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
120: v_lshlrev_b32_e32 v1, 2, v0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
121: buffer_store_dword v7, v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122: s_endpgm
dag:34'0 ~~~~~~~~~~
123: .section .rodata,#alloc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
124: .p2align 6, 0x0
dag:34'0 ~~~~~~~~~~~~~~~~~
125: .amdhsa_kernel test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
126: .amdhsa_group_segment_fixed_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
127: .amdhsa_private_segment_fixed_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128: .amdhsa_kernarg_size 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
129: .amdhsa_user_sgpr_count 6
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
130: .amdhsa_user_sgpr_private_segment_buffer 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
131: .amdhsa_user_sgpr_dispatch_ptr 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132: .amdhsa_user_sgpr_queue_ptr 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
133: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
134: .amdhsa_user_sgpr_dispatch_id 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
135: .amdhsa_user_sgpr_flat_scratch_init 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
136: .amdhsa_user_sgpr_private_segment_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
137: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
138: .amdhsa_system_sgpr_workgroup_id_x 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
139: .amdhsa_system_sgpr_workgroup_id_y 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
140: .amdhsa_system_sgpr_workgroup_id_z 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
141: .amdhsa_system_sgpr_workgroup_info 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
142: .amdhsa_system_vgpr_workitem_id 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143: .amdhsa_next_free_vgpr 9
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
144: .amdhsa_next_free_sgpr 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
145: .amdhsa_float_round_mode_32 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
146: .amdhsa_float_round_mode_16_64 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147: .amdhsa_float_denorm_mode_32 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148: .amdhsa_float_denorm_mode_16_64 3
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149: .amdhsa_dx10_clamp 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
150: .amdhsa_ieee_mode 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
151: .amdhsa_exception_fp_ieee_invalid_op 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
152: .amdhsa_exception_fp_denorm_src 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
153: .amdhsa_exception_fp_ieee_div_zero 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
154: .amdhsa_exception_fp_ieee_overflow 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
155: .amdhsa_exception_fp_ieee_underflow 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
156: .amdhsa_exception_fp_ieee_inexact 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
157: .amdhsa_exception_int_div_zero 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
158: .end_amdhsa_kernel
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
159: .text
dag:34'0 ~~~~~~~
160: .Lfunc_end1:
dag:34'0 ~~~~~~~~~~~~~
161: .size test_frexp_f64, .Lfunc_end1-test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
162: ; -- End function
dag:34'0 ~~~~~~~~~~~~~~~~~~~
163: .section .AMDGPU.csdata
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
164: ; Kernel info:
dag:34'0 ~~~~~~~~~~~~~~~
165: ; codeLenInByte = 112
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
166: ; NumSgprs: 10
dag:34'0 ~~~~~~~~~~~~~~~
167: ; NumVgprs: 9
dag:34'0 ~~~~~~~~~~~~~~
168: ; ScratchSize: 0
dag:34'0 ~~~~~~~~~~~~~~~~~
169: ; MemoryBound: 0
dag:34'0 ~~~~~~~~~~~~~~~~~
170: ; FloatMode: 192
dag:34'0 ~~~~~~~~~~~~~~~~~
171: ; IeeeMode: 1
dag:34'0 ~~~~~~~~~~~~~~
172: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
173: ; SGPRBlocks: 1
dag:34'0 ~~~~~~~~~~~~~~~~
174: ; VGPRBlocks: 2
dag:34'0 ~~~~~~~~~~~~~~~~
175: ; NumSGPRsForWavesPerEU: 10
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
176: ; NumVGPRsForWavesPerEU: 9
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
177: ; Occupancy: 10
dag:34'0 ~~~~~~~~~~~~~~~~
178: ; WaveLimiterHint : 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
179: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
180: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
181: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
182: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
183: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
184: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
185: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
186: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)"
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
187: .section ".note.GNU-stack"
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188: .addrsig
dag:34'0 ~~~~~~~~~~
189: .amdgpu_metadata
dag:34'0 ~~~~~~~~~~~~~~~~~~
190: ---
dag:34'0 ~~~~
191: amdhsa.kernels:
dag:34'0 ~~~~~~~~~~~~~~~~
192: - .args:
dag:34'0 ~~~~~~~~~~
193: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
194: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
195: .offset: 0
dag:34'0 ~~~~~~~~~~~~
196: .size: 8
dag:34'0 ~~~~~~~~~~
197: .type_name: 'float*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
198: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
199: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
200: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
201: .offset: 8
dag:34'0 ~~~~~~~~~~~~
202: .size: 8
dag:34'0 ~~~~~~~~~~
203: .type_name: 'int*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
204: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
205: - .access: read_only
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
206: .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~
207: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
208: .offset: 16
dag:34'0 ~~~~~~~~~~~~~
209: .size: 8
dag:34'0 ~~~~~~~~~~
210: .type_name: 'float*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
211: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
212: .group_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
213: .kernarg_segment_align: 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
214: .kernarg_segment_size: 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
215: .language: OpenCL C
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
216: .language_version:
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
217: - 2
dag:34'0 ~~~~~
218: - 0
dag:34'0 ~~~~~
219: .max_flat_workgroup_size: 256
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
220: .name: test_frexp_f32
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
221: .private_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
222: .sgpr_count: 10
dag:34'0 ~~~~~~~~~~~~~~~~~
223: .sgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
224: .symbol: test_frexp_f32.kd
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
225: .vgpr_count: 6
dag:34'0 ~~~~~~~~~~~~~~~~
226: .vgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
227: .wavefront_size: 64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
228: - .args:
dag:34'0 ~~~~~~~~~~
229: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
230: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
231: .offset: 0
dag:34'0 ~~~~~~~~~~~~
232: .size: 8
dag:34'0 ~~~~~~~~~~
233: .type_name: 'double*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
234: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
235: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
236: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
237: .offset: 8
dag:34'0 ~~~~~~~~~~~~
238: .size: 8
dag:34'0 ~~~~~~~~~~
239: .type_name: 'int*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
240: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
241: - .access: read_only
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
242: .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~
243: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
244: .offset: 16
dag:34'0 ~~~~~~~~~~~~~
245: .size: 8
dag:34'0 ~~~~~~~~~~
246: .type_name: 'double*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
247: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
248: .group_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249: .kernarg_segment_align: 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
250: .kernarg_segment_size: 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
251: .language: OpenCL C
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
252: .language_version:
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
253: - 2
dag:34'0 ~~~~~
254: - 0
dag:34'0 ~~~~~
255: .max_flat_workgroup_size: 256
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
256: .name: test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
257: .private_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258: .sgpr_count: 10
dag:34'0 ~~~~~~~~~~~~~~~~~
259: .sgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
260: .symbol: test_frexp_f64.kd
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
261: .vgpr_count: 9
dag:34'0 ~~~~~~~~~~~~~~~~
262: .vgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
263: .wavefront_size: 64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
264: amdhsa.target: amdgcn-amd-amdhsa--gfx600
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
265: amdhsa.version:
dag:34'0 ~~~~~~~~~~~~~~~~
266: - 1
dag:34'0 ~~~~~
267: - 1
dag:34'0 ~~~~~
268: ...
dag:34'0 ~~~~
269:
dag:34'0 ~
270: .end_amdgpu_metadata
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
70% tests passed, 3 tests failed out of 10
Total Test time (real) = 0.35 sec
The following tests FAILED:
4 - compile_atan2__gfx700 (Failed)
5 - compile_atan2pi__gfx700 (Failed)
9 - compile_frexp__gfx600 (Failed)
Errors while running CTest
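For readers skimming the dump, the frexp failure can be illustrated with a rough sketch. It assumes the FileCheck pattern `s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8` can be approximated by a Python regex with a named group; this is only an approximation of FileCheck's matching, not how the tool is implemented. The helper `find_match` is hypothetical, and the sample lines are copied from the gfx600 output above.

```python
import re

# Approximate Python translation of the FileCheck pattern
#   s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8
# The [[NAME:regex]] variable becomes a named group; the rest is literal.
EXPECTED = re.compile(r"s_movk_i32 (?P<CLASS_MASK>s[0-9]+), 0x1f8")

# A few instructions copied from output.frexp.gfx600.s in the log.
emitted = [
    "s_load_dwordx4 s[4:7], s[4:5], 0x0",
    "v_mov_b32_e32 v3, 0x1f8",
    "v_cmp_class_f32_e32 vcc, v2, v3",
]


def find_match(lines):
    """Return the first line matching EXPECTED, or None if none match."""
    for line in lines:
        if EXPECTED.search(line):
            return line
    return None


# The compiler materialized 0x1f8 in a vector register (v_mov_b32_e32),
# so the scalar s_movk_i32 form the test expects is never found.
print(find_match(emitted))                   # None
print(find_match(["s_movk_i32 s2, 0x1f8"]))  # s_movk_i32 s2, 0x1f8
```

The constant 0x1f8 is still emitted, only through a different instruction and register class, which is why FileCheck reports `v_mov_b32_e32 v3, 0x1f8` as the "possible intended match".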
BTW: why are the .bc files installed in the $(prefix)/amdgcn/bitcode/
directory?
Are you sure that this is the correct location? Shouldn't it be somewhere under $(libdir)? 🤔
Just tested 5.7.1 and the test suite still fails in the same 3 units.
Any update about the install path of the .bc files? 🤔
The .bc files are installed into the clang resource dir (<LLVM_DIR>/lib/clang//lib/amdgcn/bitcode) in 6.0.
I'd suggest checking the device-libs tests in 6.0 when it becomes publicly available.
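As a hedged sketch of how one might check that location on a given system: clang can report its resource directory via `-print-resource-dir`, and the `lib/amdgcn/bitcode` suffix below is taken from the path quoted in the comment above. The helper names `clang_resource_dir` and `bitcode_dir` are hypothetical, and the layout is an assumption that may differ between releases.

```python
import shutil
import subprocess
from pathlib import Path


def bitcode_dir(resource_dir):
    """Join the assumed lib/amdgcn/bitcode suffix onto a clang resource dir."""
    return Path(resource_dir) / "lib" / "amdgcn" / "bitcode"


def clang_resource_dir(clang="clang"):
    """Ask clang for its resource directory; return None if clang is absent."""
    exe = shutil.which(clang)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-print-resource-dir"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


resource = clang_resource_dir()
if resource is None:
    print("clang not found on PATH")
else:
    target = bitcode_dir(resource)
    # List any bitcode files that are actually present at that location.
    found = sorted(p.name for p in target.glob("*.bc")) if target.is_dir() else []
    print(target, "->", found if found else "no .bc files here")
```

If the directory comes back empty, the packages on that system presumably still use the older $(prefix)/amdgcn/bitcode/ layout discussed in this thread.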
Just checked 6.0.0 and all .bc files are still installed in the same path.
Test suite fails in the same 3 units as well.
Just tested 5.7.1 and the test suite still fails in the same 3 units. Any update about the install path of the .bc files? 🤔
All tests should now pass as of ROCm/llvm-project@794ebef
BTW: why are the .bc files installed in the
$(prefix)/amdgcn/bitcode/
directory? Are you sure that this is the correct location? Shouldn't it be somewhere under $(libdir)? 🤔
This has been an endless source of debate; they should be moving to the clang resource directory.
Closing since the tests should now be passing, and this repository is closed and should be archived. The new location is under ROCm/llvm-project.