Native execution fails when only real-domain haswell kernels are registered
fgvanzee opened this issue · comments
Registering haswell kernels in the haswell subconfig as follows results in failures for native execution of level-3 operations on complex datatypes:
bli_cntx_set_l3_nat_ukrs
(
  2,
  // gemm
  BLIS_GEMM_UKR, BLIS_FLOAT,  bli_sgemm_haswell_asm_6x16, TRUE,
  BLIS_GEMM_UKR, BLIS_DOUBLE, bli_dgemm_haswell_asm_6x8,  TRUE,
  cntx
);
Notice that these are the default row-preferential sgemm and dgemm ukernels. This seems to be the only change needed to trigger the failures of native execution.
Abbreviated testsuite output:
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_sgemm_nn_rrr 400 400 400 71.24 5.34e-09 PASS
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_dgemm_nn_rrr 400 400 400 44.40 1.32e-17 PASS
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_cgemm1m_nn_rrr 400 400 400 77.26 5.88e-09 PASS
blis_cgemm_nn_rrr 400 400 400 10.23 1.24e-02 FAILURE
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_zgemm1m_nn_rrr 400 400 400 40.00 2.92e-17 PASS
blis_zgemm_nn_rrr 400 400 400 7.58 1.36e-02 FAILURE
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_sgemm_nn_ccc 400 400 400 98.07 1.39e-08 PASS
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_dgemm_nn_ccc 400 400 400 41.89 2.76e-17 PASS
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_cgemm1m_nn_ccc 400 400 400 87.96 4.80e-09 PASS
blis_cgemm_nn_ccc 400 400 400 10.09 1.29e-02 FAILURE
% blis_<dt><op>_<params>_<stor> m n k gflops resid result
blis_zgemm1m_nn_ccc 400 400 400 44.10 8.72e-18 PASS
blis_zgemm_nn_ccc 400 400 400 7.48 1.27e-02 FAILURE
I stumbled upon this issue when preparing to investigate #557.
Never mind, false alarm. I was mixing the assembly-based register blocksizes with the reference kernels, which (I had forgotten) have hard-coded register blocksizes.
Have I ever mentioned that I'm not a fan of the reference kernels having hard-coded register blocksizes? 😐
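For anyone landing here later, the mix-up can be illustrated with a toy sketch. This is not BLIS code: `toy_ukr`, `max_error`, `elem_a`, `elem_b`, and the 6x8 packing blocksizes are all hypothetical names and values chosen for illustration. The assumption is simply that the framework packs micro-panels using one pair of register blocksizes while the microkernel indexes them using another, so the kernel reads the wrong elements and the residual blows up.

```c
#include <assert.h>

/* Hypothetical blocksizes the toy "framework" packs with. */
enum { K = 4, PACK_MR = 6, PACK_NR = 8 };

/* Toy microkernel: C (mr x nr, row-major) += A_packed * B_packed.
   It assumes a holds k slices of mr contiguous elements and
   b holds k slices of nr contiguous elements. */
static void toy_ukr(int mr, int nr, int k,
                    const double *a, const double *b, double *c)
{
    assert(mr > 0 && nr > 0 && k > 0);
    for (int p = 0; p < k; ++p)
        for (int i = 0; i < mr; ++i)
            for (int j = 0; j < nr; ++j)
                c[i * nr + j] += a[p * mr + i] * b[p * nr + j];
}

/* Small integer-valued test matrices, so all arithmetic is exact. */
static double elem_a(int i, int p) { return 1.0 + i + 10.0 * p; }
static double elem_b(int p, int j) { return 1.0 + j + 100.0 * p; }

/* Pack with PACK_MR x PACK_NR, run the kernel with its own (possibly
   different) blocksizes, and return the largest absolute error over the
   tile the kernel wrote. Returns -1.0 if the kernel blocksizes are
   outside 1..PACK_MR / 1..PACK_NR (which would read past the buffers). */
double max_error(int kern_mr, int kern_nr)
{
    double a[K * PACK_MR], b[K * PACK_NR];
    double c[PACK_MR * PACK_NR] = { 0.0 };
    double worst = 0.0;

    if (kern_mr < 1 || kern_nr < 1 || kern_mr > PACK_MR || kern_nr > PACK_NR)
        return -1.0;

    /* Pack the micro-panels using the framework's blocksizes. */
    for (int p = 0; p < K; ++p) {
        for (int i = 0; i < PACK_MR; ++i) a[p * PACK_MR + i] = elem_a(i, p);
        for (int j = 0; j < PACK_NR; ++j) b[p * PACK_NR + j] = elem_b(p, j);
    }

    /* The kernel indexes the panels using its own blocksizes. */
    toy_ukr(kern_mr, kern_nr, K, a, b, c);

    /* Compare against a direct dot product for each output element. */
    for (int i = 0; i < kern_mr; ++i) {
        for (int j = 0; j < kern_nr; ++j) {
            double expect = 0.0;
            for (int p = 0; p < K; ++p) expect += elem_a(i, p) * elem_b(p, j);
            double diff = c[i * kern_nr + j] - expect;
            if (diff < 0.0) diff = -diff;
            if (diff > worst) worst = diff;
        }
    }
    return worst;
}
```

Calling `max_error(6, 8)` (kernel blocksizes match the packing) gives zero error, while `max_error(4, 4)` (kernel assumes smaller, hard-coded blocksizes) gives a nonzero error, which is the same flavor of failure as the cgemm and zgemm results above.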