flame / blis

BLAS-like Library Instantiation Software Framework

What is the best way to debug BLIS?

dmikushin opened this issue

Dear Developers!

I've been looking around in BLIS for a week and have come to the conclusion that the codebase is being developed by brilliant people. I have no other explanation for the fact that the BLIS code is absolutely impossible to debug. The heavy use of function-like macros does not allow gdb to step into almost any meaningful line of code. So I conclude that analyzing issues with a debugger is discouraged in BLIS. What alternative debugging methods would you recommend for regular engineers like me, who make many mistakes?

For example, I'm debugging BLIS 751d0a1 on Ubuntu 22.04:

Thread 1 "test_gemmd" received signal SIGBUS, Bus error.
0x00007ffff72b427c in bao_zpackm_cxk (conja=BLIS_NO_CONJUGATE, schema=BLIS_PACKED_COL_PANELS, panel_dim=4, panel_dim_max=4, panel_len=128, panel_len_max=128, kappa=0x7fffffffbb10, d=0x5555555904b0, incd=1, a=0x7ffff7f9a010, inca=1, lda=128, p=0x7fff7b5be000, ldp=4, cntx=0x555555590ce0) at .../blis/addon/gemmd/bao_packm_cxk.c:314
314							bli_zzzscal2s( *ali, *dl, *pli );
(gdb) disass
Dump of assembler code for function bao_zpackm_cxk:
...
   0x00007ffff72b426a <+1174>:	mov    0x40(%rbp),%rax
   0x00007ffff72b426e <+1178>:	add    %rdx,%rax
   0x00007ffff72b4271 <+1181>:	mov    %rax,-0x80(%rbp)
   0x00007ffff72b4275 <+1185>:	mov    -0x90(%rbp),%rax
=> 0x00007ffff72b427c <+1192>:	movsd  (%rax),%xmm1
   0x00007ffff72b4280 <+1196>:	mov    -0x88(%rbp),%rax
   0x00007ffff72b4287 <+1203>:	movsd  (%rax),%xmm0

This does not happen in Release mode, but valgrind spots an access "0 bytes after a block of size 131,072 alloc'd". Then again, vectorized out-of-range reads are sometimes valid. I don't know; as I said, without debugging I'm blind.
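A valgrind message of the form "0 bytes after a block" means the flagged access starts exactly at the end of the allocation. One debugger-free way to cross-check such a report is to compute how many elements a strided panel access touches and compare that with the allocation size. The following is only a minimal sketch of that arithmetic: `demo_extent_elems` and `demo_access_fits` are hypothetical helpers, not BLIS functions, and they assume the base pointer is the start of the allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of BLIS): number of elements, counted
   from the base pointer, touched by an m x n strided access in which
   rows step by rs elements and columns step by cs elements. */
static size_t demo_extent_elems(size_t m, size_t n, size_t rs, size_t cs)
{
    if (m == 0 || n == 0)
        return 0; /* an empty access touches nothing */
    return (m - 1) * rs + (n - 1) * cs + 1;
}

/* Hypothetical helper: returns 1 if the access stays inside an
   allocation of alloc_bytes, assuming the base pointer is the start
   of that allocation; returns 0 otherwise. */
static int demo_access_fits(size_t m, size_t n, size_t rs, size_t cs,
                            size_t elem_size, size_t alloc_bytes)
{
    assert(elem_size > 0);
    return demo_extent_elems(m, n, rs, cs) * elem_size <= alloc_bytes;
}
```

With the values from the backtrace above (panel_dim=4, panel_len=128, inca=1, lda=128), `demo_extent_elems(4, 128, 1, 128)` gives 16260 elements measured from `a`. Whether that explains the report depends on which buffer valgrind flagged and where `a` points within its allocation, neither of which is stated here.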

Am I running into something similar to #550?

I am not one of the developers, although I have worked with them for many years. They are brilliant, but there are techniques that can be shared.

May I suggest you join the BLIS Discord server (https://github.com/flame/blis/blob/master/docs/Discord.md) and pose the question there, perhaps on the "general" channel, since it could become a broader discussion?
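As one example of a general technique for macro-heavy C code (not something the BLIS developers are confirmed to recommend): a function generated by a multi-line macro is attributed to the single source line where the macro is invoked, which is why gdb cannot step through it line by line. Preprocessing the file, reformatting the output, and compiling that instead gives every generated statement its own line. The sketch below uses a hypothetical miniature kernel, `DEMO_GEN_SCAL2V`, whose names are illustrative and not taken from BLIS. The commands in the trailing comment assume gcc, gdb and clang-format are installed and that the file is saved as `demo.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro template (illustrative, not BLIS code): generates
   demo_<ch>scal2v, which computes y[i*incy] = alpha * x[i*incx]. */
#define DEMO_GEN_SCAL2V(ctype, ch)                               \
    void demo_##ch##scal2v(size_t n, ctype alpha,                \
                           const ctype *x, size_t incx,          \
                           ctype *y, size_t incy)                \
    {                                                            \
        assert(x != NULL && y != NULL);                          \
        for (size_t i = 0; i < n; ++i)                           \
        {                                                        \
            y[i * incy] = alpha * x[i * incx];                   \
        }                                                        \
    }

/* Each invocation expands to a whole function on ONE source line,
   so a debugger sees the entire body as a single step. */
DEMO_GEN_SCAL2V(float, s)
DEMO_GEN_SCAL2V(double, d)

/*
 * Option 1: debug the expanded source instead of the macro.
 *   gcc -E -P demo.c | clang-format > demo_expanded.c
 *   gcc -g -O0 -c demo_expanded.c
 * -E stops after preprocessing, -P drops the line markers, and
 * clang-format splits the one-line expansion into separate lines.
 *
 * Option 2: keep the original source but record macros in debug info.
 *   gcc -g3 -O0 -c demo.c
 * Then, inside gdb:
 *   info macro DEMO_GEN_SCAL2V
 *   macro expand DEMO_GEN_SCAL2V(double, d)
 */
```

Option 1 changes which file the debug info points at, so breakpoints must be set in `demo_expanded.c`. Option 2 leaves stepping unchanged but lets you read the expansion from within gdb.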