CNugteren / CLBlast

Tuned OpenCL BLAS

CL kernel preprocessing causes a compilation error

handsome-fu opened this issue

Test platform:
Odroid-N2 with ARM Mali-G52, kernel 6.1.22, aarch64 GNU/Linux
Error Log:
[image: screenshot of the compilation error]
Root cause:
CLBlast enables its CL kernel preprocessor when it detects an ARM GPU (https://github.com/CNugteren/CLBlast/blob/master/src/utilities/compile.cpp#L109), but this preprocessor removes some function definitions from the original source, so the GPU driver reports an implicit declaration error.
Please see my attached files: the definition of GlobalToLocalCheckedImage has been removed.
before_preprocess.txt
after_preprocess.txt
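
For anyone who wants to check their own dumps the same way, here is a rough sketch of a script that lists function definitions present before preprocessing but missing afterwards. It is not part of CLBlast. The helper names (`defined_functions`, `missing_definitions`) are made up for illustration, and the regex is a heuristic that assumes parameter lists contain no nested parentheses.

```python
import re

# Heuristic: an identifier, a parameter list without nested parentheses,
# braces or semicolons, then an opening brace. This is an assumption about
# how the kernel sources are written, not a real OpenCL C parser.
_FUNC_DEF = re.compile(r"\b([A-Za-z_]\w*)\s*\(([^(){};]*)\)\s*\{")

# Control-flow keywords that look like definitions to the regex above.
_KEYWORDS = {"if", "for", "while", "switch"}


def defined_functions(source):
    """Return the set of names that appear to be defined in `source`."""
    names = set()
    for match in _FUNC_DEF.finditer(source):
        name = match.group(1)
        if name not in _KEYWORDS:
            names.add(name)
    return names


def missing_definitions(before, after):
    """Names defined in `before` whose definition is absent from `after`."""
    return defined_functions(before) - defined_functions(after)
```

With the two attachments saved locally, `missing_definitions(open("before_preprocess.txt").read(), open("after_preprocess.txt").read())` should include `GlobalToLocalCheckedImage` if the definition was dropped as described.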

Thanks for reporting the issue. I'll see if I can improve the preprocessor. It will take a week or three to find some time for this.

Thanks again for reporting the issue and for supplying so much information to debug it. It turns out there is indeed a bug in the preprocessor: it can't handle the !defined(...) construct. I've fixed that now and extended the test to cover this case as well, see: #498. Perhaps you can already try that branch?
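
To illustrate the kind of construct involved, here is a toy sketch of conditional handling that treats `!defined(...)` as the negation of `defined(...)`. This is not CLBlast's preprocessor and not the code in #498. The function names and the `ROUTINE_EXAMPLE` macro are hypothetical, and only `#if`, `#else` and `#endif` are supported.

```python
import re

# Matches "defined(NAME)" with an optional leading "!".
_DEFINED = re.compile(r"^(!?)\s*defined\s*\(\s*(\w+)\s*\)$")


def evaluate_condition(expr, defines):
    """Evaluate 'defined(X)' or '!defined(X)' against a set of macro names."""
    match = _DEFINED.match(expr.strip())
    if match is None:
        raise ValueError("unsupported condition: %r" % expr)
    negated = match.group(1) == "!"
    name = match.group(2)
    # XOR with the negation flag: '!defined(X)' is true when X is absent.
    return (name in defines) != negated


def strip_conditionals(source, defines):
    """Keep only the lines whose enclosing #if/#else branches are all active."""
    stack = []  # one boolean per open #if: is its current branch active?
    kept = []
    for line in source.splitlines():
        directive, _, rest = line.strip().partition(" ")
        if directive == "#if":
            condition = rest.split("//")[0]  # drop a trailing line comment
            stack.append(evaluate_condition(condition, defines))
        elif directive == "#else":
            stack[-1] = not stack[-1]
        elif directive == "#endif":
            stack.pop()
        elif all(stack):
            kept.append(line)
    return "\n".join(kept)
```

A preprocessor that evaluates the `!defined(...)` guard incorrectly would drop the guarded definition while keeping the calls to it, which matches the implicit declaration error reported above.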

Note that if you compile the tests, you can also run the preprocessor tests separately like so (e.g. with make on Linux):

make -j4 clblast_test_preprocessor && ./clblast_test_preprocessor