CNugteren / CLBlast

Tuned OpenCL BLAS

Result of AMAX in different format to CBLAS

BadgerKing7 opened this issue

The CLBlast AMAX routine finds the correct element, but it returns the index as <resulted_index> * incX, whereas other CBLAS implementations return <resulted_index> only.

In the following examples, CLBlast has been tested against Intel's Math Kernel Library:
For float x = {1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6} with incX = 2,
CLBlast result = 26
MKL result = 13

For float-complex x = {(2.3,4.5), (2.4,4.6), (2.5,4.7), (2.6,4.8), (2.7,4.9), (2.8,5), (2.9,5.1), (3,5.2), (3.1,5.3), (3.2,5.4), (3.3,5.5), (3.4,5.6), (3.5,5.7)} with incX = 2,
CLBlast result = 12
MKL result = 6
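
To make the two conventions concrete, here is a minimal reference-style sketch in plain Python. It is not CLBlast or MKL code. The function name `iamax` is made up for illustration, and the element counts (n = 14 for the float vector, n = 7 for the complex one) are assumptions, since the report does not state n; they are simply the largest counts that fit the vectors above with incX = 2. For complex input the sketch uses |re| + |im| as the magnitude, as the reference BLAS does.

```python
def iamax(n, x, incx):
    """Zero-based logical index of the first element with the largest
    magnitude among x[0], x[incx], ..., x[(n - 1) * incx]."""
    def magnitude(v):
        # Reference BLAS uses |re| + |im| for complex input, |v| otherwise.
        if isinstance(v, complex):
            return abs(v.real) + abs(v.imag)
        return abs(v)

    best_index, best_value = 0, -1.0
    for i in range(n):
        m = magnitude(x[i * incx])
        if m > best_value:  # strict '>' keeps the first occurrence on ties
            best_index, best_value = i, m
    return best_index


# The float vector from the report: 1.0, 1.1, ..., 3.6 (27 values).
x_real = [round(1.0 + 0.1 * k, 1) for k in range(27)]
# The float-complex vector from the report: (2.3,4.5), ..., (3.5,5.7) (13 values).
x_complex = [complex(2.3 + 0.1 * k, 4.5 + 0.1 * k) for k in range(13)]

inc_x = 2
logical_real = iamax(14, x_real, inc_x)        # assumed n = 14
logical_complex = iamax(7, x_complex, inc_x)   # assumed n = 7

print(logical_real, logical_real * inc_x)        # 13 26
print(logical_complex, logical_complex * inc_x)  # 6 12
```

Under these assumptions, the logical index matches the MKL results (13 and 6), and multiplying it by incX reproduces the CLBlast results (26 and 12), i.e. the offset of the element in memory rather than its position among the strided elements.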

Thank you for reporting this issue.

Can you also reproduce this when building and running the clblast_test_xamax program? If I compare it against both AMD's clBLAS and OpenBLAS (a CBLAS implementation) on my system, I see no errors. The same holds for the tests that run on CI. So either the test coverage is not 100%, or the 'issue' is with the MKL library and not with other CBLAS libraries. Could you run this program yourself against both MKL and another CBLAS and see if the tests fail on your system?

The tests included in the clblast_test_xamax executable are not failing on my system, but after some digging, I now believe the tests are not checking the correct things for this operation.

In my understanding, the final result of the AMAX routine should always be a single unsigned integer representing the index of the first element with the highest absolute value found in the input vector, regardless of the data type of the values in the input vector.
[image: link to lines]
I think these two variables should contain the results of the CLBlast AMAX operation and of the reference AMAX operation. However, they are always vectors of multiple float/double/float-complex/double-complex/half elements, never integers, so comparing them does not make much sense to me. This may also be due to my limited understanding of CLBlast's testing framework, so maybe it is nothing.

However, I've put together some changes in this commit to highlight the difference in result format between CLBlast and the reference CBLAS implementations. Please take a look at them and run the test executable with these changes: even though the tests pass, the results displayed in the console are never equal for increments higher than 1. I ran CLBlast against both MKL and OpenBLAS and saw the same behavior with each.

On my system, the results were sometimes not equal between CLBlast and MKL/OpenBLAS even for increments of 1, which would indicate a problem different to the one I initially reported, but I am not yet sure whether this is due to my environment or something else.

First of all apologies for the late reply.

You are right that the CLBlast test infrastructure is floating-point focused. However, I believe (although it has been a while) that it does support integer result comparisons in some hacky way. Some evidence for this is:

I will try to have a more detailed look soon and see where the bug in the testing is.

I found some time and indeed you are right: integer-output testing does not work. I expected that treating everything as float would work as long as the sizes were fine, but apparently not. I made some changes to the test infrastructure here (and fixed the increment bug you reported): https://github.com/CNugteren/CLBlast/compare/xamax_integer_testing_bug_fixes?compare=1
I'll double check everything and test more later and I'll make a PR then.
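
As an illustration of how "treating everything as float" can hide an integer mismatch, here is a small Python sketch. It shows one plausible failure mode only and is an assumption, not a description of what CLBlast's tester actually did: the helper names and the absolute tolerance of 1e-6 are hypothetical. The idea is that the bit pattern of a small unsigned integer, reinterpreted as a 32-bit float, is a tiny subnormal number, so two different indices can differ by far less than any reasonable tolerance.

```python
import struct


def bits_as_float32(value):
    """Reinterpret the 32-bit pattern of an unsigned integer as a float32."""
    return struct.unpack("<f", struct.pack("<I", value))[0]


def close_as_floats(a, b, tolerance=1e-6):
    # Hypothetical absolute-error check, not CLBlast's actual test code.
    return abs(bits_as_float32(a) - bits_as_float32(b)) <= tolerance


clblast_index, reference_index = 26, 13  # values from the report above

print(close_as_floats(clblast_index, reference_index))  # True
print(clblast_index == reference_index)                 # False
```

With these assumed helpers, the float-based check accepts 26 and 13 as equal, while an exact integer comparison does not, which is why index results need an integer comparison path of their own.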

Again, thanks for reporting this and digging into the problem!

Again, sorry for all the delays. I think I managed to fix the integer testing and, as a result, uncovered and fixed two bugs (the one you reported here and another one). See PR #457.

Please let me know if this fixes the issue.

Thanks again 👏

I have just tested the change and it fixes the issue. Thank you very much!