arrayfire / arrayfire-python

Python bindings for ArrayFire: A general purpose GPU library.

Home Page:https://arrayfire.com

ParallelRange with Sliced Matrix

georgh opened this issue · comments

If you slice a matrix inside a ParallelRange loop, matmul does not detect the change in dimensions and fails:

res = af.constant(0, 20, 20, 1000)
left = af.constant(0, 20, 40)
right = af.constant(0, 40, 20, 1000)

af.matmul(left, right[:,:,0])  # this works
for ii in af.ParallelRange(10):
    res[:,:,ii] = af.matmul(left, right[:,:,ii])  # this fails

@georgh This needs arrayfire/arrayfire#1898 to be merged.

But if that is merged, you can simply do res = af.matmul(left, right)
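As a point of comparison, the result the loop is meant to produce can be checked on the CPU. The following is a minimal sketch in NumPy, not ArrayFire: it assumes the batch index is the last axis, as in the snippet above, and uses `np.einsum` as a stand-in for a batched matmul. The array contents and the random seed are illustrative only.

```python
import numpy as np

# Illustrative data with the shapes from the report (batch shortened to 10).
rng = np.random.default_rng(0)
left = rng.standard_normal((20, 40))
right = rng.standard_normal((40, 20, 10))

# Slice-by-slice product, which is what the ParallelRange loop expresses.
looped = np.empty((20, 20, 10))
for ii in range(10):
    looped[:, :, ii] = left @ right[:, :, ii]

# One batched call: contract over j and keep the batch axis b last.
batched = np.einsum("ij,jkb->ikb", left, right)

print(np.allclose(looped, batched))
```

Both paths yield a 20 x 20 x 10 array, so a batched call can replace the loop whenever the left operand is shared across all slices.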

Are you sure?
left has a fixed size here, and for me that fails (even with the patch applied):
Expected: lDims.ndims() == rDims.ndims()

But even if you add this to matmul, the problem seems to be not matmul but ParallelRange.
Another example: I would like to add to the diagonal of every matrix in my batch.
Something like

for ii in af.ParallelRange(10):
    for kk in af.ParallelRange(10):
        res[ii,ii,kk] += tmp[kk]

would be cool. But tmp is a vector, so this will fail because I can't add a vector to the left-hand side.

@georgh You can't nest ParallelRange. But you don't need ParallelRange for this. You can do the following:

tmp = af.moddims(tmp, 1, 1, tmp.shape[0])
res += af.tile(tmp, res.shape[0], res.shape[1]) # tile does not allocate additional memory
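Note that tiling tmp across the first two dimensions adds tmp[kk] to every element of slice kk, not only to its diagonal. A minimal NumPy sketch of a diagonal-only update follows; it is a CPU reference for checking results, not ArrayFire code, and the small sizes `n` and `batch` and the identity mask are assumptions made for illustration.

```python
import numpy as np

n, batch = 4, 3                      # small illustrative sizes
res = np.zeros((n, n, batch))
tmp = np.arange(1.0, batch + 1.0)    # one value per batch entry

# Reference: the nested loop from the question, written sequentially.
expected = res.copy()
for kk in range(batch):
    for ii in range(n):
        expected[ii, ii, kk] += tmp[kk]

# Vectorized: an (n, n, 1) identity mask times tmp reshaped to (1, 1, batch)
# broadcasts to (n, n, batch) and is nonzero only on each slice's diagonal.
mask = np.eye(n)[:, :, np.newaxis]
res += mask * tmp.reshape(1, 1, batch)

print(np.array_equal(res, expected))
```

The same idea should carry over to ArrayFire by multiplying the tiled tmp with a tiled identity matrix, though that variant is untested here.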

OK, it's really nice to know that tile doesn't need more memory :)
But still, shouldn't

for kk in af.ParallelRange(10):
    res[:,:,kk] += tmp[kk]

work?

What are the sizes of res and tmp?

Sorry, my bad: apart from the first case, ParallelRange works fine.
But I still don't think the first case is a problem with matmul. Full example:

import arrayfire as af
af.set_backend('cuda')

res   = af.constant(0, 20,20,10)    
right = af.constant(0, 20,20,10)
fixed = af.constant(0, 20,20)

af.matmul(fixed, right[:,:,0]) # this works
for ii in af.ParallelRange(10):
    res[:,:,ii] = af.matmul(fixed, right[:,:,ii])
# This fails even though it is only a product of two matrices of the same size

I have #1898 merged in my build.
Error:

Traceback (most recent call last):
  File "simple.py", line 10, in <module>
    res[:,:,ii] = af.matmul(fixed, right[:,:,ii]) 
  File "/home/ghiero/anaconda3/envs/intel/lib/python3.6/site-packages/arrayfire/blas.py", line 57, in matmul
    lhs_opts.value, rhs_opts.value))
  File "/home/ghiero/anaconda3/envs/intel/lib/python3.6/site-packages/arrayfire/util.py", line 79, in safe_call
    raise RuntimeError(to_str(err_str))
RuntimeError: In function af_err af_matmul(void**, af_array, af_array, af_mat_prop, af_mat_prop)
In file src/api/c/blas.cpp:131
Invalid dimension for argument 1
Expected: lDims.ndims() == rDims.ndims()

Or even simpler:

import arrayfire as af
af.set_backend('cuda')

res  = af.constant(0, 20,20,10)  
xx = af.constant(0, 20,20)
yy = af.constant(0, 20,20)

for ii in af.ParallelRange(10):
    res[:,:,ii] = af.matmul(xx, yy)
s = af.sum(res)
print("work finished, got {}".format(s))