How to implement blockwise fill?
void-main opened this issue · comments
Hi team, I'm writing a kernel that has the following requirement:
I have a range tensor with values [0, 10, 20, 30] and another range tensor with values [0, 1, 2, 3, 4, 5, 6, 7]. The expected tensor should be:
[0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24, 25, 26, 27, 30, 31, 32, 33, 34, 35, 36, 37]
I've tried the following code:
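row_idxes = tl.arange(0, 4)  # [0, 1, 2, 3], scaled by 10 below to give the row starts
col_idxes = tl.arange(0, 8)  # [0, 1, ..., 7], the offsets within each row
block_offs = row_idxes[:, None] * 10 + col_idxes[None, :] * 1  # broadcast to shape [4, 8]
block_offs = tl.view(block_offs, [BLOCK_N])  # flatten to 1D; BLOCK_N is presumably 4 * 8 = 32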
Logically the code works, but the tl.view operation does not seem to work as expected (similar to #2210 and #2157). How could I fix this?
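For reference, one possible way to get the same offsets without any reshape is to derive them from a single flat index using integer division and modulo. The sketch below only checks that arithmetic in plain Python. The helper name `blockwise_offsets` and the constants are hypothetical, and the Triton lines in the trailing comment are an untested assumption, not a confirmed fix for the tl.view behavior.

```python
# Plain-Python reference for the index arithmetic (no Triton or GPU needed).
# All names below are hypothetical and chosen for illustration only.
NUM_ROWS = 4       # length of the first range: [0, 10, 20, 30]
NUM_COLS = 8       # length of the second range: [0, 1, ..., 7]
ROW_STRIDE = 10    # spacing between the start of consecutive blocks
BLOCK_N = NUM_ROWS * NUM_COLS  # 32 elements in the flattened result


def blockwise_offsets(block_n=BLOCK_N, num_cols=NUM_COLS, row_stride=ROW_STRIDE):
    """Build the flattened offsets from a 1D index, with no reshape step."""
    offsets = []
    for flat_idx in range(block_n):
        row = flat_idx // num_cols  # which block the element belongs to
        col = flat_idx % num_cols   # position inside that block
        offsets.append(row * row_stride + col)
    return offsets


block_offs = blockwise_offsets()
print(block_offs)

# Assumed Triton equivalent inside a @triton.jit kernel (untested here):
#   flat_idx = tl.arange(0, BLOCK_N)
#   block_offs = (flat_idx // NUM_COLS) * ROW_STRIDE + (flat_idx % NUM_COLS)
```

Because the result is 1D from the start, this formulation does not depend on how tl.view orders elements, at the cost of one integer division and one modulo per element.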