Parallel sum returns 0 when reducing a vector of size 1024 * 1024
Red-Portal opened this issue
Kyurae Kim commented
Very weird.
The sum somewhat works when the vector size is 1024, but fails (returns 0) when it is 1024 * 1024.
Kyurae Kim commented
Solved by changing the parallel reduction algorithm
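
For readers hitting a similar symptom, here is a minimal sketch of one reduction scheme that does not depend on the whole vector fitting in a single block. The thread does not say which algorithm was adopted or which language the project uses, so everything below is an assumption for illustration: the names `tree_reduce` and `blocked_sum` and the default `block_size` of 1024 are hypothetical, and the blocks are processed sequentially on the host rather than in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pairwise (tree) reduction over a scratch copy.
// At each step, element i absorbs element i + stride; the stride doubles
// until a single value remains in buf[0].
double tree_reduce(std::vector<double> buf) {
    if (buf.empty()) return 0.0;
    for (std::size_t stride = 1; stride < buf.size(); stride *= 2) {
        for (std::size_t i = 0; i + stride < buf.size(); i += 2 * stride) {
            buf[i] += buf[i + stride];
        }
    }
    return buf[0];
}

// Hypothetical helper: two-stage reduction.
// Stage 1: reduce each fixed-size block to one partial sum. The blocks are
//          independent, so a parallel implementation could run them
//          concurrently; here they run one after another.
// Stage 2: reduce the partial sums to the final result.
double blocked_sum(const std::vector<double>& data, std::size_t block_size = 1024) {
    assert(block_size > 0);
    std::vector<double> partial;
    for (std::size_t begin = 0; begin < data.size(); begin += block_size) {
        const std::size_t end = std::min(begin + block_size, data.size());
        partial.push_back(
            tree_reduce(std::vector<double>(data.begin() + begin, data.begin() + end)));
    }
    return tree_reduce(partial);
}
```

Calling `blocked_sum` on a vector of 1024 * 1024 elements produces 1024 partial sums in the first stage and combines them in the second, so the input size is not tied to the block size.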