SymbioticLab / Salus

Fine-grained GPU sharing primitives

Executor running kernels on GPU does not produce correct number

Aetf opened this issue

commented

Steps to reproduce

  1. export EXEC_SCHED_USE_GPU=1 and run the executor
  2. run the unit tests in test_ops_tf.py

Expected

Tests pass

Actual

Unit tests fail with random values that change on every run.

The failing tests are

TestBasicOps.test_conv2d
TestBasicOps.test_matmul
TestBasicOps.test_randomop
TestBasicOps.test_relu_0
TestBasicOps.test_relu_1
TestBasicOps.test_relu_2
TestBasicOps.test_relu_3
TestBasicOps.test_relu_4
TestBasicOps.test_variable
commented

This was actually caused by wrong arguments being passed to MemoryMgr::alignedAlloc, which at some point led to an out-of-bounds write. Fixed in c1ea383.
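
For anyone hitting something similar, here is a minimal sketch of how wrong arguments to an aligned allocator can end in an out-of-bounds write. It is not the Salus code: `Block`, `alignedAllocSketch`, `freeSketch`, and `safeFill` are hypothetical names, the `(size, alignment)` parameter order is an assumption, and the issue does not say which arguments were wrong. The sketch assumes the two were swapped, which is one common way this goes wrong. It uses C++17 aligned `operator new`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical record of an allocation: pointer plus what was requested.
struct Block {
    void *ptr;
    std::size_t size;       // bytes the caller may legally touch
    std::size_t alignment;  // needed again to free the block correctly
};

// Hypothetical helper, NOT MemoryMgr::alignedAlloc.
// Assumed parameter order: size first, alignment second.
Block alignedAllocSketch(std::size_t size, std::size_t alignment)
{
    // Alignment must be a non-zero power of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void *p = ::operator new(size, std::align_val_t(alignment));
    return Block{p, size, alignment};
}

void freeSketch(const Block &b)
{
    ::operator delete(b.ptr, std::align_val_t(b.alignment));
}

// Fill `bytes` bytes of the block, refusing instead of overflowing.
// Without the bounds check, the swapped call below would write past the end.
bool safeFill(const Block &b, std::size_t bytes, unsigned char value)
{
    if (bytes > b.size) {
        return false;
    }
    std::memset(b.ptr, value, bytes);
    return true;
}
```

With a 4096-byte tensor and 64-byte alignment, `alignedAllocSketch(4096, 64)` returns a block that `safeFill` can fill completely. The swapped call `alignedAllocSketch(64, 4096)` still succeeds, because both values are valid powers of two, but it only owns 64 bytes. An unchecked 4096-byte write into it would corrupt neighbouring memory, which would be consistent with test results that change on every run.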