SymbioticLab / Salus

Fine-grained GPU sharing primitives

[ZrpcSession] Tensor Information not cleared between runs

Aetf opened this issue

Tensor information kept by ZrpcSession is not cleared between runs, which causes a shape mismatch in later runs.

Steps to reproduce

  1. Start the executor compiled from the zrpcsession branch
  2. Run green -vvv test_tf/test_ops_tf.py
  3. Run green -vvv test_tf/test_grad_tf.py (Note: steps 2 and 3 can be swapped with the same outcome)
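
The steps above can be sketched as a small Python driver. This is only a convenience sketch: it assumes the executor from step 1 is already running, that green is on the PATH, and that it is launched from the directory containing test_tf. The helper names build_commands and run_back_to_back are hypothetical and not part of Salus.

```python
import subprocess

# Test files named in the reproduction steps, relative to the test root.
TEST_FILES = ["test_tf/test_ops_tf.py", "test_tf/test_grad_tf.py"]


def build_commands(test_files, reverse=False):
    """Return one `green -vvv <file>` command per test file, in run order."""
    ordered = list(reversed(test_files)) if reverse else list(test_files)
    return [["green", "-vvv", path] for path in ordered]


def run_back_to_back(commands, runner=subprocess.call):
    """Run each command in order without restarting the executor in between.

    Returns the list of exit codes, one per command. `runner` can be
    replaced for dry runs; by default each command is actually executed.
    """
    return [runner(cmd) for cmd in commands]
```

Calling run_back_to_back(build_commands(TEST_FILES)) covers steps 2 and 3 in order. Passing reverse=True swaps them, which per the note above should move the failure to the other test file.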

Expected

Tests pass

Actual

  • test_tf.test_grad_tf.TestOpGradients.test_grad_relu_scala fails due to a shape mismatch if step 3 was run last
  • test_tf.test_ops_tf.TestBasicOps.test_variable fails due to a shape mismatch if step 2 was run last

Note

Both tests pass if run individually, or if the executor is restarted between runs.

commented

Caused by devices being shared between sessions. Fixed by 3e2d8db
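
To illustrate this class of bug, here is a toy Python model. It is not the Salus code and does not show what commit 3e2d8db changes. The Device and Session classes and the run_two_sessions helper are hypothetical, and the model simply assumes a device remembers tensor shapes by name.

```python
class Device:
    """Hypothetical device that remembers the shape of each named tensor."""

    def __init__(self):
        self.tensor_shapes = {}

    def register(self, name, shape):
        # Keep the first shape seen for a name; reject a different one later.
        known = self.tensor_shapes.setdefault(name, shape)
        if known != shape:
            raise ValueError(f"shape mismatch for {name!r}: {known} vs {shape}")


class Session:
    """Hypothetical session that registers its tensors on a device."""

    def __init__(self, device):
        self.device = device

    def run(self, name, shape):
        self.device.register(name, shape)
        return shape


def run_two_sessions(shared):
    """Run two sessions that reuse a tensor name with different shapes."""
    first_device = Device()
    second_device = first_device if shared else Device()
    Session(first_device).run("x", (2, 3))
    try:
        Session(second_device).run("x", (4,))
    except ValueError:
        return "shape mismatch"
    return "ok"
```

In this model, run_two_sessions(shared=True) hits the stale shape left by the first session, while run_two_sessions(shared=False) does not. This mirrors the observation that the tests pass when the executor is restarted between runs.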