apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Home Page: https://mxnet.apache.org

Segmentation fault when operating on data located on different GPUs

rubbberrabbit opened this issue

Description

Adding two ndarrays located on different GPUs causes a segmentation fault. In most cases, operating on data located on different devices raises a clear exception telling the user to copy the data to the same device first. Simply adding two ndarrays on different GPUs, however, crashes the process with a segmentation fault, which suggests that the add operator neither handles this case nor reports a reasonable exception.
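For reference, the explicit copy that such exceptions normally ask for can be sketched as a small helper. This is a minimal sketch, not verified on the setup in this report: `add_on_same_device` is a hypothetical name (not part of MXNet), and it assumes the arrays expose the `ctx` attribute and the `as_in_ctx` method of MXNet's NumPy-compatible ndarray.

```python
def add_on_same_device(x, y):
    """Add two arrays, copying y to x's device first if the devices differ.

    Hypothetical helper, not part of MXNet. It relies only on the array's
    `ctx` attribute and `as_in_ctx` method.
    """
    if y.ctx != x.ctx:
        # as_in_ctx returns a copy of y that lives on the target device.
        y = y.as_in_ctx(x.ctx)
    return x + y


# Example usage, assuming a machine with at least two GPUs:
#   from mxnet import np, npx
#   npx.set_np()
#   X = np.ones((1, 10), ctx=npx.gpu(0))
#   Y = np.ones((1, 10), ctx=npx.gpu(1))
#   C = add_on_same_device(X, Y)  # the result lives on X's device
```

Moving the second operand to the first operand's device is an arbitrary choice here; it only sidesteps the crash and does not address the missing exception in the add operator.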

Error Message

Segmentation fault

To Reproduce

I ran this script on MXNet 1.9.1 with two RTX 3090 GPUs.
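from mxnet import np, npx
import mxnet.gluon.nn as nn

npx.set_np()
# Allocate the two operands on different GPUs.
X = np.ones((1, 10), ctx=npx.gpu(0))
Y = np.ones((1, 10), ctx=npx.gpu(1))
# Adding arrays from different devices triggers the segmentation fault.
C = X + Y
print(C)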

Steps to reproduce

Run the script above.

What have you tried to solve it?

Environment

MXNet 1.9.1, CUDA 11.2, two RTX 3090 GPUs.