Option for having output tensors allocated in device memory?
jkrause1 opened this issue
Hello,
I'm loading a model from a frozen graph and running it. When I check the device of the resulting output tensors, they all report
/job:localhost/replica:0/task:0/device:CPU:0
implying they reside in host memory. I don't know whether this is a result of how the graph is constructed or whether there are options I need to set, but I would prefer that they stay in device memory so I can access and process the data further via CUDA.
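For context, here is a minimal sketch of how one might check where an output tensor is actually backed and copy it onto the GPU explicitly. It assumes cppflow2, where `cppflow::tensor` exposes its underlying `TFE_TensorHandle` as the `tfe_handle` member and the eager context is reachable via `cppflow::context::get_context()`; `TFE_TensorHandleCopyToDevice` and `TFE_TensorHandleDevicePointer` (the latter experimental) come from the TensorFlow C eager API. The model path and input shape are placeholders:

```cpp
// Sketch only: check a tensor's backing device and copy it to the GPU.
#include <iostream>
#include "cppflow/cppflow.h"
#include "tensorflow/c/eager/c_api.h"
#include "tensorflow/c/eager/c_api_experimental.h"  // TFE_TensorHandleDevicePointer

int main() {
    cppflow::model model("model_dir");                    // placeholder path
    auto input = cppflow::fill({1, 224, 224, 3}, 1.0f);   // placeholder shape
    auto output = model(input);

    // device() names the device of the handle; device(true) names the
    // backing device that actually holds the buffer.
    std::cout << output.device(true) << std::endl;

    // Copy the handle onto the GPU explicitly via the C eager API.
    TF_Status* status = TF_NewStatus();
    TFE_TensorHandle* on_gpu = TFE_TensorHandleCopyToDevice(
        output.tfe_handle.get(), cppflow::context::get_context(),
        "/device:GPU:0", status);
    if (TF_GetCode(status) == TF_OK) {
        // Raw device pointer, usable from CUDA (experimental API).
        void* dev_ptr = TFE_TensorHandleDevicePointer(on_gpu, status);
        (void)dev_ptr;
        TFE_DeleteTensorHandle(on_gpu);
    }
    TF_DeleteStatus(status);
    return 0;
}
```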
Hi,
Were you able to solve this? I ran into a problem with a related task, namely loading a frozen model. In my case, the error I got is:

```
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
```

I can see it has something to do with memory. When I was loading the original model there was no issue at all; I ran into this only after trying the frozen model. The two models are nearly the same size, though the structure of the graphs could differ, which I didn't check.
Here is how I load my frozen model:

```cpp
cppflow::model model("Froozen_model_dir", cppflow::model::TYPE::FROZEN_GRAPH);
```
and here is how I call inference on it with a sample input:

```cpp
output_tensor = model(input_1);
```
and I got this:

```
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
```
Any tips on how to solve this?
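One thing worth ruling out (a sketch, not a confirmed fix for the bad_alloc): cppflow's single-tensor call operator looks up SavedModel-style default operation names (`serving_default_input_1` / `StatefulPartitionedCall`), which typically don't exist in a frozen graph. With a frozen graph you can pass the actual operation names explicitly through the multi-input/multi-output overload; `x` and `Identity` below are placeholders you would replace with the names reported by `model.get_operations()`:

```cpp
// Sketch: run a frozen graph with explicit operation names.
#include <iostream>
#include <string>
#include <vector>
#include "cppflow/cppflow.h"

int main() {
    cppflow::model model("Froozen_model_dir",
                         cppflow::model::TYPE::FROZEN_GRAPH);

    // List the graph's operation names to find the real input/output ops.
    for (const std::string& op : model.get_operations()) {
        std::cout << op << std::endl;
    }

    auto input_1 = cppflow::fill({1, 224, 224, 3}, 1.0f);  // placeholder shape

    // Explicit op names instead of the SavedModel defaults
    // ("x" and "Identity" are placeholders).
    std::vector<cppflow::tensor> outputs =
        model({{"x", input_1}}, {"Identity"});

    std::cout << outputs[0].device(true) << std::endl;
    return 0;
}
```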