NVIDIA / MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Home Page: https://nvidia.github.io/MatX

[QST] After running some computations on a tensor, how to fetch the entire tensor back to host memory?

cenwangumass opened this issue · comments

Let's say we have a tensor t and I run some computations on it: (t = 2 * t).run(). Now, how should I get t into a C++ std::vector or an Eigen vector? Thanks.

Hi @cenwangumass, it depends on how you allocated the tensor. If t was allocated the standard way:

auto t = make_tensor<float>({10});

then the default is managed memory, which is accessible from both host and device through the same pointer. You could then do something like:

std::vector<float> vec(10);
auto t = make_tensor<float>({10});
// Do some work on t (synchronize the stream before reading on the host)
memcpy(vec.data(), t.Data(), t.Bytes());

// Or copy element-by-element
for (size_t i = 0; i < vec.size(); i++) {
   vec[i] = t(i);
}

The same thing applies to an Eigen vector/tensor. Keep in mind that the above only works if the tensor is contiguous. If you have strided (non-contiguous) data, you will need to either make a copy into a contiguous tensor first or copy each element manually. Does that make sense?

Thanks! I was already doing that; I was just looking for a dedicated function or method in MatX.

Technically you can also create a tensor and point it at the vector's memory:

std::vector<float> vec(10);
auto t2 = make_tensor<float>(vec.data(), {10}); // t2 aliases the vector's memory
auto t = make_tensor<float>({10});
// Do some work on t...
(t2 = t).run(matx::HostExecutor{}); // writes directly into the vector's buffer

This should work too if you prefer that method.