Support Pipelined design
AndreasKaratzas opened this issue · comments
Andreas Karatzas commented
Consider the following scenario:
I have a neural network, let's say AlexNet. I split the model into two sub-networks, one with the convolutional layers and the other with the fully connected layers, and save each sub-network to its own ONNX file. I have an SBC (like the Odroid N2+) with both an ARM CPU and a GPU.
The question is: can I use your framework to run the first sub-network on the CPU and the second on the GPU, with a memory copy passing the intermediate output from one device to the other?
Example:
input = input.device(cpu)
out1 = run(subnet1, input).device(cpu)
temp_out1 = copy(out1).device(gpu)
out2 = run(subnet2, temp_out1).device(gpu)
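
To make the pipelining part concrete: once there is a stream of inputs, the CPU stage could start on the next input while the GPU stage is still busy with the previous one. Below is a minimal sketch of that hand-off in plain Python, using only the standard library. It does not use this framework's API: `run_subnet1`, `copy_to_gpu`, and `run_subnet2` are hypothetical stand-ins that do trivial arithmetic in place of real inference and a real device copy, and the queue size of 2 is an arbitrary choice.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the input stream


def run_subnet1(x):
    # Hypothetical stand-in for the convolutional sub-network on the CPU.
    return [v * 2 for v in x]


def copy_to_gpu(x):
    # Hypothetical stand-in for the CPU-to-GPU memory copy.
    return list(x)


def run_subnet2(x):
    # Hypothetical stand-in for the fully connected sub-network on the GPU.
    return sum(x)


def pipeline(inputs):
    # Bounded queue: the CPU stage blocks if it gets too far ahead of the GPU stage.
    handoff = queue.Queue(maxsize=2)
    results = []

    def cpu_stage():
        for x in inputs:
            handoff.put(copy_to_gpu(run_subnet1(x)))
        handoff.put(_STOP)

    def gpu_stage():
        while True:
            item = handoff.get()
            if item is _STOP:
                break
            results.append(run_subnet2(item))

    threads = [
        threading.Thread(target=cpu_stage),
        threading.Thread(target=gpu_stage),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `pipeline(batches)` with any iterable of inputs returns the second-stage outputs in input order, since there is a single producer and a single consumer on a FIFO queue. Whether the two stages really overlap on the board depends on the framework releasing the CPU while the GPU works, which is part of what I am asking.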