OAID / Caffe-HRT

Heterogeneous Run Time version of Caffe. It adds heterogeneous computing capabilities to Caffe, using a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms. It also retains all the features of the original Caffe architecture, so users can deploy their applications seamlessly.


OpenCL SVM (shared virtual memory) support in CaffeOnACL

msuryadeekshith opened this issue · comments

Issue summary

Hello folks,

I'm using CaffeOnACL with the Arm Compute Library (version 17.10). FYI, I am using mobile GPU hardware which supports SVM [1].

[1] https://community.arm.com/processors/b/blog/posts/exploring-how-cache-coherency-accelerates-heterogeneous-compute

I have also confirmed the OpenCL driver capabilities using the OpenCL 2.0 API function clGetDeviceInfo, passing the CL_DEVICE_SVM_CAPABILITIES constant. The level of SVM support returned is coarse-grained buffer SVM, which means the hardware has I/O coherency with coarse-grained SVM as per [1].

To experiment with that, I made the following changes:

  • useHostPtr set to true in ComputeLibrary (CL2.hpp)
  • share variable set to true in the CaffeOnACL tensor_mem functions (acl_layer.cpp)
    => Note: the default settings were changed from false to true.
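One thing worth double-checking with those settings: with coarse-grained buffer SVM, the OpenCL 2.0 specification only guarantees that the host sees a consistent view of an SVM buffer between a clEnqueueSVMMap and the matching clEnqueueSVMUnmap. The sketch below shows typical coarse-grained SVM usage with the plain OpenCL 2.0 C API; it is illustrative only, not CaffeOnACL code, and `context`, `queue`, `kernel`, `n`, and `input` are assumed to already exist (so it is not runnable as-is).

```cpp
// Hedged sketch: typical coarse-grained buffer SVM usage (OpenCL 2.0 C API).
// On coarse-grained SVM the host may only touch the buffer while it is
// mapped; skipping the map/unmap pair can yield stale or wrong data.
float* data = static_cast<float*>(
    clSVMAlloc(context, CL_MEM_READ_WRITE, n * sizeof(float), 0));

// Host writes must be bracketed by map/unmap.
clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, data, n * sizeof(float),
                0, nullptr, nullptr);
for (size_t i = 0; i < n; ++i) data[i] = input[i];
clEnqueueSVMUnmap(queue, data, 0, nullptr, nullptr);

// SVM pointers are passed to kernels with clSetKernelArgSVMPointer.
clSetKernelArgSVMPointer(kernel, 0, data);
clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &n, nullptr,
                       0, nullptr, nullptr);

// Map again before reading results on the host.
clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_READ, data, n * sizeof(float),
                0, nullptr, nullptr);
// ... consume results ...
clEnqueueSVMUnmap(queue, data, 0, nullptr, nullptr);
clSVMFree(context, data);
```

If useHostPtr in CL2.hpp does not insert these map/unmap calls around host accesses, that would be consistent with the wrong-output symptom described below.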

Surprisingly, the output is wrong when I benchmark AlexNet for classification with the above changes. However, the output is correct when I use the default settings (no SVM support), as well as when I bypass all ACL layers.

Now, my query is: are the above settings good enough to enable SVM support,
or
do I need to make any additional changes?

I have already asked the same question at ARM-software/ComputeLibrary#271 and am still waiting for some help.

System configuration

Operating system: Linux
Compiler: gcc version 5.4.0 20160609
CUDA version (if applicable): NA
CUDNN version (if applicable): NA

Thank you,
Surya Deekshith.