cdeterman / gpuR

R interface to use GPUs

Problem with running gpuR on Mac Pro

liminfang opened this issue · comments

Summary of the problem:
gpuR on my Mac Pro is very slow, and gpuInfo() returns the error "No GPUs found in context".

Detailed Description:
I have successfully installed gpuR on my Mac Pro, and loading the library gives the following information:

Number of platforms: 1

  • platform: Apple: OpenCL 1.2 (May 24 2018 20:07:03)
    • context device index: 0
      • Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
    • context device index: 1
      • AMD Radeon HD - FirePro D500 Compute Engine
    • context device index: 2
      • AMD Radeon HD - FirePro D500 Compute Engine
checked all devices
completed initialization
gpuR 2.0.0

Attaching package: ‘gpuR’

The following objects are masked from ‘package:base’:

    colnames, pmax, pmin, svd

However, when I call gpuInfo(), the R console returns an error message:

Error in gpuInfo() : No GPUs found in context

I also ran the following script:

library(gpuR)      # provides gpuMatrix() and the %*% method
library(tictoc)

ORDER <- 4000
A <- matrix(rnorm(ORDER^2), nrow = ORDER)
B <- matrix(rnorm(ORDER^2), nrow = ORDER)
gpuA <- gpuMatrix(A, type = "double")
gpuB <- gpuMatrix(B, type = "double")

# time the multiplication in base R (CPU)
tic()
C <- A %*% B
toc()

# time the multiplication through gpuR
tic()
gpuC <- gpuA %*% gpuB
toc()

The CPU time for multiplying the matrices is 45.9 seconds, but the GPU time is 203.304 seconds. This does not make sense; isn't the GPU supposed to be a lot faster? Please help.

@liminfang sorry for the slow response, I have been quite busy lately. Regarding the gpuInfo() failure, can you provide the output of listContexts()? It looks like your default context is on the CPU (i.e. your Intel Xeon), so there are no GPUs in that context.

If my assumption is correct, then cpuInfo() would work for you by default and report information about your CPU. That would also explain the slower performance: running the OpenCL path on the CPU does strictly more work than letting base R do the multiplication directly. I would like to improve the CPU performance, but that is not the main goal.
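
If you want to confirm what the default context is holding before changing anything, a quick check along these lines should work (a sketch; currentContext() and currentDevice() are assumed to be available in your gpuR install):

# inspect the active context and its device
# (currentContext()/currentDevice() assumed exported by your gpuR version)
currentContext()   # index of the default context, likely 1 here
currentDevice()    # device index and type; "cpu" would confirm the above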

Back to your concern: you can either change the default context, or pass the context id to your matrices when you create them, with either

setContext(2L)

or

gpuA <- gpuMatrix(A, ctx_id = 2L, type = "double")
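
Either way, it is worth verifying the switch before re-running the benchmark, e.g.:

setContext(2L)                          # first FirePro D500 becomes the default
gpuInfo()                               # should now describe the AMD GPU instead of erroring
gpuA <- gpuMatrix(A, type = "double")   # allocated in the GPU context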

listContexts() returns:

  context                                 platform platform_index                                      device device_index device_type
1       1 Apple: OpenCL 1.2 (May 24 2018 20:07:03)              0   Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz            0         cpu
2       2 Apple: OpenCL 1.2 (May 24 2018 20:07:03)              0 AMD Radeon HD - FirePro D500 Compute Engine            0         gpu
3       3 Apple: OpenCL 1.2 (May 24 2018 20:07:03)              0 AMD Radeon HD - FirePro D500 Compute Engine            0         gpu

@cdeterman Yes, you are right: if I use
gpuA <- gpuMatrix(A, ctx_id = 2L, type = "double")

it solves the problem nicely. The GPU matrix multiplication from my original script now takes only 2.8 seconds instead of 200.
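
For completeness, the revised benchmark (the original script with ctx_id = 2L on the GPU allocations; timings will of course vary by machine):

library(gpuR)
library(tictoc)

ORDER <- 4000
A <- matrix(rnorm(ORDER^2), nrow = ORDER)
B <- matrix(rnorm(ORDER^2), nrow = ORDER)

# allocate on context 2, the first FirePro D500 from listContexts()
gpuA <- gpuMatrix(A, ctx_id = 2L, type = "double")
gpuB <- gpuMatrix(B, ctx_id = 2L, type = "double")

tic()
C <- A %*% B            # base R on the Xeon: ~46 s for me
toc()

tic()
gpuC <- gpuA %*% gpuB   # OpenCL on the FirePro: ~2.8 s
toc()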