arrayfire / arrayfire-r

R wrapper for ArrayFire

Port all functions from ArrayFire 3.4

pavanky opened this issue · comments

Each header file "foo" should have the following:

  • src/foo.cpp
  • R/foo.R

Also add tests if possible, though they are not necessary:

  • test/simple_foo.R

Headers to port:

  • algorithm
  • arith
  • array
  • blas
  • data
  • defines (library.py)
  • device
  • features
  • graphics
  • image
  • index
  • lapack
  • signal
  • statistics
  • util
  • version (merge with util)
  • vision

C++ only headers that are not being ported

  • exception.h (ported the necessary C function to util.py)
  • complex.h
  • dim4.hpp
  • gfor (are there any necessary bits??)
  • constants

Hi there,
I would like to give arrayfire-r a try, but I am getting some errors.
I would very much appreciate some help with this.
My overall computing skills are pretty basic, but I'm willing ;)

This is what I achieved.

Installed Ubuntu 14.04 with AMD Catalyst 15.302 on a Dell 7910 with 2 Xeon E5-2609 v3 CPUs, 32 GB RAM, and a Fury tri-X R9 GPU
No CUDA, no NVIDIA drivers
Tested GPUrat and the benchmarks ran OK
Installed the ArrayFire dependencies, compilers, etc. I had some problems with GLFW 3.2 GLX, although it apparently compiled from source OK.
Installed ArrayFire 3.3.2
Installed Anaconda and other dependencies
Compiled and ran the ArrayFire benchmarks OK
The CPU benchmark run is very slow; it only uses one core!

Compiled and ran the ArrayFire examples, but they complained at runtime.
For example:

./blas_opencl
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
ArrayFire v3.3.2 (OpenCL, 64-bit Linux, build f65dd97)
[0] AMD : Fiji, 3655 MB
-1- AMD : Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz, 32099 MB
Benchmark N-by-N matrix multiply
128 x 128: 257 Gflops
256 x 256: 779 Gflops
384 x 384: 1199 Gflops
512 x 512: 1605 Gflops
640 x 640: 2624 Gflops
768 x 768: 2822 Gflops
896 x 896: 3555 Gflops
1024 x 1024: 3204 Gflops
1152 x 1152: 3708 Gflops
1280 x 1280: 3968 Gflops
1408 x 1408: 3961 Gflops
1536 x 1536: 3947 Gflops
1664 x 1664: 4365 Gflops
1792 x 1792: 4434 Gflops
1920 x 1920: 4087 Gflops
2048 x 2048: 4352 Gflops

peak 4434.43 GFLOPS

Installed Revolutions R with MKL
Installed arrayfire-r without errors
Loading it with library("arrayfire", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.2") seems OK
Running the arrayfire-r example produces errors, and the printed result is similar to that of the piR(num) CPU function:

res = piAF(num)
GLX: Forward compatibility requested but GLX_ARB_create_context_profile is unavailableError: Could not Create GLFW Window!
Error in afRunif(num) : In function random
In file src/backend/opencl/kernel/random.hpp:142
OpenCL Error (-11): Build Program Failure when calling clBuildProgram
print(res)
[1] 3.141268

Cheers

Aurelio

@AurelioG I see that the C/C++ version is using 3.3.2, whereas you are loading 3.2 for R. Can you try with 3.3?

I just ran the same code on an R9 Fury nano and did not get any errors.

If that doesn't work, can you post the output of clinfo over here?
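
Since full clinfo output is long, a filter like the following can pull out the lines that usually matter for this kind of report. This is only a sketch: the function name summarize_clinfo is made up here, and the field labels it matches are the ones that appear in AMD's clinfo output as posted in this thread; other clinfo builds may label fields differently.

```shell
#!/bin/sh
# Keep only the platform/device identification lines from clinfo output.
# Reads clinfo text on stdin; the labels are assumed to match AMD's clinfo format.
summarize_clinfo() {
    grep -E '^ *(Platform Version|Device Type|Name|Driver version|Device OpenCL C version):'
}

# Usage (assumes clinfo is installed and on PATH):
#   clinfo | summarize_clinfo
```

The pattern is anchored at the start of the line, so "Platform Name:" is skipped while the per-device "Name:" line is kept.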

No joy yet. I tried 3.0 and could not install it properly. I tried different things starting from a clean install; GLFW is probably one source of headaches, and using the ArrayFire binaries versus compiling from source brings another flavour of problems. Here is what works with arrayfire-r:

> afInfo()
ArrayFire v3.4.0 (OpenCL, 64-bit Linux, build 4d625e7)
[0] AMD     : Fiji, 3784 MB
-1- AMD     : Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz, 32099 MB
> a <- c(1:100)
> b <- array(a)
> c<-afArray(b)
> f<-afArray(b,type = "single")
> aa <- f*f + c*c
> afPrint(aa)
No Name Array
[100 1 1 1]
Internal Error: finding libraries failed!
Linker phase failed compilation.
Error: Compilation from LLVMIR binary to IL text failed!
 -D dim_t=long
Error in afPrint(aa) : In function print
In file src/api/c/print.cpp:89


> a <-1.4
> a <-aa
> afPrint(a)
No Name Array
[100 1 1 1]
/tmp/OCL4493T26.cl:249:12: error: use of undeclared identifier 'val6'; did you mean 'all'?
out[idx] = val6;
           ^~~~
           all
/app/workarea/15.302/stream/opencl/compiler/clc2/ocl-headers/build/lnx64a/B_rel/opencl12_builtins.h:3125:46: note: 'all' declared here
int const_func __attribute__((overloadable)) all(long16 x);
                                             ^
/tmp/OCL4493T26.cl:249:12: error: taking address of function is not allowed
out[idx] = val6;
           ^

error: Clang front-end compilation failed!
Frontend phase failed compilation.

 -D dim_t=long
Error in afPrint(a) : In function print
In file src/api/c/print.cpp:89

This is my ArrayFire install manifest:

/usr/local/include/arrayfire.h
/usr/local/include/af/index.h
/usr/local/include/af/array.h
/usr/local/include/af/device.h
/usr/local/include/af/statistics.h
/usr/local/include/af/dim4.hpp
/usr/local/include/af/arith.h
/usr/local/include/af/gfor.h
/usr/local/include/af/image.h
/usr/local/include/af/algorithm.h
/usr/local/include/af/macros.h
/usr/local/include/af/graphics.h
/usr/local/include/af/complex.h
/usr/local/include/af/exception.h
/usr/local/include/af/lapack.h
/usr/local/include/af/backend.h
/usr/local/include/af/compatible.h
/usr/local/include/af/signal.h
/usr/local/include/af/traits.hpp
/usr/local/include/af/util.h
/usr/local/include/af/blas.h
/usr/local/include/af/internal.h
/usr/local/include/af/opencl.h
/usr/local/include/af/vision.h
/usr/local/include/af/cuda.h
/usr/local/include/af/data.h
/usr/local/include/af/features.h
/usr/local/include/af/timing.h
/usr/local/include/af/defines.h
/usr/local/include/af/version.h
/usr/local/include/af/seq.h
/usr/local/include/af/constants.h
/usr/local/include/af/version.h
/usr/local/lib/libforge.so
/usr/local/share/ArrayFire/cmake/ArrayFireConfig.cmake
/usr/local/share/ArrayFire/cmake/ArrayFireConfigVersion.cmake
/usr/local/share/ArrayFire/examples/CMakeModules/FindOpenCL.cmake
/usr/local/share/ArrayFire/examples/graphics/fractal.cpp
/usr/local/share/ArrayFire/examples/graphics/conway.cpp
/usr/local/share/ArrayFire/examples/graphics/histogram.cpp
/usr/local/share/ArrayFire/examples/graphics/gravity_sim.cpp
/usr/local/share/ArrayFire/examples/graphics/plot3.cpp
/usr/local/share/ArrayFire/examples/graphics/gravity_sim_init.h
/usr/local/share/ArrayFire/examples/graphics/surface.cpp
/usr/local/share/ArrayFire/examples/graphics/plot2d.cpp
/usr/local/share/ArrayFire/examples/graphics/conway_pretty.cpp
/usr/local/share/ArrayFire/examples/computer_vision/susan.cpp
/usr/local/share/ArrayFire/examples/computer_vision/matching.cpp
/usr/local/share/ArrayFire/examples/computer_vision/harris.cpp
/usr/local/share/ArrayFire/examples/computer_vision/fast.cpp
/usr/local/share/ArrayFire/examples/lin_algebra/qr.cpp
/usr/local/share/ArrayFire/examples/lin_algebra/cholesky.cpp
/usr/local/share/ArrayFire/examples/lin_algebra/lu.cpp
/usr/local/share/ArrayFire/examples/lin_algebra/svd.cpp
/usr/local/share/ArrayFire/examples/CMakeLists.txt
/usr/local/share/ArrayFire/examples/helloworld/helloworld.cpp
/usr/local/share/ArrayFire/examples/getting_started/rainfall.cpp
/usr/local/share/ArrayFire/examples/getting_started/integer.cpp
/usr/local/share/ArrayFire/examples/getting_started/vectorize.cpp
/usr/local/share/ArrayFire/examples/getting_started/convolve.cpp
/usr/local/share/ArrayFire/examples/image_processing/edge.cpp
/usr/local/share/ArrayFire/examples/image_processing/morphing.cpp
/usr/local/share/ArrayFire/examples/image_processing/binary_thresholding.cpp
/usr/local/share/ArrayFire/examples/image_processing/brain_segmentation.cpp
/usr/local/share/ArrayFire/examples/image_processing/filters.cpp
/usr/local/share/ArrayFire/examples/image_processing/adaptive_thresholding.cpp
/usr/local/share/ArrayFire/examples/image_processing/image_editing.cpp
/usr/local/share/ArrayFire/examples/image_processing/pyramids.cpp
/usr/local/share/ArrayFire/examples/image_processing/optical_flow.cpp
/usr/local/share/ArrayFire/examples/image_processing/image_demo.cpp
/usr/local/share/ArrayFire/examples/unified/basic.cpp
/usr/local/share/ArrayFire/examples/pde/swe.cpp
/usr/local/share/ArrayFire/examples/common/idxio.h
/usr/local/share/ArrayFire/examples/common/progress.h
/usr/local/share/ArrayFire/examples/benchmarks/fft.cpp
/usr/local/share/ArrayFire/examples/benchmarks/blas.cpp
/usr/local/share/ArrayFire/examples/benchmarks/pi.cpp
/usr/local/share/ArrayFire/examples/financial/black_scholes_options.cpp
/usr/local/share/ArrayFire/examples/financial/heston_model.cpp
/usr/local/share/ArrayFire/examples/financial/monte_carlo_options.cpp
/usr/local/share/ArrayFire/examples/financial/input.h
/usr/local/share/ArrayFire/examples/README.md
/usr/local/share/ArrayFire/examples/machine_learning/perceptron.cpp
/usr/local/share/ArrayFire/examples/machine_learning/mnist_common.h
/usr/local/share/ArrayFire/examples/machine_learning/deep_belief_net.cpp
/usr/local/share/ArrayFire/examples/machine_learning/logistic_regression.cpp
/usr/local/share/ArrayFire/examples/machine_learning/softmax_regression.cpp
/usr/local/share/ArrayFire/examples/machine_learning/neural_network.cpp
/usr/local/share/ArrayFire/examples/machine_learning/naive_bayes.cpp
/usr/local/share/ArrayFire/examples/machine_learning/rbm.cpp
/usr/local/share/ArrayFire/examples/machine_learning/kmeans.cpp
/usr/local/share/ArrayFire/examples/machine_learning/knn.cpp
/usr/local/share/ArrayFire/examples/machine_learning/bagging.cpp
/usr/local/share/ArrayFire/examples/assets/examples/data/mnist/images-subset
/usr/local/share/ArrayFire/examples/assets/examples/data/mnist/labels-subset
/usr/local/share/ArrayFire/examples/assets/examples/images/spider.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/house.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/atlantis.png
/usr/local/share/ArrayFire/examples/assets/examples/images/circle_center.ppm
/usr/local/share/ArrayFire/examples/assets/examples/images/arrow.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/trees_ctm.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/fight.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/man.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/circle_left.ppm
/usr/local/share/ArrayFire/examples/assets/examples/images/noisy_square.png
/usr/local/share/ArrayFire/examples/assets/examples/images/square.png
/usr/local/share/ArrayFire/examples/assets/examples/images/vegetable-woman.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/nature.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/sudoku.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/README.md
/usr/local/share/ArrayFire/examples/assets/examples/images/bimodal.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/sunset_emp.jpg
/usr/local/share/ArrayFire/examples/assets/examples/images/brain.png
/usr/local/lib/libafcpu.so.3.4.0
/usr/local/lib/libafcpu.so.3
/usr/local/lib/libafcpu.so
/usr/local/share/ArrayFire/cmake/ArrayFireCPU.cmake
/usr/local/share/ArrayFire/cmake/ArrayFireCPU-release.cmake
/usr/local/lib/libafopencl.so.3.4.0
/usr/local/lib/libafopencl.so.3
/usr/local/lib/libafopencl.so
/usr/local/share/ArrayFire/cmake/ArrayFireOpenCL.cmake
/usr/local/share/ArrayFire/cmake/ArrayFireOpenCL-release.cmake
/usr/local/lib/libaf.so.3.4.0
/usr/local/lib/libaf.so.3
/usr/local/lib/libaf.so
/usr/local/share/ArrayFire/cmake/ArrayFireUnified.cmake
/usr/local/share/ArrayFire/cmake/ArrayFireUnified-release.cmake

This is my clinfo output:

  Number of platforms:               1
  Platform Profile:              FULL_PROFILE
  Platform Version:              OpenCL 2.0 AMD-APP (1912.5)
  Platform Name:                 AMD Accelerated Parallel Processing
  Platform Vendor:               Advanced Micro Devices, Inc.
  Platform Extensions:               cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 


  Platform Name:                 AMD Accelerated Parallel Processing
Number of devices:               2
  Device Type:                   CL_DEVICE_TYPE_GPU
  Vendor ID:                     1002h
  Board name:                    AMD Radeon (TM) R9 Fury Series  
  Device Topology:               PCI[ B#3, D#0, F#0 ]
  Max compute units:                 56
  Max work items dimensions:             3
    Max work items[0]:               256
    Max work items[1]:               256
    Max work items[2]:               256
  Max work group size:               256
  Preferred vector width char:           4
  Preferred vector width short:          2
  Preferred vector width int:            1
  Preferred vector width long:           1
  Preferred vector width float:          1
  Preferred vector width double:         1
  Native vector width char:          4
  Native vector width short:             2
  Native vector width int:           1
  Native vector width long:          1
  Native vector width float:             1
  Native vector width double:            1
  Max clock frequency:               1000Mhz
  Address bits:                  64
  Max memory allocation:             2733502080
  Image support:                 Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      64
  Max image 2D width:                16384
  Max image 2D height:               16384
  Max image 3D width:                2048
  Max image 3D height:               2048
  Max image 3D depth:                2048
  Max samplers within kernel:            16
  Max size of kernel argument:           1024
  Alignment (bits) of base address:      2048
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                     No
    Quiet NaNs:                  Yes
    Round to nearest even:           Yes
    Round to zero:               Yes
    Round to +ve and infinity:           Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                    Read/Write
  Cache line size:               64
  Cache size:                    16384
  Global memory size:                3912305728
  Constant buffer size:              65536
  Max number of constant args:           8
  Local memory type:                 Scratchpad
  Local memory size:                 32768
  Max pipe arguments:                16
  Max pipe active reservations:          16
  Max pipe packet size:              2733502080
  Max global variable size:          2460151808
  Max global variable preferred total size:  3912305728
  Max read/write image args:             64
  Max on device events:              1024
  Queue on device max size:          8388608
  Max on device queues:              1
  Queue on device preferred size:        262144
  SVM capabilities:              
    Coarse grain buffer:             Yes
    Fine grain buffer:               Yes
    Fine grain system:               No
    Atomics:                     No
  Preferred platform atomic alignment:       0
  Preferred global atomic alignment:         0
  Preferred local atomic alignment:      0
  Kernel Preferred work group size multiple:     64
  Error correction support:          0
  Unified memory for Host and Device:        0
  Profiling timer resolution:            1
  Device endianess:              Little
  Available:                     Yes
  Compiler available:                Yes
  Execution capabilities:                
    Execute OpenCL kernels:          Yes
    Execute native function:             No
  Queue on Host properties:              
    Out-of-Order:                No
    Profiling :                  Yes
  Queue on Device properties:                
    Out-of-Order:                Yes
    Profiling :                  Yes
  Platform ID:                   0x7f6c3f8b6a18
  Name:                      Fiji
  Vendor:                    Advanced Micro Devices, Inc.
  Device OpenCL C version:           OpenCL C 2.0 
  Driver version:                1912.5 (VM)
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 2.0 AMD-APP (1912.5)
  Extensions:                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_gl_depth_images cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes 


  Device Type:                   CL_DEVICE_TYPE_CPU
  Vendor ID:                     1002h
  Board name:                    
  Max compute units:                 12
  Max work items dimensions:             3
    Max work items[0]:               1024
    Max work items[1]:               1024
    Max work items[2]:               1024
  Max work group size:               1024
  Preferred vector width char:           16
  Preferred vector width short:          8
  Preferred vector width int:            4
  Preferred vector width long:           2
  Preferred vector width float:          8
  Preferred vector width double:         4
  Native vector width char:          16
  Native vector width short:             8
  Native vector width int:           4
  Native vector width long:          2
  Native vector width float:             8
  Native vector width double:            4
  Max clock frequency:               1200Mhz
  Address bits:                  64
  Max memory allocation:             8414708736
  Image support:                 Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      64
  Max image 2D width:                8192
  Max image 2D height:               8192
  Max image 3D width:                2048
  Max image 3D height:               2048
  Max image 3D depth:                2048
  Max samplers within kernel:            16
  Max size of kernel argument:           4096
  Alignment (bits) of base address:      1024
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                     Yes
    Quiet NaNs:                  Yes
    Round to nearest even:           Yes
    Round to zero:               Yes
    Round to +ve and infinity:           Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                    Read/Write
  Cache line size:               64
  Cache size:                    32768
  Global memory size:                33658834944
  Constant buffer size:              65536
  Max number of constant args:           8
  Local memory type:                 Global
  Local memory size:                 32768
  Max pipe arguments:                16
  Max pipe active reservations:          16
  Max pipe packet size:              4119741440
  Max global variable size:          1879048192
  Max global variable preferred total size:  1879048192
  Max read/write image args:             64
  Max on device events:              0
  Queue on device max size:          0
  Max on device queues:              0
  Queue on device preferred size:        0
  SVM capabilities:              
    Coarse grain buffer:             No
    Fine grain buffer:               No
    Fine grain system:               No
    Atomics:                     No
  Preferred platform atomic alignment:       0
  Preferred global atomic alignment:         0
  Preferred local atomic alignment:      0
  Kernel Preferred work group size multiple:     1
  Error correction support:          0
  Unified memory for Host and Device:        1
  Profiling timer resolution:            1
  Device endianess:              Little
  Available:                     Yes
  Compiler available:                Yes
  Execution capabilities:                
    Execute OpenCL kernels:          Yes
    Execute native function:             Yes
  Queue on Host properties:              
    Out-of-Order:                No
    Profiling :                  Yes
  Queue on Device properties:                
    Out-of-Order:                No
    Profiling :                  No
  Platform ID:                   0x7f6c3f8b6a18
  Name:                      Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
  Vendor:                    GenuineIntel
  Device OpenCL C version:           OpenCL C 1.2 
  Driver version:                1912.5 (sse2,avx)
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 1.2 AMD-APP (1912.5)
  Extensions:                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event

@AurelioG That's weird. I tried the same code on my machine without any issues.

> library(arrayfire)

Attaching package: ‘arrayfire’

The following objects are masked from ‘package:stats’:

    cov, median, sd, var

> afInfo()
ArrayFire v3.4.0 (OpenCL, 64-bit Linux, build 3dbb7a9)
[0] AMD     : Fiji, 3514 MB
-1- AMD     : Spectre, 2275 MB
-2- AMD     : AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G, 15052 MB
> a <- c(1:100)
> b <- array(a)
> c <- afArray(b)
> f <- afArray(b, type='single')
> aa <- f*f + c*c
> afPrint(aa)
No Name Array
[100 1 1 1]
         2 
         8 
        18 
        32 
        50 
        72 
        98 
       128 
       162 
       200 
       242 
       288 
       338 
       392 
       450 
       512 
       578 
       648 
       722 
       800 
       882 
       968 
      1058 
      1152 
      1250 
      1352 
      1458 
      1568 
      1682 
      1800 
      1922 
      2048 
      2178 
      2312 
      2450 
      2592 
      2738 
      2888 
      3042 
      3200 
      3362 
      3528 
      3698 
      3872 
      4050 
      4232 
      4418 
      4608 
      4802 
      5000 
      5202 
      5408 
      5618 
      5832 
      6050 
      6272 
      6498 
      6728 
      6962 
      7200 
      7442 
      7688 
      7938 
      8192 
      8450 
      8712 
      8978 
      9248 
      9522 
      9800 
     10082 
     10368 
     10658 
     10952 
     11250 
     11552 
     11858 
     12168 
     12482 
     12800 
     13122 
     13448 
     13778 
     14112 
     14450 
     14792 
     15138 
     15488 
     15842 
     16200 
     16562 
     16928 
     17298 
     17672 
     18050 
     18432 
     18818 
     19208 
     19602 
     20000 

The error you are seeing seems to be coming from the OpenCL driver.

Apparently this can happen if you have an older version of the AMD APP SDK installed alongside the newer version. Can you check if that is the case?
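
One way to look for a leftover SDK is to list the OpenCL ICD registrations. The sketch below assumes the standard Linux ICD loader layout, where each vendor driver registers a file under /etc/OpenCL/vendors; the function name list_opencl_icds is hypothetical and not part of any SDK. More than one AMD entry, or an entry pointing at an old install path, would be worth investigating.

```shell
#!/bin/sh
# List the OpenCL ICD registrations in a vendors directory.
# Each *.icd file is expected to hold the name or path of one vendor library.
# The default directory is an assumption about a standard Linux ICD loader setup.
list_opencl_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$icd" ] || continue
        printf '%s -> %s\n' "$icd" "$(head -n 1 "$icd")"
    done
}

list_opencl_icds
```

Passing a different directory as the first argument lets you inspect a non-default location.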

Here is the R output with a clean Ubuntu 14.04 install. My R snippet still didn't work, but the errors are different. The ArrayFire examples were OK, except for the graphics ones. This time I didn't install the OpenCL GL headers, Mesa, or other packages that could interfere with the AMD SDK.

So my question is: how do I set up a working OpenCL environment for arrayfire-r? I'm not interested (yet) in ArrayFire's graphics capabilities. Previously, I tried to disable GLFW when compiling ArrayFire, but I got errors complaining about GLFW.

Cheers,
Aurelio

afInfo()
GLX: Forward compatibility requested but GLX_ARB_create_context_profile is unavailableError: Could not Create GLFW Window!
ArrayFire v3.3.2 (OpenCL, 64-bit Linux, build f65dd97)
[0] AMD : Fiji, 3749 MB
-1- AMD : Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz, 32099 MB
a <- c(1:100)
b <- array(a)
c <- afArray(b)
f <- afArray(b, type='single')
aa <- f*f + c*c
afPrint(aa)
No Name Array
[100 1 1 1]
Error in afPrint(aa) : In function print
In file src/api/c/print.cpp:89

This is the log of installed dependencies.
The list follows the Catalyst and ArrayFire install dependencies. The only differences are the lines
sudo apt-get install libglfw3
sudo apt-get install libglfw-dev

ArrayFire suggests sudo apt-get install glfw3, but that always fails:

sudo apt-get uptdate
sudo apt-get update
sudo apt-get install gcc
sudo apt-get install dpkg
sudo apt-get build-dep lib32gcc1
sudo apt-get install dh-modaliases
sudo apt-get install execstack
sudo ./amd-driver-installer-15.302-x86.x86_64.run
sudo aticonfig --initial
apt-get install libfreeimage-dev libatlas3gf-base libfftw3-dev libglew-dev libglewmx-dev libglfw3-dev cmake
sudo apt-get install libfreeimage-dev libatlas3gf-base libfftw3-dev libglew-dev libglewmx-dev libglfw3-dev cmake
sudo apt-add-repository ppa:keithw/glfw3
sudo apt-get update
sudo apt-get install libglfw3
sudo apt-get install libglfw-dev
sudo apt-get install ocl-icd-libopencl1
sudo ./ArrayFire-v3.3.2_Linux_x86_64.sh --exclude-subdir --prefix=/usr/local
sudo dpkg -i MRO-3.2.4-Ubuntu-14.4.x86_64.deb
tar -xzf RevoMath-3.2.5.tar.gz
sudo ./RevoMath.sh
sudo dpkg -i rstudio-0.99.896-amd64.deb
sudo apt-get install libjpeg-dev libpng12-dev
sudo dpkg -i rstudio-0.99.896-amd64.deb

@AurelioG is that the end of the error message? If not, can you run your script after running export AF_PRINT_ERRORS=1?
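
An export in a terminal only affects programs started from that shell, so an RStudio session launched from the desktop may not see it. R reads NAME=value lines from ~/.Renviron at startup, so one option is to put the variable there. A minimal sketch, with a hypothetical helper name add_renviron_var; whether this changes what RStudio displays has not been confirmed in this thread.

```shell
#!/bin/sh
# Add NAME=VALUE to an Renviron file unless NAME is already set there.
# Arguments: path to the Renviron file, variable name, variable value.
add_renviron_var() {
    file="$1"
    name="$2"
    value="$3"
    touch "$file"
    if ! grep -q "^${name}=" "$file"; then
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}

# Hypothetical usage: make AF_PRINT_ERRORS visible to every R session for this user.
#   add_renviron_var "$HOME/.Renviron" AF_PRINT_ERRORS 1
```

The helper leaves an existing entry for the same name untouched, so running it twice does not duplicate the line. R needs to be restarted for the change to take effect.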

Yes, that was the end of the R printout in RStudio! In the bash terminal it worked OK.
How can I make it work in RStudio?

export AF_PRINT_ERRORS=1
R CMD BATCH a.r

The a.r.Rout:

R version 3.2.4 (2016-03-16) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Microsoft R Open 3.2.4
Default CRAN mirror snapshot taken on 2016-04-01
The enhanced R distribution from Microsoft
Visit http://go.microsoft.com/fwlink/?LinkID=722555 for information
about additional features.

Multithreaded BLAS/LAPACK libraries detected. Using 12 cores for math algorithms.

library("arrayfire", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.2")

Attaching package: ‘arrayfire’

The following objects are masked from ‘package:stats’:

cov, median, sd, var

afInfo()
GLX: Forward compatibility requested but GLX_ARB_create_context_profile is unavailableError: Could not Create GLFW Window!
ArrayFire v3.3.2 (OpenCL, 64-bit Linux, build f65dd97)
[0] AMD : Fiji, 3798 MB
-1- AMD : Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz, 32099 MB
a <- c(1:100)
b <- array(a)
c <- afArray(b)
f <- afArray(b, type='single')
aa <- f*f + c*c
afPrint(aa)
No Name Array
[100 1 1 1]
2
8
18
32
50
72
98
128
162
200
242
288
338
392
450
512
578
648
722
800
882
968
1058
1152
1250
1352
1458
1568
1682
1800
1922
2048
2178
2312
2450
2592
2738
2888
3042
3200
3362
3528
3698
3872
4050
4232
4418
4608
4802
5000
5202
5408
5618
5832
6050
6272
6498
6728
6962
7200
7442
7688
7938
8192
8450
8712
8978
9248
9522
9800
10082
10368
10658
10952
11250
11552
11858
12168
12482
12800
13122
13448
13778
14112
14450
14792
15138
15488
15842
16200
16562
16928
17298
17672
18050
18432
18818
19208
19602
20000

proc.time()
user system elapsed
0.896 0.128 1.072

@AurelioG let me try out RStudio and see what's going wrong. I have only ever used the command line.