cknowledge.org/ai: Crowdsourcing benchmarking and optimisation of AI
NB: The Caffe experimental results are released with approval from General Motors.
The Jupyter notebook (view on github.com; view on nbviewer.jupyter.org) in this Collective Knowledge repository analyses the performance (execution time, memory consumption):
- on dividiti's velociti Hewlett-Packard Z640 Workstation (G1X62EA):
  - Intel(R) Xeon(R) CPU E5-2650 v3:
    - 10 cores, 20 threads;
    - Base clock 2300 MHz, turbo clock 3000 MHz;
    - Max power consumption 105 Watt;
    - Max memory bandwidth 68 GB/s;
    - RAM memory 32 GB DDR4;
  - Operating system (Ubuntu 16.04.1 LTS):
$ uname -a
Linux velociti 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
  - NVIDIA GeForce GTX 1080 "Founders Edition":
    - Pascal architecture;
    - 2560 CUDA cores;
    - Base clock 1607 MHz, boost clock 1733 MHz;
    - Max power consumption 180 Watt;
    - RAM memory 8 GB GDDR5X;
    - Max memory bandwidth 320 GB/s;
    - GPU Driver 367.57 [10/Oct/2016];
    - CUDA Toolkit 8.0.44 [xx/Sep/2016].
- using 14 Caffe libraries:
  - [tag] Branch (revision hash, date): math libraries.
  - [cpu] Master (4ba654f, 5/Oct/2016): with OpenBLAS 0.2.19;
  - [cuda] Master (4ba654f, 5/Oct/2016): with cuBLAS (part of CUDA Toolkit 8.0.44);
  - [cudnn] Master (4ba654f, 5/Oct/2016): with cuDNN 5.1;
  - [nvidia-cuda] NVIDIA v0.15 (1024d34, 17/Nov/2016): with cuBLAS (part of CUDA Toolkit 8.0.44);
  - [nvidia-cudnn] NVIDIA v0.15 (1024d34, 17/Nov/2016): with cuDNN 5.1;
  - [nvidia-fp16-cuda] NVIDIA experimental/fp16 (fca1cf4, 11/Jul/2016): with cuBLAS (part of CUDA Toolkit 8.0.44);
  - [nvidia-fp16-cudnn] NVIDIA experimental/fp16 (fca1cf4, 11/Jul/2016): with cuDNN 5.1;
  - [clblas] OpenCL (9abafdc, 7/Oct/2016): with ViennaCL 1.7.1 and clBLAS 2.10;
  - [clblast] OpenCL (9abafdc, 7/Oct/2016): with ViennaCL 1.7.1 and CLBlast 0.9.0;
  - [viennacl] OpenCL (9abafdc, 7/Oct/2016): with ViennaCL 1.7.1 only;
  - [libdnn-cuda] OpenCL (cfaaae1, 25/Oct/2016): with libDNN and cuBLAS;
  - [libdnn-clblas] OpenCL (cfaaae1, 25/Oct/2016): with libDNN, ViennaCL 1.7.1 and clBLAS 2.10;
  - [libdnn-clblast] OpenCL (cfaaae1, 25/Oct/2016): with libDNN, ViennaCL 1.7.1 and CLBlast 0.9.0;
  - [libdnn-viennacl] OpenCL (cfaaae1, 25/Oct/2016): with libDNN and ViennaCL 1.7.1.
- using 4 CNN models:
- with the batch size varying from 2 to 16 in steps of 2.
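
The experiment space described above (library tags, models, batch sizes) can be sketched as a simple grid. This is only an illustrative sketch, not the code used in the notebook: the names `LIBRARY_TAGS`, `BATCH_SIZES` and `experiment_grid` are hypothetical, and the model names are placeholders because the four CNN models are not named in this section.

```python
from itertools import product

# Library tags exactly as listed in this README (14 in total).
LIBRARY_TAGS = [
    "cpu", "cuda", "cudnn",
    "nvidia-cuda", "nvidia-cudnn",
    "nvidia-fp16-cuda", "nvidia-fp16-cudnn",
    "clblas", "clblast", "viennacl",
    "libdnn-cuda", "libdnn-clblas", "libdnn-clblast", "libdnn-viennacl",
]

# Batch sizes from 2 to 16 inclusive, in steps of 2.
BATCH_SIZES = list(range(2, 17, 2))


def experiment_grid(models):
    """Return every (library tag, model, batch size) combination."""
    return list(product(LIBRARY_TAGS, models, BATCH_SIZES))


if __name__ == "__main__":
    # Placeholder names (assumption): substitute the four CNN models actually used.
    placeholder_models = ["model-1", "model-2", "model-3", "model-4"]
    grid = experiment_grid(placeholder_models)
    print(f"{len(LIBRARY_TAGS)} libraries x {len(placeholder_models)} models "
          f"x {len(BATCH_SIZES)} batch sizes = {len(grid)} configurations")
```

With 14 libraries, 4 models and 8 batch sizes, the full grid has 448 configurations, assuming every library is run with every model and batch size.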