NVHPC Compiles for unnecessary GPU hardware with default options
wilfonba opened this issue · comments
Ben Wilfong commented
Describe the bug
Without specifying --gpu=ccXY, NVHPC compiles code for sm_35, sm_50, sm_60, sm_61, sm_70, sm_75, and sm_80. Some of these correspond to GPUs like the Tesla K20, which was released over 10 years ago and which I'm sure no one is ever going to run MFC on. An alternative compiler option is --gpu=ccnative, which automatically detects what hardware is available. Barring any decrease in performance, either of these approaches could make the code compile several times faster for GPUs.
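As a rough illustration of the explicit-target approach, the sketch below turns compute capabilities such as 8.0 into a --gpu=ccXY option. The helper name gpu_flag is hypothetical and is not part of MFC's build system. The capability strings are assumed to come from something like `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` on a reasonably recent driver.

```python
def gpu_flag(compute_caps):
    """Build an NVHPC --gpu option from capability strings like "8.0".

    Hypothetical helper for illustration only; the input format is an
    assumption (one "major.minor" string per visible GPU).
    """
    targets = []
    for cap in compute_caps:
        major, minor = cap.strip().split(".")
        target = f"cc{int(major)}{int(minor)}"
        # Keep each target once, preserving first-seen order.
        if target not in targets:
            targets.append(target)
    if not targets:
        raise ValueError("no compute capabilities given")
    # Multiple targets are passed as a comma-separated list.
    return "--gpu=" + ",".join(targets)
```

For a node with only A100s (compute capability 8.0), this would produce --gpu=cc80, so the compiler generates code for that one target rather than the full default set.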
To do:
- See if --gpu=ccnative works on the computers commonly used for MFC
- See if the test suite passes with --gpu=ccnative or --gpu=ccXY
- See if performance is affected with --gpu=ccnative or --gpu=ccXY
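One possible way to check the compile-time claim is to time the same compile command under each option. The sketch below is only an assumption about how that comparison might be scripted. The function names are hypothetical, and the compile command in the trailing comment (nvfortran with -acc on a placeholder example.f90) stands in for whatever MFC's build actually runs.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare_gpu_flags(base_cmd, flags):
    """Time base_cmd once per flag; None means 'use the compiler default'."""
    results = {}
    for flag in flags:
        cmd = list(base_cmd) + ([flag] if flag else [])
        results[flag or "default"] = time_command(cmd)
    return results


# Hypothetical usage (file name and target are placeholders):
# compare_gpu_flags(["nvfortran", "-acc", "-c", "example.f90"],
#                   [None, "--gpu=ccnative", "--gpu=cc80"])
```

A single run per option is noisy, so repeating each measurement a few times would give a fairer comparison. Runtime performance of the resulting binaries would need a separate check.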
Credit: @henryleberre.