AMReX-Astro / mini-Castro

A mini-app version of Castro

StarLord does not work with the IBM compiler on GPUs

maxpkatz opened this issue

It compiles, but completely fails at runtime. Need to investigate why.

One issue is that by default we request up to 512 threads per threadblock, but with the default maximum of 255 registers per thread, a kernel can need up to 512 × 255 = 130,560 registers per threadblock. That is about twice the 65,536-register per-block limit on most recent NVIDIA GPUs. For PGI we deal with this by explicitly limiting kernels to 128 registers per thread in the StarLord makefile (512 × 128 = 65,536, which just fits). Need to do the equivalent for IBM.
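To make the register budget concrete, here is a minimal sketch of the arithmetic. It assumes a limit of 65,536 32-bit registers per threadblock (typical of recent NVIDIA GPUs, but check the target device) and ignores the hardware's allocation granularity. The function names are illustrative only and are not part of StarLord.

```python
# Assumption: 65,536 32-bit registers available per threadblock.
# The real limit depends on the GPU's compute capability.
REGISTERS_PER_BLOCK = 65536


def registers_needed(threads_per_block, registers_per_thread):
    """Worst-case register demand of one threadblock."""
    return threads_per_block * registers_per_thread


def fits(threads_per_block, registers_per_thread, limit=REGISTERS_PER_BLOCK):
    """True if a block of this size can launch within the register limit."""
    return registers_needed(threads_per_block, registers_per_thread) <= limit


def max_registers_per_thread(threads_per_block, limit=REGISTERS_PER_BLOCK):
    """Largest per-thread register cap that still lets the block launch."""
    return limit // threads_per_block


if __name__ == "__main__":
    print(registers_needed(512, 255))     # worst case with the default cap
    print(fits(512, 255))                 # default cap of 255 registers per thread
    print(fits(512, 128))                 # cap of 128 registers per thread
    print(max_registers_per_thread(512))  # largest cap for 512 threads
```

Under these assumptions, 512 threads at 255 registers each do not fit, while a cap of 128 registers per thread is the largest value that does.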

According to IBM, this can be done with:

-Xptxas -maxrregcount=128