Geometry `nx="250"` is misleading as it is rounded up to N*32=256.
ggruszczynski opened this issue · comments
I found that nx="250" is misleading, as it is rounded up to N*32=256.
My configuration:
./configure --enable-cpp11 --enable-rinside --with-cuda-arch=sm_60
I was running with GPU SERIAL CUDA 11.1
I was running with:
mpirun -np 1 CLB/d3q27q27_cm_cht/main some.weird.example.xml
The xml file looks like this:
<?xml version="1.0"?>
<CLBConfig version="2.0" output="output/batch_GaussianHill_same_SI_time_2100-35000000/BGK_ux_0.10e+00_k_1o100_iterations_35000_sigma_100_size_250lu/" permissive="true">
    <Geometry nx="250" ny="250" nz="3">
        <BGK>
            <Box />
        </BGK>
    </Geometry>
    <Model>
        <Param name="nu" value="0.1666666" />
        <Param name="h_stability_enhancement" value="1." />
        <Param name="conductivity" value="0.010000" />
        <Param name="Sigma_GH" value="100.000000" />
        <Param name="cp" value="1" />
        <Param name="material_density" value="1" />
        <Param name="InitTemperature" value="1.0" />
        <Param name="CylinderCenterX_GH" value="125.000000" />
        <Param name="CylinderCenterY_GH" value="125.000000" />
        <Param name="VelocityX" value="0.100000" />
    </Model>
    <Failcheck Iterations="1000" nx="250" ny="250" nz="3" />
    <Solve Iterations="35000">
        <VTK Iterations="17500" what="T,U" />
    </Solve>
    <Run model="d3q27q27_cm_cht_fields">
        <Code version="v6.5.0-289-ge24d840" precision="double" cross="GPU" />
    </Run>
</CLBConfig>
I get this output:
<Geometry nx="250" ny="250" nz="3" >
[ ] Mesh size in config file: 250x250x3
[ ] Global lattice size: 250x250x3
[ ] Max region size: 192000. Mesh size 192000. Overhead: 0%
[ ] Local lattice size: 256x250x3
[ ] Running graphics at 256x250
Due to the CUDA architecture, the periodic case runs on 256 cells in the X direction instead of 250.
The remaining 6 cells should be of ghost type.
This is quite annoying when calibrating a case: for instance, a full roll takes 256 / velocity iterations instead of 250 / velocity. A similar issue may arise when calibrating gravity or pressure.
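To make the size of the discrepancy concrete, here is a minimal sketch of the arithmetic in Python. It is not TCLB code: the function names `padded_nx` and `roll_iterations` are hypothetical, and the multiple of 32 is taken from the log output above. The velocity 0.1 is the `VelocityX` value from the config.

```python
ALIGNMENT = 32  # X size is padded to a multiple of 32, as observed in the log above


def padded_nx(nx: int, alignment: int = ALIGNMENT) -> int:
    """Round nx up to the next multiple of `alignment` (integer arithmetic only)."""
    return ((nx + alignment - 1) // alignment) * alignment


def roll_iterations(nx: int, velocity: float) -> float:
    """Iterations needed to advect once across a periodic domain of nx cells."""
    return nx / velocity


requested_nx = 250
velocity = 0.1  # VelocityX from the config, in lattice units

actual_nx = padded_nx(requested_nx)
expected = roll_iterations(requested_nx, velocity)
observed = roll_iterations(actual_nx, velocity)

print(f"requested nx = {requested_nx}, effective nx = {actual_nx}")
print(f"full roll: expected ~{expected:.0f} iterations, observed ~{observed:.0f}")
```

For this case the period differs by about 60 iterations per roll (roughly 2.4%), which accumulates over the 35000 iterations of the run.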
The paraview files are also 256x250x3.
Obviously, one can still run the case, but it takes a while to discover the issue :/
To sum up, it is easy to get tricked by nx="250"...
The x direction has to be divisible by 32 (not 16). That's due to the distribution of threads in CUDA. It has been like this since the beginning of TCLB, and I don't see an easy way to change it now. @ggruszczynski, if you have a good idea where to put this in the documentation or as a warning, please suggest it.
I know it's from the beginning of TCLB.
Can we introduce ghost-type cells that do not influence the final result, or simply block attempts to run simulations with nx other than N*32?
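Until something like that exists in TCLB itself, the second option (refusing sizes that are not a multiple of 32) could be approximated with a pre-flight script run before `mpirun`. The sketch below is only an illustration: `check_geometry_nx` is a hypothetical helper, not part of TCLB, and it assumes that `nx` on `<Geometry>` is a plain integer without units.

```python
import xml.etree.ElementTree as ET
from typing import List

ALIGNMENT = 32  # required multiple for the X direction, per this thread


def check_geometry_nx(config_xml: str, alignment: int = ALIGNMENT) -> List[str]:
    """Return one message per <Geometry> element whose nx is not a multiple of `alignment`."""
    root = ET.fromstring(config_xml)
    problems = []
    for geometry in root.iter("Geometry"):
        raw = geometry.get("nx")
        if raw is None:
            continue  # no nx attribute, nothing to check
        nx = int(raw)  # assumption: plain integer, no unit suffix
        if nx % alignment != 0:
            lower = (nx // alignment) * alignment
            upper = lower + alignment
            problems.append(
                f"Geometry nx={nx} is not a multiple of {alignment}; "
                f"nearest multiples are {lower} and {upper}"
            )
    return problems


# Trimmed-down version of the config from this issue
example = """
<CLBConfig version="2.0">
    <Geometry nx="250" ny="250" nz="3">
        <BGK><Box /></BGK>
    </Geometry>
</CLBConfig>
"""

for message in check_geometry_nx(example):
    print("WARNING:", message)
```

The same comparison could equally be emitted as a warning by the solver when the local lattice size differs from the size in the config file, which is the information already visible in the log.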