CFD-GO / TCLB

TCLB - Templated MPI+CUDA/CPU Lattice Boltzmann code

Home Page: https://tclb.io

Geometry `nx="250"` is misleading as it is rounded up to N*32=256.

ggruszczynski opened this issue

I found that nx="250" is misleading as it is rounded up to N*32=256.

My configuration:

./configure --enable-cpp11 --enable-rinside --with-cuda-arch=sm_60

I was running with GPU SERIAL CUDA 11.1

I was running with:

mpirun -np 1 CLB/d3q27q27_cm_cht/main some.weird.example.xml

The xml file looks like this:

<?xml version="1.0"?>
<CLBConfig version="2.0" output="output/batch_GaussianHill_same_SI_time_2100-35000000/BGK_ux_0.10e+00_k_1o100_iterations_35000_sigma_100_size_250lu/" permissive="true">
	<Geometry nx="250" ny="250" nz="3">
		<BGK>
			<Box />
		</BGK>
	</Geometry>
	<Model>
		<Param name="nu" value="0.1666666" />
		<Param name="h_stability_enhancement" value="1." />
		<Param name="conductivity" value="0.010000" />
		<Param name="Sigma_GH" value="100.000000" />
		<Param name="cp" value="1" />
		<Param name="material_density" value="1" />
		<Param name="InitTemperature" value="1.0" />
		<Param name="CylinderCenterX_GH" value="125.000000" />
		<Param name="CylinderCenterY_GH" value="125.000000" />
		<Param name="VelocityX" value="0.100000" />
	</Model>
	<Failcheck Iterations="1000" nx="250" ny="250" nz="3" />
	<Solve Iterations="35000">
		<VTK Iterations="17500" what="T,U" />
	</Solve>
	<Run model="d3q27q27_cm_cht_fields">
		<Code version="v6.5.0-289-ge24d840" precision="double" cross="GPU" />
	</Run>
</CLBConfig>

I get this output:

<Geometry nx="250" ny="250" nz="3" >


[  ] Mesh size in config file: 250x250x3
[  ] Global lattice size: 250x250x3
[  ] Max region size: 192000. Mesh size 192000. Overhead:  0%
[  ] Local lattice size: 256x250x3
[  ] Running graphics at 256x250

Due to the CUDA architecture, the periodic case runs on 256 cells in the X direction instead of 250.
The remaining 6 cells should be of ghost type.
This is quite annoying when calibrating a case: for instance, a full roll takes 256 / velocity iterations instead of 250 / velocity. A similar issue may arise when calibrating gravity or pressure.
The ParaView files are also 256x250x3.
Obviously, the case still runs, but it takes a while to discover the issue :/
To sum up, it is easy to get tricked by nx="250"...
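
To make the arithmetic above concrete, here is a minimal Python sketch. It is not TCLB code: `padded_nx` and `roll_iterations` are hypothetical helper names, and the round-up rule (next multiple of 32 in X) is inferred from the log line `Local lattice size: 256x250x3`.

```python
def padded_nx(nx, multiple=32):
    """Round nx up to the next multiple of `multiple`.

    Assumption: this mirrors the X-direction padding seen in the log
    (250 -> 256); it is not taken from the TCLB sources.
    """
    return ((nx + multiple - 1) // multiple) * multiple


def roll_iterations(nx, velocity):
    """Iterations for a feature advected at `velocity` (lattice units per
    iteration) to travel once across a periodic domain of nx cells."""
    return nx / velocity


nx, ux = 250, 0.1  # values from the XML config above
print("requested nx:", nx, "-> effective nx:", padded_nx(nx))
print("expected roll:", roll_iterations(nx, ux), "iterations")
print("actual roll:  ", roll_iterations(padded_nx(nx), ux), "iterations")
```

With `nx="250"` and `VelocityX` of 0.1, the expected period is about 2500 iterations, while the padded domain gives about 2560, a difference of roughly 2.4%.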

The x direction has to be divisible by 32 (not 16). That's due to the distribution of threads in CUDA. It has been like this since the beginning of TCLB, and I don't see an easy way to change it now. @ggruszczynski, if you have a good idea where to put this in the documentation or as a warning, please suggest it.

I know it has been this way since the beginning of TCLB.
Can we introduce ghost-type cells that do not influence the final result, or simply block attempts to run simulations with nx other than N*32?
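
Until something like that exists in the solver, the "block the run" option could be approximated outside TCLB with a small pre-flight script. The sketch below is only an illustration: `check_geometry_nx` is a hypothetical helper, not part of TCLB, and it assumes that `nx` in `<Geometry>` is a plain integer (no units or expressions) and that `<Geometry>` is a direct child of the root element, as in the config above.

```python
import xml.etree.ElementTree as ET

X_MULTIPLE = 32  # per this thread: nx must be divisible by 32


def check_geometry_nx(xml_text, multiple=X_MULTIPLE):
    """Return (nx, padded_nx, ok) for the <Geometry> element of a config.

    Hypothetical helper: assumes nx is a plain integer attribute.
    """
    root = ET.fromstring(xml_text)
    geometry = root.find("Geometry")
    if geometry is None:
        raise ValueError("no <Geometry> element found")
    nx = int(geometry.get("nx"))
    padded = ((nx + multiple - 1) // multiple) * multiple
    return nx, padded, nx == padded


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as f:
        nx, padded, ok = check_geometry_nx(f.read())
    if not ok:
        # Refuse to continue so the mismatch is noticed before the run.
        sys.exit(f"nx={nx} is not a multiple of {X_MULTIPLE}; "
                 f"the lattice would be padded to {padded}")
    print(f"nx={nx} is fine")
```

Run before `mpirun`, for example as `python check_nx.py some.weird.example.xml` (the script name is arbitrary), it exits with a non-zero status when `nx` would be padded.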