ComputeKernel and Compute + GVDB_VOX performs redundant (+ inaccurate?) computations
icoderaven opened this issue · comments
Hi!
So I was investigating the UpdateApron method and while dealing with it I realised that the GVDB_VOX macro used in kernels that run over atlas space bricks, and the call to said kernels doesn't make a lot of sense.
GVDB_VOX defines the voxel id as
uint3 vox = blockIdx * make_uint3(blockDim.x, blockDim.y, blockDim.z) + threadIdx + make_uint3(1,1,1);
Clearly, the inherent assumption being made here (and further verified by looking at the macros defined just above such as GVDB_COPY_SMEM_UC) is that the apron is 1 unit wide, and the blocks are 8x8x8 threads. However, the ComputeKernel calls are over a grid of all atlas bricks including apron width (getAtlasRes will return N*(8+2)) - so assuming that the leaf nodes are 8x8x8 voxels then the grid runs over 10*N/8 + 1 in each dimension, where N is the actual number of atlas bricks.
Instead I propose modifying the calls to the Compute kernels to run the grids over the N bricks themselves, and modifying the GVDB_VOX macro such that it only operates on the 8x8x8 actual data voxels of the brick and leave the apron voxels alone. They should be updated by the UpdateApron method calls anyway.
Of course the macro doesn't work for a different GVDB level -1 configuration, or for apron size other than 1, but it doesn't seem like the other macros will either.
Example in this commit
icoderaven@2adf6c4
Hi icoderaven,
I've put a bit of work up on modifying these kernels and callers to skip over apron voxels on the new packed-compute branch, if you want to take an early look at it and see if it's compatible with your workflow! I suspect there are probably a few bugs in it at the moment, though.
This adds a new parameter to Compute
and ComputeKernel
, which is an API-breaking change. The reason for this is so that developers writing kernels passed to ComputeKernel
know that they may need to switch to using the smaller grid size, or to specify that they're using the larger grid size (iterating over the entire atlas). It should be compatible with any tree whose bricks have dimensions divisible by 8, and is intended to work with different apron sizes, should apron sizes other than 1 be enabled in the future.
Hope this helps!
Hi icoderaven,
I just fixed the bugs I found in the packed-compute branch and verified that the samples work - if it looks good with you, I'll go ahead and merge it into master. Note that for the filling and resampling kernels, we preserve their previous behavior (of also modifying atlas voxels), so that people will have to change less code downstream (I hope).
Thanks!
This looks great! Apologies, was working on a deadline until today. I'll have a proper look at this over the weekend, but a quick scan seems to look good! Thanks!
Merged in 4ea8fc3!