block_reduce type functions
rabernat opened this issue
I'm in need of a gufunc that does something like scikit-image's block_reduce function, but in n-dimensions.
As a simple example, I want something like this:
>>> x = np.ones(8)
>>> block_reduce(x, 2, how='sum')
array([2., 2., 2., 2.])
I would like to generalize this to ndims, have various options for reduction, and also possibly provide weights.
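For reference, the 1-D behavior in the example above can be sketched in pure NumPy with a reshape. This is just an illustration, not numbagg API; `block_reduce_1d` is a hypothetical name, and it assumes the block size evenly divides the axis length:

```python
import numpy as np

def block_reduce_1d(x, block, how="sum"):
    # Reshape into (n_blocks, block) and reduce over the trailing axis.
    # Assumes len(x) is an exact multiple of `block`.
    reducer = {"sum": np.sum, "mean": np.mean}[how]
    return reducer(x.reshape(-1, block), axis=-1)

block_reduce_1d(np.ones(8), 2)  # array([2., 2., 2., 2.])
```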
Maybe numbagg already does this? If not, is it in scope?
I think this would be in scope for numbagg.
scikit-image's block_reduce does work in n dimensions, e.g., block_reduce(np.ones((5, 2, 2)), (1, 2, 2), func=np.sum).shape == (5, 1, 1). But I agree that it has limitations, e.g., you have to pad the array if the shape is not exactly divisible by the block size.
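The n-dimensional case can also be sketched with a single reshape, interleaving a (n_blocks, block) pair of axes per dimension and reducing over the block axes. Again a hypothetical helper, not numbagg or scikit-image API, and it assumes each axis length is an exact multiple of its block size (which is exactly the padding limitation noted above):

```python
import numpy as np

def block_reduce_nd(x, block_size, how="sum"):
    # Split each axis of length n into (n // b, b), then reduce over
    # every per-block axis. Assumes n % b == 0 along every axis.
    reducer = {"sum": np.sum, "mean": np.mean}[how]
    shape = []
    for n, b in zip(x.shape, block_size):
        shape.extend([n // b, b])
    block_axes = tuple(range(1, 2 * x.ndim, 2))
    return reducer(x.reshape(shape), axis=block_axes)

block_reduce_nd(np.ones((5, 2, 2)), (1, 2, 2)).shape  # (5, 1, 1)
```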
A more flexible version might let you provide block IDs along each axis, sort of a multi-dimensional version of group_nanmean(). It might be a little tricky to squeeze into a gufunc, since you would need a variable number of label arguments depending on the number of axes, e.g.,
@guvectorize(
    [(float64[:, :], int64[:], int64[:], float64[:, :])],
    '(i,j),(i),(j)->(k,m)',
)
def block_nanmean_2d(values, labels_x, labels_y, out):
    ...
This function could probably be built dynamically for an arbitrary number of label dimensions, but it would be a bit of a pain.
Maybe there's something clever you could do to avoid this. E.g., if you're willing to allocate a full array of label indices for each point, you could make the signature something like (i,j),(i,j,2)->(k,m).
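As a point of comparison, the label-based 2-D reduction sketched above can be prototyped outside of a gufunc with np.bincount, by collapsing the two per-axis label arrays into one flat group id per element. The name and signature here are illustrative only, not numbagg API:

```python
import numpy as np

def block_nanmean_2d(values, labels_x, labels_y):
    # Map each element of `values` to a flat group id built from the
    # per-axis labels, then average the non-NaN values in each group.
    nx = int(labels_x.max()) + 1
    ny = int(labels_y.max()) + 1
    groups = (labels_x[:, None] * ny + labels_y[None, :]).ravel()
    v = values.ravel()
    ok = ~np.isnan(v)
    sums = np.bincount(groups[ok], weights=v[ok], minlength=nx * ny)
    counts = np.bincount(groups[ok], minlength=nx * ny)
    with np.errstate(invalid="ignore", divide="ignore"):
        out = sums / counts  # groups with no valid values become NaN
    return out.reshape(nx, ny)

# 4x4 of ones, reduced over 2x2 blocks via labels [0, 0, 1, 1] per axis:
block_nanmean_2d(
    np.ones((4, 4)),
    np.repeat(np.arange(2), 2),
    np.repeat(np.arange(2), 2),
)  # 2x2 array of ones
```

Something like this would generalize to irregular blocks for free, since the labels need not partition the axes evenly.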