skimage.util.apply_parallel doesn't work with skimage.transform.downscale_local_mean when applied on a dask array
rcremese opened this issue · comments
Description:
What I'm experiencing:
I tried to use dask arrays for the first time to process 3D volumes with scikit-image. I started with the skimage.util.apply_parallel function provided by the API and got an error when calling the compute() method of the returned dask array. The dask array dimensions were mismatched, and the traceback indicates that no type check was done on the arguments.
What I would expect:
- the array to be downscaled to shape (100, 100) after calling apply_parallel
- the resulting array to be a downscaled version of the input array after calling compute()
How I solved the problem temporarily:
By calling da.map_blocks() directly with the correct arguments:
downscaled = da.map_blocks(
skimage.transform.downscale_local_mean,
rand_array,
factors=(10, 10),
    chunks=(10, 10),
dtype=rand_array.dtype,
)
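For completeness, here is a self-contained, runnable version of that workaround, using the same 1000×1000 random array as in the reproduction below:

```python
import numpy as np
import dask.array as da
import skimage.transform

# Same setup as in the reproduction: a 1000x1000 array in 100x100 chunks.
rand_array = da.random.random((1000, 1000), chunks=(100, 100))

# Each 100x100 chunk is reduced by factors (10, 10), so the output
# chunks must be declared as (10, 10) for dask to build the right graph.
downscaled = da.map_blocks(
    skimage.transform.downscale_local_mean,
    rand_array,
    factors=(10, 10),
    chunks=(10, 10),
    dtype=rand_array.dtype,
)
result = downscaled.compute()
print(result.shape)  # (100, 100)
```

Because the chunk sizes divide evenly by the factors, applying downscale_local_mean per chunk gives the same result as applying it to the whole array.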
Traceback of the provided example:
{
"name": "TypeError",
"message": "'<' not supported between instances of 'str' and 'int'",
"stack": "---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[38], line 1
----> 1 downscaled.compute()
File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\dask\\base.py:379, in DaskMethodsMixin.compute(self, **kwargs)
355 def compute(self, **kwargs):
356 \"\"\"Compute this dask collection
357
358 This turns a lazy Dask collection into its in-memory equivalent.
(...)
377 dask.compute
378 \"\"\"
--> 379 (result,) = compute(self, traverse=False, **kwargs)
380 return result
File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\dask\\base.py:665, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
662 postcomputes.append(x.__dask_postcompute__())
664 with shorten_traceback():
--> 665 results = schedule(dsk, keys, **kwargs)
667 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\util\\apply_parallel.py:191, in apply_parallel.<locals>.wrapped_func(arr)
190 def wrapped_func(arr):
--> 191 return function(arr, *extra_arguments, **extra_keywords)
File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\transform\\_warps.py:453, in downscale_local_mean(image, factors, cval, clip)
409 def downscale_local_mean(image, factors, cval=0, clip=True):
410 \"\"\"Down-sample N-dimensional image by local averaging.
411
412 The image is padded with `cval` if it is not perfectly divisible by the
(...)
451
452 \"\"\"
--> 453 return block_reduce(image, factors, np.mean, cval)
File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\measure\\block.py:74, in block_reduce(image, block_size, func, cval, func_kwargs)
72 pad_width = []
73 for i in range(len(block_size)):
---> 74 if block_size[i] < 1:
75 raise ValueError(\"Down-sampling factors must be >= 1. Use \"
76 \"`skimage.transform.resize` to up-sample an \"
77 \"image.\")
78 if image.shape[i] % block_size[i] != 0:
TypeError: '<' not supported between instances of 'str' and 'int'"
}
Way to reproduce:
import skimage
import dask.array as da
rand_array = da.random.random((1000, 1000), chunks=(100, 100))
downscaled = skimage.util.apply_parallel(
    skimage.transform.downscale_local_mean, rand_array,
    extra_arguments={"factors": (10, 10)}, dtype=rand_array.dtype,
)
downscaled.compute()
Version information:
3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:29:04) [MSC v.1929 64 bit (AMD64)]
Windows-10-10.0.22631-SP0
scikit-image version: 0.22.0
numpy version: 1.26.3
dask version: 2024.1.0
Ah! The error is that extra_arguments is used for positional arguments. You should be using extra_keywords for keyword arguments. Using this call:
downscaled = skimage.util.apply_parallel(
skimage.transform.downscale_local_mean,
rand_array,
extra_keywords={"factors": (10, 10)}, # <-- here
dtype=rand_array.dtype,
)
fixes the problem.
Potentially, there is a discussion to be had about validating the type of extra_arguments. But maybe that should be the job of the type checker, not a runtime check...
As a side note, for this particular function, dask has a built-in equivalent: dask.array.coarsen.
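For instance, a minimal sketch of the same downscaling with da.coarsen, which takes a reduction function and a dict mapping each axis to its downsampling factor:

```python
import numpy as np
import dask.array as da

x = da.random.random((1000, 1000), chunks=(100, 100))

# Reduce each non-overlapping 10x10 window to its mean -- the same
# operation downscale_local_mean performs on an evenly divisible array.
coarse = da.coarsen(np.mean, x, {0: 10, 1: 10})
out = coarse.compute()
print(out.shape)  # (100, 100)
```

Note that da.coarsen requires the chunk sizes to be divisible by the coarsening factors (100 by 10 here), which is also the condition under which the per-chunk map_blocks workaround matches the global result.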
Agreed. This is potentially something we could address with a nicer error or warning message, but it doesn't seem like a bug. 👍
@rcremese, if you are interested in adding a nicer error message for cases like this, please feel very welcome to make a PR and request a review from me.
I'll close this as resolved for now, but don't hesitate to re-open or add something. Thanks for reaching out.