scikit-image / scikit-image

Image processing in Python

Home Page: https://scikit-image.org

skimage.util.apply_parallel doesn't work with skimage.transform.downscale_local_mean when applied to a dask array

rcremese opened this issue

Description:

What I'm experiencing:
I tried using dask arrays for the first time to process 3D volumes with scikit-image.
I started with the skimage.util.apply_parallel function provided by the API and got an error when calling the compute() method of the returned dask array.

The returned dask array's dimensions don't match what I expect, and the stack trace indicates that a type check hasn't been done.

What I would expect:

  • the array to be downscaled to shape (100, 100) after calling the apply_parallel function
  • the resulting array to be a downscaled version of the input array after calling compute()

How I temporarily worked around the problem:
By calling the da.map_blocks() function directly with the correct arguments:

downscaled = da.map_blocks(
    skimage.transform.downscale_local_mean,  # function applied to each chunk
    rand_array,
    factors=(10, 10),     # passed through as a keyword to downscale_local_mean
    chunks=(10, 10),      # output chunk shape: each (100, 100) chunk becomes (10, 10)
    dtype=rand_array.dtype,
)

Traceback of the provided example:

{
	"name": "TypeError",
	"message": "'<' not supported between instances of 'str' and 'int'",
	"stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[38], line 1
----> 1 downscaled.compute()

File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\dask\\base.py:379, in DaskMethodsMixin.compute(self, **kwargs)
    355 def compute(self, **kwargs):
    356     \"\"\"Compute this dask collection
    357 
    358     This turns a lazy Dask collection into its in-memory equivalent.
   (...)
    377     dask.compute
    378     \"\"\"
--> 379     (result,) = compute(self, traverse=False, **kwargs)
    380     return result

File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\dask\\base.py:665, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    662     postcomputes.append(x.__dask_postcompute__())
    664 with shorten_traceback():
--> 665     results = schedule(dsk, keys, **kwargs)
    667 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\util\\apply_parallel.py:191, in apply_parallel.<locals>.wrapped_func(arr)
    190 def wrapped_func(arr):
--> 191     return function(arr, *extra_arguments, **extra_keywords)

File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\transform\\_warps.py:453, in downscale_local_mean(image, factors, cval, clip)
    409 def downscale_local_mean(image, factors, cval=0, clip=True):
    410     \"\"\"Down-sample N-dimensional image by local averaging.
    411 
    412     The image is padded with `cval` if it is not perfectly divisible by the
   (...)
    451 
    452     \"\"\"
--> 453     return block_reduce(image, factors, np.mean, cval)

File d:\\anaconda3\\envs\\neural-tracing-env\\lib\\site-packages\\skimage\\measure\\block.py:74, in block_reduce(image, block_size, func, cval, func_kwargs)
     72 pad_width = []
     73 for i in range(len(block_size)):
---> 74     if block_size[i] < 1:
     75         raise ValueError(\"Down-sampling factors must be >= 1. Use \"
     76                          \"`skimage.transform.resize` to up-sample an \"
     77                          \"image.\")
     78     if image.shape[i] % block_size[i] != 0:

TypeError: '<' not supported between instances of 'str' and 'int'"
}

Way to reproduce:

import skimage
import dask.array as da

rand_array = da.random.random((1000, 1000), chunks=(100, 100))
downscaled = skimage.util.apply_parallel(
    skimage.transform.downscale_local_mean,
    rand_array,
    extra_arguments={"factors": (10, 10)},
    dtype=rand_array.dtype,
)
downscaled.compute()

Version information:

3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:29:04) [MSC v.1929 64 bit (AMD64)]
Windows-10-10.0.22631-SP0
scikit-image version: 0.22.0
numpy version: 1.26.3
dask version: 2024.1.0

Ah! The error is that extra_arguments is used for positional arguments; you should be using extra_keywords for keyword arguments. Because unpacking a dict yields its keys, passing extra_arguments={"factors": (10, 10)} forwards the string "factors" as the factors argument, which is what produces the "'<' not supported between instances of 'str' and 'int'" error inside block_reduce. Using this call:

downscaled = skimage.util.apply_parallel(
    skimage.transform.downscale_local_mean,
    rand_array,
    extra_keywords={"factors": (10, 10)},  # <-- here
    dtype=rand_array.dtype,
)

fixes the problem.

Potentially, there is a discussion to be had about validating the type of extra_arguments. But maybe that should be the job of the type checker, not a runtime check...
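For reference, here is a minimal sketch of what such a runtime check could look like. This is a hypothetical helper (not the actual scikit-image implementation); it only assumes the existing extra_arguments / extra_keywords parameters of apply_parallel:

# Hypothetical validation helper -- an illustration, not scikit-image code.
def _check_extra_args(extra_arguments, extra_keywords):
    if extra_arguments is not None and not isinstance(extra_arguments, (tuple, list)):
        raise TypeError(
            "`extra_arguments` must be a tuple or list of positional arguments; "
            "did you mean to pass a dict via `extra_keywords`?"
        )
    if extra_keywords is not None and not isinstance(extra_keywords, dict):
        raise TypeError("`extra_keywords` must be a dict of keyword arguments.")

A check along these lines would turn the opaque comparison error above into a message that points directly at the misuse.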

As a side note, for this particular function, dask array has a built-in function called dask.array.coarsen.
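For example, a minimal sketch of the same downscaling with da.coarsen, using numpy.mean as the reduction (this assumes the array shape divides evenly by the block size, as in the example above):

import numpy as np
import dask.array as da

rand_array = da.random.random((1000, 1000), chunks=(100, 100))
# Average non-overlapping 10x10 blocks along each axis; equivalent to
# downscale_local_mean with factors=(10, 10) when the shape divides evenly.
downscaled = da.coarsen(np.mean, rand_array, {0: 10, 1: 10})
result = downscaled.compute()  # shape (100, 100)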

Agreed. This is potentially something we could address with a nicer error or warning message, but it doesn't seem like a bug. 👍

@rcremese, if you are interested in adding a nicer error message for cases like this, please feel very welcome to make a PR and request a review from me.

I'll close this as resolved for now, but don't hesitate to re-open or add something. Thanks for reaching out.