error when providing initial_indices for sparse array data
chschroeder opened this issue · comments
Hi,
At first glance this library looks really nice (with regard to API, code and docs) and I really like it. Kudos for that!
The first steps were easy to follow using the examples.
However, when I switched from dense to sparse arrays I ran into some trouble:
Is FeatureBasedSelection in combination with the initial_subset argument intended to work on sparse arrays?
According to the documentation, scipy's csr_matrix should be supported, right?
(1) without initial_subset
selector = FeatureBasedSelection(n, concave_func='sqrt')
selector.fit(x)
(2) with initial_subset
selector = FeatureBasedSelection(n, concave_func='sqrt', initial_subset=initial_subset)
selector.fit(x)
Whenever x is an ndarray (dense), both variants work fine.
However, for a csr_matrix (sparse) only the former works, and for the latter I get the following error:
```
  File "<my_workspace>/my_script.py", line 86, in my_func
    selector.fit(x)
  File "<site-packges>/apricot/functions/featureBased.py", line 265, in fit
    return super(FeatureBasedSelection, self).fit(X, y=y,
  File "<site-packges>/apricot/functions/base.py", line 251, in fit
    optimizer.select(X, self.n_samples, sample_cost=sample_cost)
  File "<site-packges>/apricot/optimizers.py", line 491, in select
    optimizer1.select(X, self.n_first_selections, sample_cost=sample_cost)
  File "<site-packges>/apricot/optimizers.py", line 234, in select
    gains = self.function._calculate_gains(X) / sample_cost[self.function.idxs]
  File "<site-packges>/apricot/functions/featureBased.py", line 321, in _calculate_gains
    concave_func(X.data, X.indices, X.indptr, gains,
  File "<site-packges>/numba/core/dispatcher.py", line 608, in _explain_matching_error
    raise TypeError(msg)
TypeError: No matching definition for argument type(s) array(float64, 1d, C), array(int32, 1d, C), array(int32, 1d, C), array(float64, 1d, C), array(float64, 2d, C), array(float64, 2d, C), array(int64, 1d, C)
```
Howdy
Thanks for reporting this. It does look like a bug on my end. The selectors are supposed to work with both dense and sparse arrays, even when using an initial subset. I'll try to fix it in the next week or two. Sorry about that! If you need a fix sooner than that, you can go into the FeatureBasedSelection code and just hard-code the gain _select_next function that you want to use.
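For anyone who needs a stopgap in the meantime, here is a rough pure-Python sketch of what a hard-coded sqrt gain over CSR arrays might look like. It is not apricot's implementation: the helper name `sqrt_gains_csr` and its parameters are made up for illustration, and it assumes the feature-based objective is the sum over features of the square root of the selected rows' column sums, with non-negative feature values. The `data`, `indices` and `indptr` arguments follow the usual CSR layout that the traceback shows being passed along.

```python
import math


def sqrt_gains_csr(data, indices, indptr, n_features, initial_subset=()):
    """Marginal gain of each row under a sqrt feature-based objective.

    Hypothetical helper for illustration only (not part of apricot).
    Assumes all values in `data` are non-negative.
    """
    n_rows = len(indptr) - 1
    selected = set(initial_subset)

    # Column sums over the rows that are already in the initial subset.
    current = [0.0] * n_features
    for i in selected:
        for k in range(indptr[i], indptr[i + 1]):
            current[indices[k]] += data[k]

    gains = [0.0] * n_rows
    for i in range(n_rows):
        if i in selected:
            continue  # already chosen, so its gain is left at 0
        # Only the nonzero entries of a row can change a column sum.
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            gains[i] += math.sqrt(current[j] + data[k]) - math.sqrt(current[j])
    return gains


# CSR encoding of the 3 x 3 matrix [[4, 0, 0], [0, 9, 0], [4, 0, 16]].
data = [4.0, 9.0, 4.0, 16.0]
indices = [0, 1, 0, 2]
indptr = [0, 1, 2, 4]

gains_empty = sqrt_gains_csr(data, indices, indptr, 3)
gains_seeded = sqrt_gains_csr(data, indices, indptr, 3, initial_subset=[0])
```

With no initial subset, the last row has the largest gain. Once row 0 is seeded, the gain of the last row shrinks because column 0 is already partly covered, which is the diminishing-returns behaviour an initial subset is meant to capture.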
Thanks for the quick response! There is no hurry at all. I am happy to hear that there will be a fix.