jmschrei / apricot

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly. See the documentation page: https://apricot-select.readthedocs.io/en/latest/index.html

Bug: CustomSelection Initialize with Subset, func Attribute Update Needed

jlevy44 opened this issue

self.total_gain = self.func(self.initial_subset)

This should be self.function, not self.func.

Thanks for the amazing package!

I've temporarily patched this on my end by subclassing, and it seems to work fine:

from apricot import CustomSelection

class CustomSelection2(CustomSelection):
    def _initialize(self, X):
        # Deliberately skip CustomSelection._initialize (which references
        # self.func) and call the next class in the MRO instead.
        super(CustomSelection, self)._initialize(X)

        # Validate the initial subset: None, a 2D matrix of examples,
        # or a 1D mask/index into X.
        if self.initial_subset is None:
            pass
        elif self.initial_subset.ndim == 2:
            if self.initial_subset.shape[1] != X.shape[1]:
                raise ValueError("The number of columns in the initial subset must "
                    "match the number of columns in X.")
        elif self.initial_subset.ndim == 1:
            self.initial_subset = X[self.initial_subset]
        else:
            raise ValueError("The initial subset must be either a two dimensional"
                " matrix of examples or a one dimensional mask.")

        # The fix: evaluate the gain with self.function, not self.func.
        if self.initial_subset is None:
            self.total_gain = 0
        else:
            self.total_gain = self.function(self.initial_subset)
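
For anyone who wants to see the failure mode without installing anything, here is a minimal, dependency-free sketch of the same pattern. ToySelection, PatchedToySelection, and total are hypothetical names made up for illustration; this is not apricot's actual code, only an imitation of the attribute mismatch described above. It also suggests why the bug only shows up when an initial subset is passed: the mistyped attribute is reached only on that branch.

```python
class ToySelection:
    """Hypothetical stand-in for a selector; NOT apricot's actual class."""

    def __init__(self, function, initial_subset=None):
        self.function = function              # stored under the name `function`
        self.initial_subset = initial_subset

    def _initialize(self):
        if self.initial_subset is None:
            self.total_gain = 0               # this branch never touches the bad name
        else:
            # Bug pattern from the report: `func` was never set on the instance.
            self.total_gain = self.func(self.initial_subset)


class PatchedToySelection(ToySelection):
    """Same subclass-and-override workaround, applied to the toy class."""

    def _initialize(self):
        if self.initial_subset is None:
            self.total_gain = 0
        else:
            self.total_gain = self.function(self.initial_subset)


def total(rows):
    # Toy set function (an assumption for the demo): sum of all entries.
    return sum(sum(row) for row in rows)


subset = [[1.0, 2.0], [3.0, 4.0]]

# Without an initial subset, the buggy class initializes without error.
no_subset = ToySelection(total)
no_subset._initialize()

# With an initial subset, the missing attribute is finally looked up.
buggy = ToySelection(total, initial_subset=subset)
try:
    buggy._initialize()
    buggy_error = None
except AttributeError as exc:
    buggy_error = exc

patched = PatchedToySelection(total, initial_subset=subset)
patched._initialize()
```

In this toy version, the unpatched class raises AttributeError only when initial_subset is given, while the patched subclass computes the gain of the initial subset as intended.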

Happy to open a PR, but I figured it's a quick fix that is probably best done on your end. Thanks!

Thanks for the report. This should be updated in 0.6.1, which I just uploaded.