gittar / bkmeans

The breathing k-means algorithm (just one source file containing the algorithm as found on pypi)

sample_weight for BKMeans?

bjwie opened this issue

commented

hey,

is there a chance to get the parameter "sample_weight" working with BKMeans?
I couldn't get it to work with
kmeans = BKMeans(n_clusters=nr_clusters).fit_predict(X=X.reshape(-1, 1), sample_weight=weight)

TypeError: fit() got an unexpected keyword argument 'sample_weight'

thanks and cheers
björn

commented

I saw that you disabled sample_weight.
assert sample_weight is None, "sample_weight not supported"

But I found a workaround, though it may not work as it should.

        # use a separate name so the BKMeans class is not overwritten
        bkm = BKMeans(n_clusters=nr_clusters).fit(X=X.reshape(-1, 1))
        labels = bkm.predict(X=X.reshape(-1, 1), sample_weight=weight)

What do you think, Bernd?

Thanks for your question. I am pretty sure that sample_weight would have to be used in the fit() function already, and that one would have to forward it to the underlying KMeans.fit(). Using it only in predict() is "too late": at that point the cluster centers have already been determined without the information in sample_weight and are thus not locally optimal with respect to X and sample_weight.
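
To make the "too late" point concrete, here is a minimal pure-Python sketch of weighted Lloyd iterations on 1-D data. It is not the bkmeans or scikit-learn code; the function name weighted_kmeans_1d, the toy data, and the starting centers are assumptions made up for this illustration. It only shows that weights applied during fitting move the centers, which weighting at predict time cannot do.

```python
def weighted_kmeans_1d(xs, weights, centers, n_iter=20):
    """Weighted Lloyd iterations on 1-D data (illustration only)."""

    def assign(cs):
        # Assignment step: index of the nearest center for each point.
        return [min(range(len(cs)), key=lambda j: (x - cs[j]) ** 2) for x in xs]

    centers = list(centers)
    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = []
        for j in range(len(centers)):
            w_sum = sum(w for w, l in zip(weights, labels) if l == j)
            if w_sum == 0:
                # Empty cluster: keep its center where it is.
                new_centers.append(centers[j])
            else:
                # Update step: weighted mean of the points in cluster j.
                wx_sum = sum(w * x for x, w, l in zip(xs, weights, labels) if l == j)
                new_centers.append(wx_sum / w_sum)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(centers)


xs = [0.0, 1.0, 2.0, 10.0]   # toy data (hypothetical)
init = [0.0, 10.0]           # fixed starting centers (hypothetical)

unweighted, _ = weighted_kmeans_1d(xs, [1, 1, 1, 1], init)
weighted, _ = weighted_kmeans_1d(xs, [1, 1, 8, 1], init)

print(unweighted)  # [1.0, 10.0]
print(weighted)    # [1.7, 10.0]
```

With the heavy weight on x = 2.0, the first center moves from 1.0 to 1.7. Fitting without weights and passing them only to predict() would leave the center at 1.0.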

I may not have the time to do this soon, but I would consider accepting a pull request if you would like to try it yourself (and show off your skills doing it :-) )

commented

Done :)
#3