Training gets stuck in a loop [running kmeans]

Question

arthur-ver opened this issue a year ago · comments

I have 18k training samples with an embedding dimension of 4. After about 150 epochs and a few k-means splitting operations, training gets stuck in a loop during another k-means operation, with center_shift=nan.

meitarronen · Answer 1 · Wed Apr 12 2023 18:09:51 GMT+0800 (China Standard Time)

Hey @arthur-ver, thanks for your interest!
This might be a result of over-splitting: one of the k-means centroids is an outlier, and the mean of its neighbours cannot be computed because there are none.
I would try adjusting the --prior_sigma_scale param (default is .005); increasing it should lower the probability of a split.
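
To make the failure mode above concrete, here is a minimal pure-Python sketch of one k-means step in which an outlier centroid attracts no points. It is not the project's actual k-means code: the helper names (`mean_or_nan`, `kmeans_step`), the toy data, and the `shift < tol` stopping test are all assumptions made for illustration. The empty cluster's mean is modelled as NaN, the way tensor libraries typically report the mean of an empty selection.

```python
import math

def mean_or_nan(points, dim):
    """Per-coordinate mean; NaN for an empty cluster (assumed behaviour)."""
    if not points:
        return [math.nan] * dim
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def kmeans_step(data, centers):
    """One Lloyd iteration: assign points to the nearest center, recompute centers."""
    dim = len(centers[0])
    clusters = [[] for _ in centers]
    for p in data:
        nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        clusters[nearest].append(p)
    new_centers = [mean_or_nan(c, dim) for c in clusters]
    # Total distance the centers moved; a single NaN center makes the sum NaN.
    shift = sum(math.dist(a, b) for a, b in zip(centers, new_centers))
    return new_centers, shift

data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
centers = [(0.05, 0.05), (100.0, 100.0)]  # second center is an outlier with no neighbours

new_centers, shift = kmeans_step(data, centers)
tol = 1e-4  # hypothetical tolerance

# Every comparison with NaN is False, so this test can never succeed
# and a loop that waits for it keeps running.
converged = shift < tol

# A guarded test that also stops (or re-seeds the center) on NaN.
guarded_stop = math.isnan(shift) or shift < tol
```

If the stopping rule has this shape, a NaN shift explains the endless loop: the condition is never true, so the iteration never ends. Reducing the number of splits, as suggested above, avoids creating the empty cluster in the first place.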

Another option would be to change the embedding representation with --transform_input_data (available choices are: "normalize", "min_max", "standard", "standard_normalize", "None"), or to stop training at an earlier epoch.
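
For intuition about what such input transforms usually do, here is a small pure-Python sketch. It assumes the option names carry their conventional meanings (L2-normalizing each sample, min-max scaling each feature, standardizing each feature); the functions and sample values below are hypothetical and are not taken from the project's code, so check the repository for the exact definitions.

```python
import math

def l2_normalize(rows):
    """Scale each sample to unit Euclidean length (zero vectors are left as is)."""
    out = []
    for r in rows:
        norm = math.sqrt(sum(x * x for x in r)) or 1.0
        out.append([x / norm for x in r])
    return out

def min_max(rows):
    """Rescale each feature (column) to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(x - l) / s for x, l, s in zip(r, lo, span)] for r in rows]

def standardize(rows):
    """Shift each feature to zero mean and scale it to unit variance."""
    cols = list(zip(*rows))
    mean = [sum(c) / len(c) for c in cols]
    std = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, mean)]
    return [[(x - m) / s for x, m, s in zip(r, mean, std)] for r in rows]

# Made-up 4-dimensional embeddings with very different feature scales.
embeddings = [
    [1.0, 200.0, 0.5, -3.0],
    [2.0, 180.0, 0.7, -1.0],
    [50.0, 210.0, 0.6, -2.0],
]

scaled = min_max(embeddings)
standardized = standardize(embeddings)
unit = l2_normalize(embeddings)
```

Putting all features on a comparable scale can keep a single large-valued dimension from dominating the distance computation, which may make isolated outlier centroids less likely.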