BGU-CS-VIL / DeepDPM

"DeepDPM: Deep Clustering With An Unknown Number of Clusters" [Ronen, Finder, and Freifeld, CVPR 2022]

Training gets stuck in a loop [running kmeans]

arthur-ver opened this issue

I have 18k training samples with an embedding dimension of 4. After about 150 epochs and a few k-means split operations, training gets stuck in a loop during another k-means run, with center_shift=nan.

[Image: kmeans]

Hey @arthur-ver, thanks for your interest!
This might be a result of over-splitting: one of the k-means centroids is an outlier, so no points are assigned to it and the mean of its members cannot be computed (there are none), which would leave that centroid, and hence center_shift, as NaN.
I would try adjusting the --prior_sigma_scale param (default is .005); increasing it should lower the probability of a split.
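
To illustrate the failure mode described above, here is a minimal NumPy sketch of a single k-means step. It is not DeepDPM's code: the toy data, the outlier centroid, and the `center_shift ** 2 < tol` stopping rule are assumptions chosen to mirror the symptom. It shows that an empty cluster yields a NaN mean, and that a NaN shift can never pass a "less than tolerance" check.

```python
import warnings
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))  # toy data: 100 samples, 4 dimensions (assumed)

# Two centroids; the second is a far-away outlier (assumed for illustration).
centroids = np.array([[0.0, 0.0, 0.0, 0.0],
                      [1e3, 1e3, 1e3, 1e3]])

# Assignment step: each point goes to its nearest centroid.
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
labels = dists.argmin(axis=1)
counts = np.bincount(labels, minlength=len(centroids))

# Update step: the mean over an empty cluster is NaN.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    new_centroids = np.stack(
        [X[labels == k].mean(axis=0) for k in range(len(centroids))]
    )

# Total centroid movement; a single NaN centroid makes the sum NaN.
center_shift = np.linalg.norm(new_centroids - centroids, axis=1).sum()

tol = 1e-4  # hypothetical tolerance
converged = bool(center_shift ** 2 < tol)  # any comparison with NaN is False

print("points per cluster:", counts)
print("center_shift:", center_shift, "converged:", converged)
```

In a loop that only exits when the shift drops below the tolerance, this condition never becomes true, which matches the observed hang. A check such as `np.isfinite(center_shift)` or an iteration cap would at least turn the hang into a visible error.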

Another option is to change the embedding representation with --transform_input_data (available choices are: "normalize", "min_max", "standard", "standard_normalize", "None"), or to stop training at an earlier epoch.
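
For intuition about what such input transforms typically do, here is a rough NumPy sketch. The mapping from option names to operations is an assumption based on what the names conventionally mean (L2 row normalization, min-max scaling, standardization); the function names are hypothetical, and the repository's own implementation may differ.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Scale each sample (row) to unit L2 norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def min_max(X, eps=1e-12):
    """Rescale each feature (column) to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.maximum(hi - lo, eps)

def standardize(X, eps=1e-12):
    """Shift each feature to zero mean and scale to unit variance."""
    return (X - X.mean(axis=0)) / np.maximum(X.std(axis=0), eps)

# Toy 4-dimensional embeddings whose features live on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=[1.0, 10.0, 100.0, 1000.0], size=(1000, 4))

X_mm = min_max(X)              # assumed analogue of "min_max"
X_std = standardize(X)         # assumed analogue of "standard"
X_unit = l2_normalize(X_std)   # assumed analogue of "standard_normalize"

print(X.std(axis=0).round(1))      # widely different feature scales
print(X_std.std(axis=0).round(3))  # comparable scales after standardizing
```

With low-dimensional embeddings, a single feature on a much larger scale can dominate the distances, so putting features on comparable scales may reduce the chance that a centroid ends up isolated.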