LINCellularNeuroscience / VAME

Variational Animal Motion Embedding - A tool for time series embedding and clustering

Missing clusters in resulting npy-files

roessler-f opened this issue

Hi,

I used VAME on some of our tracking data and I first wanted to say: Thanks for the user-friendly tool, it is very well set up and easy to use.

I have a question about the final clustering of the motifs: I set the number of clusters to 65 (so n_cluster: 65 in the config.yaml file). However, when I go through the resulting npy-files, certain clusters never show up (e.g. cluster 9 or cluster 33 never appear in any file). How is this possible? Shouldn't every cluster be represented at least once in the dataset, even if it is just for a few frames in one file?

Thanks already in advance for your help.

Best,
Fabienne

Hi Fabienne,

Glad you like the tool :)

Are you absolutely sure that those clusters don't show up in any of your files? It is often the case that only one of X files contains an uncommon or outlier pattern, and then some clusters end up being aligned around that file only. Otherwise, which clustering method did you use, HMM or k-means?

Best,
Pavol

Hi Pavol,

Thanks a lot for your fast answer.

I'm very sure. I looped through all npy-files and checked in each of them. I also found some clusters that appear only in one file for about 2 or 3 frames which would be the outlier patterns you mention.
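For anyone who wants to run the same check, here is a minimal sketch of counting how often each cluster label occurs across all label arrays. The function names are hypothetical and not part of VAME, and the toy arrays stand in for the per-file results; in practice each array would come from calling `np.load` on one of the resulting npy-files, wherever they live in your project.

```python
import numpy as np

def label_counts(label_arrays, n_cluster):
    """Total number of frames assigned to each cluster label across all arrays."""
    counts = np.zeros(n_cluster, dtype=np.int64)
    for labels in label_arrays:
        labels = np.asarray(labels, dtype=np.int64)
        # bincount gives per-label frame counts; pad to n_cluster entries
        counts += np.bincount(labels, minlength=n_cluster)[:n_cluster]
    return counts

def missing_labels(label_arrays, n_cluster):
    """Cluster labels that never occur in any of the arrays."""
    return np.flatnonzero(label_counts(label_arrays, n_cluster) == 0).tolist()

# Toy data standing in for the per-file label arrays (assumption: one
# integer label per frame, values in 0 .. n_cluster - 1).
demo = [np.array([0, 0, 1, 3]), np.array([1, 1, 3, 3])]
print(missing_labels(demo, n_cluster=5))  # [2, 4]
```

Looking at the counts rather than only at presence also makes the rare clusters (those with just a handful of frames) easy to spot.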

I'm using the HMM clustering. I'm no expert on HMMs, so maybe with that method it is possible that some clusters never appear in the data?

Best,
Fabienne

Hi Fabienne,

Ok, thanks for the feedback. I will check whether that outcome is reproducible with a high number of motifs and HMM clustering. Meanwhile, you could check whether the problem also occurs when using k-means.

Best,
Pavol

Hi Pavol,

I just realized that the function we use to load the files (which I did not write myself) removes some of the first and last frames of the tracking data, and therefore also of the clustering data. The above-mentioned clusters seem to have represented some behavior (or rather noise) at the beginning and end of our recordings and were therefore removed completely. When I checked the "raw" files, the clusters were there again.
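To illustrate the effect, here is a small sketch of how trimming frames from both ends of a label sequence can make whole clusters disappear. The function name, the trim sizes, and the toy labels are made up for illustration; our actual loading function is not shown here.

```python
import numpy as np

def labels_lost_by_trimming(labels, trim_start, trim_end):
    """Cluster labels present in the raw sequence but absent after trimming.

    trim_start / trim_end are the numbers of frames dropped at the
    beginning and end of the recording (hypothetical parameters).
    """
    labels = np.asarray(labels)
    stop = len(labels) - trim_end  # explicit stop index so trim_end=0 keeps everything
    kept = labels[trim_start:stop]
    before = set(np.unique(labels).tolist())
    after = set(np.unique(kept).tolist())
    return sorted(before - after)

# Toy sequence: labels 9 and 33 occur only in the first and last frames.
raw = np.array([9, 9, 4, 4, 7, 4, 7, 33, 33])
print(labels_lost_by_trimming(raw, trim_start=2, trim_end=2))  # [9, 33]
```

Running a comparison like this on the raw and the trimmed labels would have shown right away that the missing clusters were removed by the loading step and not by the clustering.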

I'm very sorry for this mistake from my side!

Thank you for your time and efforts.

Best,
Fabienne