Trying to reproduce the TC score for Trump dataset
chunfortam opened this issue
Hi Maarten,
I am trying to reproduce the TC score of 0.066 for the Trump dataset with MPNET SBERT models, but after averaging 15 runs my results range from -0.01x to 0.03. I understand that UMAP introduces randomness, but I'd like to know whether there are other reasons for the gap. I followed the Python notebook and used the same dataset. What are your thoughts on this?
Regards,
Chun
Did you make sure to use the versions specified in the notebook? BERTopic and its dependencies have gone through several changes over the years, which could explain some of the differences.
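A quick way to check for version drift is to compare what is installed against the pins from the notebook. The snippet below is only a sketch using the standard library: `check_versions` is a hypothetical helper, and the version strings in `expected` are placeholders, not the notebook's actual pins, so replace them with the values listed there.

```python
from importlib import metadata


def check_versions(expected):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Placeholder pins (assumption): copy the real versions from the notebook.
expected = {
    "bertopic": "0.0.0",
    "umap-learn": "0.0.0",
    "hdbscan": "0.0.0",
    "sentence-transformers": "0.0.0",
}

for package, (wanted, installed) in check_versions(expected).items():
    print(f"{package}: notebook pins {wanted}, installed {installed}")
```

An empty result means the environment matches the pins exactly. Any line printed points at a package worth reinstalling at the pinned version before rerunning the evaluation.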
I think that was it, thanks!