embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark

Home Page:https://arxiv.org/abs/2210.07316

Convert French Clustering tasks to Fast

imenelydiaker opened this issue

For the French benchmark, convert clustering tasks to fast using AbsTaskClusteringFast:

  • AlloProfClusteringP2P
  • AlloProfClusteringS2S
  • HALClusteringS2S #814
  • MasakhaNEWSClusteringP2P (fewer than 1,000 samples per language)
  • MasakhaNEWSClusteringS2S (fewer than 1,000 samples per language)
  • MLSUMClusteringP2P #798
  • MLSUMClusteringS2S #798

Tasks in bold take priority because they contain a large number of samples.
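For anyone picking this up, here is a rough sketch of the data reshaping such a conversion usually involves. It is not the mteb implementation and does not call mteb at all. It assumes the original dataset stores each row as a list of sentences with a parallel list of labels, and that the fast variant wants one document per row, capped at some maximum size. The field names `sentences` and `labels`, the helper names, and the cap of 2048 are illustrative assumptions only.

```python
import itertools
import random


def flatten_clustering_rows(rows):
    """Turn rows of parallel lists into one {"sentences", "labels"} dict per document.

    Assumes each row looks like {"sentences": [...], "labels": [...]}
    (hypothetical field names, check the actual dataset schema).
    """
    sentences = list(itertools.chain.from_iterable(r["sentences"] for r in rows))
    labels = list(itertools.chain.from_iterable(r["labels"] for r in rows))
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same total length")
    return [{"sentences": s, "labels": l} for s, l in zip(sentences, labels)]


def subsample(documents, max_documents=2048, seed=42):
    """Return at most max_documents documents, sampled reproducibly.

    The default cap of 2048 is an assumption for illustration.
    """
    if len(documents) <= max_documents:
        return list(documents)
    rng = random.Random(seed)  # fixed seed so the subset is stable across runs
    return rng.sample(documents, max_documents)


# Toy data standing in for a real clustering split.
rows = [
    {"sentences": ["a", "b"], "labels": [0, 1]},
    {"sentences": ["c"], "labels": [1]},
]
documents = flatten_clustering_rows(rows)
small = subsample(documents, max_documents=2)
```

Plain random sampling is used here for brevity. A stratified sample that preserves label proportions may be a better fit for clustering evaluation. Datasets that are already small, such as the MasakhaNEWS tasks, would pass through `subsample` unchanged.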

Looks like this issue will address the first three datasets mentioned in the description. @imenelydiaker, will you be working on these, or could we assign this to one of the new contributors?

Yes, it can be assigned to anyone who'd like to contribute!