SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Home Page: https://arxiv.org/abs/2309.12871


Difference in output when running via Transformers.js and when hosting on Hugging Face

jtmuller5 opened this issue

I created an application that uses the UAE-Large-V1 model inside Transformers.js and was able to embed sentences in a browser without issues. The model would return a single vector for a single input:

import { pipeline } from "@xenova/transformers";

// Load the quantized feature-extraction pipeline in the browser.
const extractor = await pipeline("feature-extraction", "WhereIsAI/UAE-Large-V1", {
      quantized: true,
});

// Mean-pool the token embeddings and L2-normalize to get one vector.
let result = await extractor(text, { pooling: "mean", normalize: true });
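(With pooling enabled, result here is a Transformers.js Tensor whose .data field is a flat Float32Array of length 1024 for a single input, 1024 being the model's hidden size.)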

When I hosted the model on Hugging Face using their Inference Endpoints service, however, it no longer behaves as expected. Instead of returning a single vector, it returns a variable-length list of 1024-dimensional vectors (one per input token).

Sample input:

{
   "inputs": "Where are you"
}

This returns a list of lists of lists of numbers, i.e. token-level embeddings with shape [1, num_tokens, 1024].
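For reference, the token-level output can be reduced to a single vector client-side. Below is a minimal sketch, assuming the endpoint returns the [1, num_tokens, 1024] structure described above; ENDPOINT_URL and HF_TOKEN are placeholders, not values from this issue:

// Minimal sketch: call the hosted endpoint, then mean-pool and
// L2-normalize the token embeddings client-side.
// ENDPOINT_URL and HF_TOKEN are placeholders (assumptions, not from this issue).
const ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud";
const HF_TOKEN = "hf_xxx";

async function embed(text) {
  const res = await fetch(ENDPOINT_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ inputs: text }),
  });
  // Assumed response shape: [batch][tokens][1024]; take the first batch item.
  const [tokens] = await res.json();

  // Mean-pool over the token axis. For a single, unpadded input this plain
  // mean matches Transformers.js' attention-mask-weighted mean pooling.
  const dim = tokens[0].length;
  const pooled = new Array(dim).fill(0);
  for (const tok of tokens) {
    for (let i = 0; i < dim; i++) pooled[i] += tok[i] / tokens.length;
  }

  // L2-normalize, mirroring { pooling: "mean", normalize: true }.
  const norm = Math.sqrt(pooled.reduce((sum, v) => sum + v * v, 0));
  return pooled.map((v) => v / norm);
}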

Is there a way to make the hosted model return a single vector? And why does the model act differently depending on where it's hosted?

commented

That is strange. It should return a single vector, since you specified mean pooling.

You could ask for help in the Transformers.js project, as I am unfamiliar with it. Sorry about that.