SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Home Page:https://arxiv.org/abs/2309.12871

[QUESTION] How to use prompt C when using through HuggingFace embeddings loader

kairoswealth opened this issue · comments

I am using LlamaIndex to index documents into chromadb, and for that I use the HuggingFaceEmbedding abstraction like this:

embed_model = HuggingFaceEmbedding(model_name="WhereIsAI/UAE-Large-V1")

However, I read that one needs to specify prompt C in order to optimize the embedding for retrieval.

  1. Is the prompt only used during retrieval, i.e. for the query embedding? Or also for document indexing?
  2. Any idea whether that setting is supported through the HuggingFace/LlamaIndex abstractions, and if so, how?
  3. In the event that the prompt C argument is not supported, would the resulting vectors perform significantly worse in retrieval use cases?

For your questions:

  1. Yes — use it only for query texts; do not apply it when indexing documents.

2&3. Sorry, I haven't used LlamaIndex. You can manually apply the prompt to the query text as follows:

from angle_emb import Prompts

# Wrap the query in prompt C before embedding.
# embed_model is the HuggingFaceEmbedding instance created earlier.
query_text = 'this is a query'
query_text = Prompts.C.format(text=query_text)

embeddings = embed_model.get_text_embedding(query_text)
...
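To make point 1 concrete, here is a minimal standalone sketch of applying the prompt asymmetrically: queries get the retrieval prompt, documents do not. The PROMPT_C template string is my assumption of what angle_emb's Prompts.C expands to, inlined here so the snippet runs without angle_emb installed.

```python
# Assumption: this is the template behind angle_emb's Prompts.C,
# inlined so the example has no third-party dependency.
PROMPT_C = 'Represent this sentence for searching relevant passages: {text}'

def format_query(text: str) -> str:
    # Queries are wrapped in the retrieval prompt before embedding...
    return PROMPT_C.format(text=text)

def format_document(text: str) -> str:
    # ...while documents are embedded as-is, with no prompt.
    return text

print(format_query('what is prompt C?'))
print(format_document('Prompt C is the retrieval prompt used by UAE-Large-V1.'))
```

You would then pass the formatted query (but the unmodified documents) to `embed_model.get_text_embedding`.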

Awesome, that is very clear now. I'll apply the prompt manually on retrieval. Thanks a lot!