abgulati / LARS

An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.

Home Page: https://www.youtube.com/watch?v=Mam1i86n8sU&ab_channel=AbheekGulati

Hybrid search support

Jhyrachy opened this issue · comments

Hi,
First of all, I want to let you know that I am amazed by your program.
But I was also wondering whether hybrid search (vector search + full-text search, e.g. BM25) is on the roadmap, since we know that vector search alone is often not enough ( https://techcommunity.microsoft.com/t5/microsoft-developer-community/doing-rag-vector-search-is-not-enough/ba-p/4161073 )
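To illustrate what I mean, here is a minimal sketch of the fusion step using Reciprocal Rank Fusion (RRF), one common way to combine the two retrievers. The `rrf_fuse` helper and both ranked lists are hypothetical stand-ins, not LARS code:

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc IDs into one hybrid ranking.

    Each document's fused score is sum(1 / (k + rank)) across the
    lists it appears in; k=60 is the constant from the original RRF
    paper and damps the influence of any single retriever.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked results (best first) from the two retrievers:
vector_ranking = ["doc3", "doc1", "doc7", "doc2"]    # semantic similarity
keyword_ranking = ["doc7", "doc3", "doc9", "doc1"]   # BM25 / full-text

print(rrf_fuse([vector_ranking, keyword_ranking]))
# ['doc3', 'doc7', 'doc1', 'doc9', 'doc2'] -- documents that score
# well in both lists rise to the top
```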
Thanks a lot!

Hi! Thanks for reaching out, and glad you liked the tool! Hybrid search isn't on the roadmap yet, but thanks for bringing it to my attention, as one of the key goals of future development is to improve RAG performance by going beyond plain semantic search over chunks in a vector database.

A very common user ask and expectation is summarization: many users expect that once they've uploaded a document, the LLM has "read" it and now knows it, whereas this hasn't happened at all in a typical RAG chain! They're therefore disappointed when they ask queries such as "can you summarize this document?" Towards this end, I'm presently exploring RAPTOR and T-RAG, as both recursive summaries and tree-style knowledge graphs show great potential in improving document understanding and true knowledge extraction. Hybrid search sounds promising too, and I will explore it in the future.
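To sketch the recursive-summary idea behind RAPTOR (illustrative only, not LARS code; `summarize` is a placeholder for an LLM call, and `GROUP_SIZE` stands in for the embedding-based clustering step the real paper uses):

```python
from typing import List

GROUP_SIZE = 4  # chunks summarized together per level (illustrative)

def summarize(texts: List[str]) -> str:
    """Placeholder for an LLM summarization call."""
    return "SUMMARY(" + " | ".join(t[:20] for t in texts) + ")"

def build_summary_tree(chunks: List[str]) -> List[List[str]]:
    """Return all tree levels, leaves first. Every level gets indexed,
    so a query can match a high-level summary or a raw chunk."""
    levels = [chunks]
    while len(levels[-1]) > 1:
        current = levels[-1]
        levels.append([
            summarize(current[i:i + GROUP_SIZE])
            for i in range(0, len(current), GROUP_SIZE)
        ])
    return levels

levels = build_summary_tree([f"chunk {n}" for n in range(16)])
for depth, level in enumerate(levels):
    print(f"level {depth}: {len(level)} node(s)")
# level 0: 16 -> level 1: 4 -> level 2: 1. The single root node is a
# document-wide summary, which is what makes queries like "summarize
# this document" answerable from the index.
```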

In the meantime, though, I welcome contributions! So if you or anyone else reading this decides to work on this aspect and sends me a pull request, I'll be happy to review and add your contribution to LARS!