texttron / hyde

HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels

How to prebuild the Contriever faiss index

bryanyzhu opened this issue

Thanks for the great code. How can I prebuild the Contriever faiss index? Given a folder of documents, I can embed them with Contriever, but how do I index the embeddings to produce something like contriever_msmarco_index.tar.gz for search? Thank you.

Hi, contriever_msmarco_index contains two files: docid and index.
docid is a text file in which each line is the id of a document.
index is a faiss index saved with faiss.write_index(index, 'contriever_msmarco_index/index').
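
For anyone building their own index directory, the two-file layout described above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the embeddings are already computed as an (n_docs, dim) array, a flat inner-product index (faiss.IndexFlatIP) is an acceptable index type, and the helper names write_docids, read_docids and build_index_dir are made up for illustration rather than taken from the repo.

```python
import os


def write_docids(path, doc_ids):
    """Write one document id per line, in the same order as the embedding rows."""
    with open(path, "w", encoding="utf-8") as f:
        for doc_id in doc_ids:
            f.write(f"{doc_id}\n")


def read_docids(path):
    """Read the docid file back into a list (line i belongs to vector i)."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def build_index_dir(embeddings, doc_ids, out_dir):
    """Save `index` and `docid` files into out_dir (hypothetical helper)."""
    # Imported here so the docid helpers above also work without faiss installed.
    import faiss
    import numpy as np

    # faiss expects a contiguous float32 matrix of shape (n_docs, dim).
    vectors = np.ascontiguousarray(embeddings, dtype=np.float32)
    if vectors.ndim != 2 or vectors.shape[0] != len(doc_ids):
        raise ValueError("need exactly one embedding row per document id")

    # Assumption: exact inner-product search; other faiss index types may fit too.
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(vectors)

    os.makedirs(out_dir, exist_ok=True)
    faiss.write_index(index, os.path.join(out_dir, "index"))
    write_docids(os.path.join(out_dir, "docid"), doc_ids)
```

Calling build_index_dir(embeddings, doc_ids, "my_index") should leave my_index/index and my_index/docid on disk. The important detail is that the order of the ids in docid matches the order in which the vectors were added to the index.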