castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Home Page: http://pyserini.io/

How to add stop words when building BM25 index?

RulinShao opened this issue

Hello! Thanks for the great package! How can I pass stop words to Pyserini when building a BM25 index? The default does not use any stop words, but I would like to speed up index construction and retrieval by removing them. Thanks a lot!
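
To make the request concrete, here is a minimal sketch of one possible workaround that does not rely on any particular Pyserini option: filter the stop words out of the text before it is written to the JSONL files that get indexed. The stop word list and the helper names `remove_stop_words` and `to_jsonl_lines` are hypothetical and only for illustration; the `id`/`contents` fields follow the JSON layout that Pyserini's JSON collections expect.

```python
import json
import re

# Hypothetical stop word list for illustration only; substitute your own.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in", "is"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(t for t in tokens if t not in stop_words)


def to_jsonl_lines(docs, stop_words=STOP_WORDS):
    """Yield one JSON string per (doc_id, text) pair, with 'id' and 'contents' fields."""
    for doc_id, text in docs:
        yield json.dumps(
            {"id": doc_id, "contents": remove_stop_words(text, stop_words)}
        )


if __name__ == "__main__":
    docs = [("doc1", "The quick brown fox is in the garden.")]
    for line in to_jsonl_lines(docs):
        print(line)
```

With this approach the same filter would presumably also need to be applied to queries at search time, so that query terms and indexed terms stay consistent. It also changes the stored text itself, which may not be acceptable if the original documents have to be shown to users.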