castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Home Page: http://pyserini.io/

version conflict: the doc of experiments-nfcorpus.md

Adeshen opened this issue

The doc (experiments-nfcorpus.md) gives this command:

python -m pyserini.encode \
  input   --corpus collections/nfcorpus/corpus.jsonl \
          --fields title text \
  output  --embeddings indexes/nfcorpus.bge-base-en-v1.5 \
          --to-faiss \
  encoder --encoder BAAI/bge-base-en-v1.5 --l2-norm \
          --device cpu \
          --pooling mean \
          --fields title text \
          --batch 32

but in the latest version, it cannot find the --l2-norm and --pooling mean options.

I think the maintainers may have forgotten about this issue.
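
For anyone hitting a similar mismatch between the doc and the installed package, it may help to state exactly which version is installed when reporting it. Below is a minimal sketch using only the Python standard library. The helper name installed_version is made up for illustration and is not part of Pyserini. It only reports the installed distribution version and does not check which flags that version accepts. Running the module with --help will probably list the accepted flags as well, assuming it uses a standard argument parser.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration only; not part of Pyserini.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pyserini" is assumed to be the distribution name on PyPI;
    # what gets printed depends entirely on the local environment.
    version = installed_version("pyserini")
    if version is None:
        print("pyserini is not installed in this environment")
    else:
        print(f"pyserini {version} is installed")
```

Including the printed version in the report makes it easier to tell whether the doc or the installed release is the one that is out of date.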

@MXueguang this is related to a recent PR you worked on?

Sorry @Adeshen, I didn't get what the issue is. Could you explain in a bit more detail?

Having heard no further follow-up, closing.