castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Home Page: http://pyserini.io/

test cases time out

sueszli opened this issue

unit tests: timed out

The unit tests raise several `ImportError`s (`CosDprDocumentEncoder` cannot be imported from `pyserini.encode`) and then time out.

(pyserini) ❯ python -m unittest

.......Traceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

FTraceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

FSome weights of the model checkpoint at facebook/dpr-ctx_encoder-multiset-base were not used when initializing DPRContextEncoder: ['ctx_encoder.bert_model.pooler.dense.bias', 'ctx_encoder.bert_model.pooler.dense.weight']

- This IS expected if you are initializing DPRContextEncoder from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).

- This IS NOT expected if you are initializing DPRContextEncoder from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 

The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 

The class this function is called from is 'DPRContextEncoderTokenizer'.

...Traceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

FTraceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

F.huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...

To disable this warning, you can either:

- Avoid using `tokenizers` before the fork if possible

- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

11it [00:00, 49344.75it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 192238.93it/s]

3it [00:00, 93902.33it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 174762.67it/s]

11it [00:00, 128159.29it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 334328.58it/s]

3it [00:00, 111353.20it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 187804.66it/s]

3it [00:00, 103991.01it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 233016.89it/s]

.....Downloading index at https://github.com/castorini/anserini-data/raw/master/CACM/lucene-index.cacm.20221005.252b5e.tar.gz...

lucene-index.cacm.tar.gz: 2.24MB [00:00, 3.07MB/s]                                                                                             

..Traceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

FETraceback (most recent call last):

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 196, in _run_module_as_main

    return _run_code(code, main_globals, None,

  File "/opt/homebrew/anaconda3/envs/pyserini/lib/python3.10/runpy.py", line 86, in _run_code

    exec(code, run_globals)

  File "/Users/sueszli/dev/pyserini/pyserini/encode/__main__.py", line 21, in <module>

    from pyserini.encode import DprDocumentEncoder, TctColBertDocumentEncoder, AnceDocumentEncoder, AggretrieverDocumentEncoder, AutoDocumentEncoder, CosDprDocumentEncoder

ImportError: cannot import name 'CosDprDocumentEncoder' from 'pyserini.encode' (/Users/sueszli/dev/pyserini/pyserini/encode/__init__.py)

FEWARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.

2023-12-03 03:17:35,484 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,485 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,485 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,485 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,567 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,567 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,567 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,567 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,579 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,579 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,579 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,579 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,595 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,595 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,595 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,595 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,608 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,608 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,608 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,608 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,625 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,626 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,626 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,626 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,642 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,642 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,642 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,642 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,671 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,672 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,672 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,672 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,693 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,693 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,693 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,693 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

2023-12-03 03:17:35,716 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,716 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,716 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,716 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,745 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,745 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,746 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,746 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,758 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,758 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,758 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,758 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,777 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,777 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,777 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,777 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,802 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,802 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,802 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,802 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,818 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,818 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,818 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,818 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,831 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:142) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:35,831 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:143) - Stemmer: porter

2023-12-03 03:17:35,831 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:144) - Keep stopwords? false

2023-12-03 03:17:35,831 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:145) - Stopwords file: null

.2023-12-03 03:17:35,844 INFO  [main] index.SimpleIndexer (SimpleIndexer.java:138) - Using WhitespaceAnalyzer

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 121.60it/s]

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 265.76it/s]

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 272.55it/s]

.......[0.003s][warning][os,thread] Attempt to protect stack guard pages failed (0x00000001695f8000-0x0000000169604000).

[0.003s][warning][os,thread] Attempt to deallocate stack guard pages failed.

WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.

2023-12-03 03:17:41,126 INFO  [main] index.IndexCollection (IndexCollection.java:350) - Setting log level to INFO

2023-12-03 03:17:41,126 INFO  [main] index.IndexCollection (IndexCollection.java:353) - Starting indexer...

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:354) - ============ Loading Parameters ============

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:355) - DocumentCollection path: tests/resources/sample_collection_json_emoji

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:356) - CollectionClass: JsonCollection

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:357) - Generator: DefaultLuceneDocumentGenerator

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:358) - Threads: 1

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:359) - Language: en

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:360) - Stemmer: porter

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:361) - Keep stopwords? false

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:362) - Stopwords: null

2023-12-03 03:17:41,127 INFO  [main] index.IndexCollection (IndexCollection.java:363) - Store positions? false

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:364) - Store docvectors? true

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:365) - Store document "contents" field? false

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:366) - Store document "raw" field? false

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:367) - Additional fields to index: []

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:368) - Optimize (merge segments)? false

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:369) - Whitelist: null

2023-12-03 03:17:41,128 INFO  [main] index.IndexCollection (IndexCollection.java:370) - Pretokenized?: false

2023-12-03 03:17:41,129 INFO  [main] index.IndexCollection (IndexCollection.java:371) - Index path: temp_index

2023-12-03 03:17:41,130 INFO  [main] index.IndexCollection (IndexCollection.java:451) - ============ Indexing Collection ============

2023-12-03 03:17:41,134 INFO  [main] index.IndexCollection (IndexCollection.java:438) - Using DefaultEnglishAnalyzer

2023-12-03 03:17:41,134 INFO  [main] index.IndexCollection (IndexCollection.java:439) - Stemmer: porter

2023-12-03 03:17:41,134 INFO  [main] index.IndexCollection (IndexCollection.java:440) - Keep stopwords? false

2023-12-03 03:17:41,135 INFO  [main] index.IndexCollection (IndexCollection.java:441) - Stopwords file: null

2023-12-03 03:17:41,184 INFO  [main] index.IndexCollection (IndexCollection.java:480) - Thread pool with 1 threads initialized.

2023-12-03 03:17:41,185 INFO  [main] index.IndexCollection (IndexCollection.java:482) - Initializing collection in tests/resources/sample_collection_json_emoji

2023-12-03 03:17:41,185 INFO  [main] index.IndexCollection (IndexCollection.java:491) - 1 file found

2023-12-03 03:17:41,185 INFO  [main] index.IndexCollection (IndexCollection.java:492) - Starting to index...

2023-12-03 03:17:41,227 DEBUG [pool-2-thread-1] index.IndexCollection$LocalIndexerThread (IndexCollection.java:315) - sample_collection_json_emoji/doc.json: 1 docs added.

2023-12-03 03:17:41,272 INFO  [main] index.IndexCollection (IndexCollection.java:548) - Indexing Complete! 1 documents indexed

2023-12-03 03:17:41,272 INFO  [main] index.IndexCollection (IndexCollection.java:549) - ============ Final Counter Values ============

2023-12-03 03:17:41,272 INFO  [main] index.IndexCollection (IndexCollection.java:550) - indexed:                1

2023-12-03 03:17:41,272 INFO  [main] index.IndexCollection (IndexCollection.java:551) - unindexable:            0

2023-12-03 03:17:41,272 INFO  [main] index.IndexCollection (IndexCollection.java:552) - empty:                  0

2023-12-03 03:17:41,273 INFO  [main] index.IndexCollection (IndexCollection.java:553) - skipped:                0

2023-12-03 03:17:41,273 INFO  [main] index.IndexCollection (IndexCollection.java:554) - errors:                 0

2023-12-03 03:17:41,276 INFO  [main] index.IndexCollection (IndexCollection.java:557) - Total 1 documents indexed in 00:00:00

100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3204/3204 [00:02<00:00, 1322.96it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3204/3204 [00:02<00:00, 1391.07it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3209.11it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 4339.68it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3462.08it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3204/3204 [00:02<00:00, 1350.72it/s]

100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3204/3204 [00:02<00:00, 1461.64it/s]

...Attempting to initialize pre-encoded queries ance-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-ance-msmarco-passage-dev-subset-20210419-9323ec.adad81bb1495eff2f0463e809ecc01b8 already exists, skipping download.

Initializing ance-msmarco-passage-dev-subset...

Attempting to initialize pre-encoded queries ance-dl19-passage.

/Users/sueszli/.cache/pyserini/queries/query-embedding-ance-dl19-passage-20230124-99b79.828714ef5481dc49686e14b61881ba06 already exists, skipping download.

Initializing ance-dl19-passage...

Attempting to initialize pre-encoded queries ance-dl20.

/Users/sueszli/.cache/pyserini/queries/query-embedding-ance-dl20-passage-20230124-99b79.79acea9812a5c20d0d0817b07b348d15 already exists, skipping download.

Initializing ance-dl20...

.Attempting to initialize pre-encoded queries distilbert_kd-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_kd-msmarco-passage-dev-subset-20210419-9323ec.4706ec91183eefa9771e9311fe4799e0 already exists, skipping download.

Initializing distilbert_kd-msmarco-passage-dev-subset...

Attempting to initialize pre-encoded queries distilbert_kd-dl19-passage.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_kd-dl19-passage-20230124-99b79.c9fe8c8112a7d4fcda1aa606af77e66a already exists, skipping download.

Initializing distilbert_kd-dl19-passage...

Attempting to initialize pre-encoded queries distilbert_kd-dl20.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_kd-dl20-passage-20230124-99b79.09fe19984515145a78183a98e44bd699 already exists, skipping download.

Initializing distilbert_kd-dl20...

.Attempting to initialize pre-encoded queries distilbert_tas_b-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_dot_tas_b_b256-msmarco-passage-dev-subset-20210527-63276f.17a3f81de7ba497728050b83733b1c46 already exists, skipping download.

Initializing distilbert_tas_b-msmarco-passage-dev-subset...

Attempting to initialize pre-encoded queries distilbert_tas_b-dl19-passage.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_dot_tas_b_b256-dl19-passage-20230124-99b795.a0a23a1be77e6e9e5dfacf32dfcd5e9b already exists, skipping download.

Initializing distilbert_tas_b-dl19-passage...

Attempting to initialize pre-encoded queries distilbert_tas_b-dl20.

/Users/sueszli/.cache/pyserini/queries/query-embedding-distilbert_dot_tas_b_b256-dl20-passage-20230124-99b795.8ffb4d5a17a2c028fb5065ef8a394ab3 already exists, skipping download.

Initializing distilbert_tas_b-dl20...

.Attempting to initialize pre-encoded queries tct_colbert-msmarco-doc-dev.

/Users/sueszli/.cache/pyserini/queries/query-embedding-tct_colbert-msmarco-doc-dev-20210419-9323ec.565fe57f92b229643b68fa3263f089a9 already exists, skipping download.

Initializing tct_colbert-msmarco-doc-dev...

.Attempting to initialize pre-encoded queries sbert-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-sbert-msmarco-passage-dev-subset-20210419-9323ec.dc0d09a0f5803824c1ad46a39417aa1e already exists, skipping download.

Initializing sbert-msmarco-passage-dev-subset...

.Attempting to initialize pre-encoded queries tct_colbert-v2-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-tct_colbert-v2-msmarco-passage-dev-subset-20210608-5f341b.ee8d76e596aef02c5027a2ffd0ff66f8 already exists, skipping download.

Initializing tct_colbert-v2-msmarco-passage-dev-subset...

.Attempting to initialize pre-encoded queries tct_colbert-v2-hn-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-tct_colbert-v2-hn-msmarco-passage-dev-subset-20210608-5f341b.f7e39cf2cd3ee53f7f8f2e0a1821431c already exists, skipping download.

Initializing tct_colbert-v2-hn-msmarco-passage-dev-subset...

.Attempting to initialize pre-encoded queries tct_colbert-v2-hnp-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-tct_colbert-v2-hnp-msmarco-passage-dev-subset-20210608-5f341b.bed8036475774d12915c8af2a44612f4 already exists, skipping download.

Initializing tct_colbert-v2-hnp-msmarco-passage-dev-subset...

...........................................................................................................................................................[0.004s][warning][os,thread] Attempt to protect stack guard pages failed (0x000000016dcec000-0x000000016dcf8000).

[0.004s][warning][os,thread] Attempt to deallocate stack guard pages failed.

Running tests/resources/nfcorpus-queries.tsv topics, saving to run.9033202.txt...

  0%|                                                                                                                    | 0/5 [00:00<?, ?it/s]
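The same `ImportError` recurs throughout the output above: `pyserini/encode/__main__.py` asks for `CosDprDocumentEncoder`, but `pyserini/encode/__init__.py` does not provide it. A minimal sketch for checking which of the requested names a package actually exposes is shown here. The helper `missing_names` is hypothetical (it is not part of Pyserini), and the list of names is copied from the import line in the traceback.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that are not attributes of the module."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    # Names taken from line 21 of pyserini/encode/__main__.py in the traceback.
    wanted = [
        "DprDocumentEncoder",
        "TctColBertDocumentEncoder",
        "AnceDocumentEncoder",
        "AggretrieverDocumentEncoder",
        "AutoDocumentEncoder",
        "CosDprDocumentEncoder",
    ]
    try:
        print("missing from pyserini.encode:", missing_names("pyserini.encode", wanted))
    except ImportError as exc:
        # Covers the case where pyserini itself is not importable here.
        print(f"could not import pyserini.encode: {exc}")
```

If the check reports `CosDprDocumentEncoder` as missing, the two files in the checkout disagree about what the package exports. This only narrows down the import failure; it says nothing about why the run later hangs.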

faiss: timed out

The faiss search command also times out: the progress bar never moves past 0 of the 6980 queries.

**pyserini** git:master  

(pyserini) ❯ python -m pyserini.search.faiss \

    --topics msmarco-passage-dev-subset \

    --index msmarco-v1-passage.tct_colbert-v2-hnp \

    --encoded-queries tct_colbert-v2-hnp-msmarco-passage-dev-subset \

    --threads 12 --batch-size 384 \

    --output run.msmarco-passage.tct_colbert-v2.bf.tsv \

    --output-format msmarco && python -m pyserini.eval.msmarco_passage_eval msmarco-passage-dev-subset run.msmarco-passage.tct_colbert-v2.bf.tsv

Using pre-defined topic order for msmarco-passage-dev-subset

Attempting to initialize pre-encoded queries tct_colbert-v2-hnp-msmarco-passage-dev-subset.

/Users/sueszli/.cache/pyserini/queries/query-embedding-tct_colbert-v2-hnp-msmarco-passage-dev-subset-20210608-5f341b.bed8036475774d12915c8af2a44612f4 already exists, skipping download.

Initializing tct_colbert-v2-hnp-msmarco-passage-dev-subset...

Attempting to initialize pre-built index msmarco-v1-passage.tct_colbert-v2-hnp.

/Users/sueszli/.cache/pyserini/indexes/faiss.msmarco-v1-passage.tct_colbert-v2-hnp.20210608.5f341b.53bcaa78ab0ca629f3379b8aa00eb3ae already exists, skipping download.

Initializing msmarco-v1-passage.tct_colbert-v2-hnp...

Running msmarco-passage-dev-subset topics, saving to run.msmarco-passage.tct_colbert-v2.bf.tsv...

  0%|                                                                                                                 | 0/6980 [00:00<?, ?it/s]
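Both the unit tests and the faiss command stall at 0% with no further output. To make such a hang fail fast and reproducibly, a command can be wrapped in a hard time limit. This is a minimal sketch using only the standard library; `run_with_timeout` is a hypothetical helper, not part of Pyserini, and the 600-second limit is an arbitrary assumption.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and return (returncode, timed_out).

    subprocess.run kills the child process when the timeout expires,
    so a hung command does not linger in the background.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        return None, True


if __name__ == "__main__":
    # Stand-in command; replace it with the real one,
    # e.g. [sys.executable, "-m", "unittest"] as in the report above.
    code, timed_out = run_with_timeout([sys.executable, "-c", "print('ok')"], 600)
    print("timed out" if timed_out else f"exit code {code}")
```

Running the suite or the search command this way at least turns "it hangs" into a bounded, repeatable result that can be attached to the issue. It does not explain the cause of the stall.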