luyug / Condenser

EMNLP 2021 - Pre-training architectures for dense retrieval

How are Wikipedia article titles used during pre-training?

shunyuzh opened this issue

Hi Luyu, @luyug

Thanks for your interesting work!

How are the Wikipedia article titles used during pre-training, especially during coCondenser pre-training?

Is each title simply one more span in the list of spans?

{'spans': List[str]} ...
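To make the question concrete, here is a minimal sketch of what "title as just another span" would look like in that format. The helper `make_record`, its `include_title` flag, and the sample strings are hypothetical and only for illustration; whether the title is actually placed in the span list is exactly what is being asked, so this is an assumption and not the repository's confirmed behaviour.

```python
import json


def make_record(title, passages, include_title=True):
    """Build one record in the {'spans': List[str]} shape.

    Hypothetical helper: putting the title first in the span list is an
    assumption for illustration, not confirmed preprocessing behaviour.
    """
    spans = ([title] if include_title else []) + list(passages)
    return {"spans": spans}


# Made-up sample document: one title plus two text spans.
record = make_record(
    "Dense retrieval",
    ["First passage of the article.", "Second passage of the article."],
)

# One JSON object per line, as in a JSON-lines training file.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

With `include_title=False` the same helper yields a record containing only the text spans, which is the alternative reading of the format.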

Hello, may I ask how you split the text into spans: at the sentence level or at the paragraph level?