ShannonAI / mrc-for-flat-nested-ner

Code for ACL 2020 paper `A Unified MRC Framework for Named Entity Recognition`

a bug in mrc.utils.py

scarydemon2 opened this issue

if len(query_tokens)+2+offset_idx_dict[int(s_idx)] <= max_tokens_for_doc and \

Maybe "max_tokens_for_doc" should be replaced by "max_seq_length", because the "doc span pos" matrix is limited by max_seq_length.

I found this bug too and agree with you. I think it should be either
if len(query_tokens)+2+offset_idx_dict[int(s_idx)] <= max_seq_length and \
or
if offset_idx_dict[int(s_idx)] <= max_tokens_for_doc and \
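
To make the difference between the three conditions concrete, here is a minimal sketch. It is not the repository's code: the function names are hypothetical, and it assumes the common BERT-style convention that max_tokens_for_doc = max_seq_length - len(query_tokens) - 3 (one [CLS] and two [SEP] tokens), which this thread does not confirm. Under that assumption, the original check counts the query length against a budget that has already had the query subtracted from it, so it can reject positions that still fit.

```python
# Hypothetical helpers; names and the "- 3" convention are assumptions,
# not taken from the repository.

def doc_budget(max_seq_length, query_len):
    # Assumed convention: reserve room for the query, [CLS], and two [SEP] tokens.
    return max_seq_length - query_len - 3

def fits_original(query_len, doc_offset, max_tokens_for_doc):
    # The condition reported in this issue: a query-shifted position
    # is compared against the document-only budget.
    return query_len + 2 + doc_offset <= max_tokens_for_doc

def fits_by_seq_length(query_len, doc_offset, max_seq_length):
    # First proposed fix: query-shifted position vs. full sequence length.
    return query_len + 2 + doc_offset <= max_seq_length

def fits_by_doc_budget(doc_offset, max_tokens_for_doc):
    # Second proposed fix: document-relative position vs. document-only budget.
    return doc_offset <= max_tokens_for_doc

max_seq_length = 128
query_len = 20
budget = doc_budget(max_seq_length, query_len)
doc_offset = 90  # a token position inside the document span

print(fits_original(query_len, doc_offset, budget))
print(fits_by_seq_length(query_len, doc_offset, max_seq_length))
print(fits_by_doc_budget(doc_offset, budget))
```

Note that under this assumed convention the two proposed fixes are close but not identical at the boundary, so whichever one is used should match how the span matrix is actually sized.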

Apologies for the late reply.

Thanks for pointing out my mistake.
Yes, this is a bug I introduced while cleaning up my codebase.
I fixed it in commit f80ed26. Please pull the latest version of the repo.

Many thanks!