a bug in mrc.utils.py
scarydemon2 opened this issue
Tianhao Gao commented
Maybe "max_tokens_for_doc" should be replaced by "max_seq_length", because the "doc span pos" matrix is limited by max_seq_length.
JaeZheng commented
I ran into this bug too and agree with you. I think the condition should be either
if len(query_tokens)+2+offset_idx_dict[int(s_idx)] <= max_seq_length and \
or
if offset_idx_dict[int(s_idx)] <= max_tokens_for_doc and \
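To make the difference between the two limits concrete, here is a minimal sketch. It is not the repository's code. It assumes the input layout is [CLS] query [SEP] document [SEP], that max_tokens_for_doc is max_seq_length minus the query length minus 3 special tokens, and that the buggy line compared the shifted index against max_tokens_for_doc. None of these are confirmed in this thread, and all function names are hypothetical.

```python
def budget_for_doc(max_seq_length, num_query_tokens, num_special=3):
    # Assumption: [CLS] + query + [SEP] + document + [SEP], hence 3 special tokens.
    return max_seq_length - num_query_tokens - num_special


def mixed_check(doc_offset, num_query_tokens, max_tokens_for_doc):
    # Presumed buggy form: a full-sequence index compared with a
    # document-only budget, so the query length is counted twice.
    return num_query_tokens + 2 + doc_offset <= max_tokens_for_doc


def check_against_seq_length(doc_offset, num_query_tokens, max_seq_length):
    # First proposed fix: full-sequence index vs. full-sequence limit.
    return num_query_tokens + 2 + doc_offset <= max_seq_length


def check_against_doc_budget(doc_offset, max_tokens_for_doc):
    # Second proposed fix: document-only offset vs. document-only budget.
    return doc_offset <= max_tokens_for_doc


if __name__ == "__main__":
    max_seq_length, num_query_tokens, doc_offset = 16, 5, 4
    doc_budget = budget_for_doc(max_seq_length, num_query_tokens)
    print(mixed_check(doc_offset, num_query_tokens, doc_budget))
    print(check_against_seq_length(doc_offset, num_query_tokens, max_seq_length))
    print(check_against_doc_budget(doc_offset, doc_budget))
```

With these toy numbers the mixed check rejects a span that both proposed conditions accept. The two fixes are not guaranteed to agree at the boundary, and whether `<=` or `<` is the right comparison depends on indexing details that are not shown here.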
Deleted user commented
Apologies for the late reply.
Thanks for pointing out my mistake.
Yes, this is a bug I introduced while cleaning up my codebase.
I fixed it in commit f80ed26. Please pull the latest version of the repo.
Many thanks!