stevezheng23 / xlnet_extension_tf

XLNet Extension in TensorFlow

Question about InputFeature generation in CoQA

silencio94 opened this issue · comments

Thanks for uploading such clean, readable code along with the experiment settings and results.

By the way, I have a question about CoQA InputFeature generation.
The InputFeature generation code seems to assume that every doc span contains the rationale needed to answer a free-form question.

That means that when a doc span contains no clue for answering a free-form question, it can be labeled incorrectly.
Is this intended, or is there something I haven't understood?
Thanks for your work and have a good day!

Thanks, @silencio94 !

Although you have closed this issue, I'd like to explain a little more. When a context is longer than max_seq_len (e.g. 512) during pre-processing, it is sliced into multiple sub-contexts that each fit within max_seq_len, and some of those sub-contexts will contain no answer. If the free-form answer can't be found in a sub-context, the answer start/end are intentionally labeled as the first token (which is the [CLS] token).

Best,
Xiaoming

If you feel happy, please star me :)
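
For readers following along, the labeling scheme described in the reply above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `split_into_spans` and `label_span`, the `stride` and `doc_offset` parameters, and the default `cls_index=0` are all assumptions made for the example.

```python
from typing import List, Tuple


def split_into_spans(num_doc_tokens: int,
                     max_span_len: int,
                     stride: int) -> List[Tuple[int, int]]:
    """Slice a long document into overlapping (start, length) windows.

    Hypothetical helper: `stride` is how far the window moves each step.
    """
    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(max_span_len, num_doc_tokens - start)
        spans.append((start, length))
        if start + length >= num_doc_tokens:
            break  # the last window already reaches the end of the document
        start += min(length, stride)
    return spans


def label_span(span_start: int,
               span_len: int,
               answer_start: int,
               answer_end: int,
               doc_offset: int,
               cls_index: int = 0) -> Tuple[int, int]:
    """Return (start, end) answer positions for one window.

    `answer_start`/`answer_end` are inclusive document-level token indices.
    `doc_offset` is the assumed number of tokens placed before the document
    in the model input (special tokens, question, etc.).
    If the answer is not fully inside the window, both positions point at
    `cls_index`, which marks the window as containing no answer.
    """
    span_end = span_start + span_len - 1
    if answer_start >= span_start and answer_end <= span_end:
        return (answer_start - span_start + doc_offset,
                answer_end - span_start + doc_offset)
    return (cls_index, cls_index)


if __name__ == "__main__":
    # Toy document of 10 tokens, answer at tokens 5..6 (inclusive).
    for start, length in split_into_spans(10, max_span_len=4, stride=2):
        print((start, length), label_span(start, length, 5, 6, doc_offset=3))
```

Under this scheme, a window that does not contain the answer is not mislabeled; pointing both positions at the `[CLS]` index is how the window is marked as having no answer.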

I've been reading some CQA code lately, and I probably got confused by it 😂. Thank you for the kind explanation!