thunlp / HMEAE

Source code for EMNLP-IJCNLP 2019 paper "HMEAE: Hierarchical Modular Event Argument Extraction".

Question about the special token which indicates event type. Thank you

xixy opened this issue

Thank you for releasing the source code.

I noticed that DMBERT has a special token to indicate the event type when detecting arguments.

"To utilize the event type information in our model, we append a special token into each input sequence for BERT to indicate the event type."

Could you give more details about this operation? An example would be helpful. Taking an attack event as an example, the input might look like the following:

[CLS] [Token1] [Token2] [Token3] [Token4]...[Token 128] [SEP] [ATTACK]

What is the special token? Something like [Attack] or #ATTACK#?

If the special token doesn't exist in BERT's vocab file, how do you initialize the representation for the token?

Thank you, and I look forward to your reply.

The special tokens are the original [unusedXXX] tokens in BERT's vocabulary. For instance, we can use the [unused0] token to indicate the beginning of an "attack" event, and so on. I have started a temporary repo sharing a recent implementation of DMBERT for ED. Maybe that code can help.
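
To make the idea concrete, here is a minimal sketch (not the authors' code) of how an [unusedXXX] token could serve as an event-type marker. Because these tokens already have entries in the pretrained vocabulary and embedding matrix, no extra initialization is needed; their embeddings are simply updated during fine-tuning. The mapping `EVENT_TYPE_TO_TOKEN`, the helper `build_input`, the specific index assignments, and the placement of the marker after [SEP] (taken from the layout in the question above) are all assumptions for illustration.

```python
# Hypothetical mapping from event types to BERT's reserved [unusedN] tokens.
# Which index goes with which event type is an assumption, not from the repo.
EVENT_TYPE_TO_TOKEN = {
    "Attack": "[unused0]",
    "Transport": "[unused1]",
}


def build_input(tokens, event_type, max_len=128):
    """Return a BERT-style token sequence with an event-type marker appended.

    `tokens` is expected to be already WordPiece-tokenized, so the marker is
    added at the token level rather than as raw text.
    """
    marker = EVENT_TYPE_TO_TOKEN[event_type]
    # Reserve three positions for [CLS], [SEP], and the marker itself.
    body = tokens[: max_len - 3]
    return ["[CLS]"] + body + ["[SEP]", marker]


example = build_input(["troops", "attacked", "the", "city"], "Attack")
print(example)
# ['[CLS]', 'troops', 'attacked', 'the', 'city', '[SEP]', '[unused0]']
```

The resulting token list can then be converted to ids with the vocabulary of whichever BERT checkpoint is in use. Adding the marker after tokenization is a deliberate choice in this sketch, since a tokenizer may otherwise split the bracketed string into several pieces.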