paramitamirza / CATENA

Query regarding input to CATENA

svjan5 opened this issue

Hi,
I am having trouble converting my raw documents into the input format you have specified.

Some specific queries regarding the input format:
| token | token-id | sentence-id | lemma | event-id | event-class | event-tense+aspect+polarity | timex-id | timex-type | timex-value | signal-id | causal-signal-id | pos-tag | chunk | lemma | pos-tag | dependencies | main-verb |

  1. Why does the format ask for the same information twice (lemma and pos-tag each appear in two columns)?
  2. The meaning of the chunk column is not intuitive (its description is missing from the wiki).
  3. It is unclear which attributes are optional in the input.
  4. Is there a standard library for converting raw documents into the required format?

I would be grateful if you could help resolve these queries.

Thanks in advance
Shikhar