cltl / EventStoryLine

Materials for the StoryLine extraction task - annotated data, baselines and evaluation scripts, evaluation data.

EventStoryLine

This repository contains the following materials for the StoryLine extraction task:

  • annotated data in CAT-XML format (folder: annotated_data). To visualise the data, use CAT (Content Annotation Tool: http://dh.fbk.eu/resources/cat-content-annotation-tool). Request an account; it is free.
  • annotated data in evaluation format, extending PLOT_LINK relations to include coreference relations (folder: evaluation_format)
  • test data (folder: evaluation_format/test)
  • Python 3.* scripts for creating the evaluation format of the data, extracting the baseline systems, and evaluating the baselines' output
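
One way to picture the evaluation format mentioned above, where PLOT_LINK relations are extended to include coreference relations, is the minimal sketch below. It is not the repository's script. It assumes that "extending" means propagating each annotated link to every pair of coreferent event mentions. The function name `expand_plot_links`, the mention ids, and the input layout are hypothetical.

```python
from itertools import product


def expand_plot_links(plot_links, coref_chains):
    """Propagate each (source, target) link to all coreferent mention pairs.

    Assumption: plot_links is an iterable of (source_id, target_id) tuples and
    coref_chains is an iterable of lists of mention ids that corefer.
    Both layouts are hypothetical, not the corpus file format.
    """
    # Map each mention id to the full set of mentions in its chain.
    chain_of = {}
    for chain in coref_chains:
        members = frozenset(chain)
        for mention in members:
            chain_of[mention] = members

    expanded = set()
    for source, target in plot_links:
        # A mention that is in no chain stands for itself only.
        sources = chain_of.get(source, frozenset([source]))
        targets = chain_of.get(target, frozenset([target]))
        for s, t in product(sources, targets):
            if s != t:  # skip degenerate self-links
                expanded.add((s, t))
    return expanded


if __name__ == "__main__":
    links = [("e1", "e3")]
    chains = [["e1", "e2"], ["e3", "e4"]]
    for pair in sorted(expand_plot_links(links, chains)):
        print(pair)
```

In this toy input, the single link from e1 to e3 is propagated to the four pairs formed by the two coreference chains. The actual scripts in the repository may treat direction, link type, or cross-document chains differently.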

The corpus is still growing. Different versions will be made available in this repository as soon as they are ready.

Reference papers:

  • Caselli, T. and P. Vossen. 2016. The Storyline Annotation and Representation Scheme (StaR): A Proposal. In Proceedings of the 2nd Workshop on Computing News Storylines (CNS 2016). Held in conjunction with EMNLP 2016.
  • Caselli, T. and P. Vossen. 2017. The Event StoryLine Corpus: A New Benchmark for Causal and Temporal Relation Extraction. In Proceedings of the Events and Stories in the News (EventStory 2017). Held in conjunction with ACL 2017.

Experiments reported in Caselli and Vossen 2017 use version 0.9 of the corpus.

Version 1.0 is available.

License:Other


Languages

Language:Python 100.0%