a GUI for manual alignment of graph pairs

================================================================================

	ALGRAEPH README

================================================================================

Algraeph
Version 1.0

Copyright (C) 2007-2013 by Erwin Marsi and TST-Centrale

https://github.com/emsrc/algraeph
e.marsi@gmail.com


--------------------------------------------------------------------------------
	DESCRIPTION
--------------------------------------------------------------------------------

Algraeph is a tool for manual alignment of linguistic graphs, such as
phrase structure trees or dependency structures, where each node
corresponds to a subsequence of the analyzed input sentence. It allows
you to express the similarity between two graphs by aligning their
nodes and attaching relation labels to these alignments.

Graphs are read from one or more graphbanks (or treebanks). Algraeph
currently supports graphs in the general GraphML format and in the
Alpino format (for Dutch). Alignment relations are user-defined. The
alignments are stored in a simple XML format, which can be used for
further processing.  The result - a parallel graph corpus - is a
useful data set for many tasks in computational linguistics and
natural language processing such as automatic summarization, automatic
translation, paraphrase extraction, recognizing textual entailment,
etc.
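
As a rough illustration of what "further processing" of a GraphML graphbank
might look like, the sketch below counts the nodes and edges of every graph in
a GraphML document using only the Python standard library. Only the GraphML
namespace is standard; the sample document, its "label" key and the function
name summarize_graphbank are assumptions made for this example and are not
taken from Algraeph or its data files.

```python
import xml.etree.ElementTree as ET

# The standard GraphML namespace; everything else here is illustrative.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# A made-up graphbank with two tiny graphs. Real Algraeph graphbanks
# may use different keys and attributes.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="label" for="node" attr.name="label" attr.type="string"/>
  <graph id="s1" edgedefault="directed">
    <node id="n0"><data key="label">S</data></node>
    <node id="n1"><data key="label">NP</data></node>
    <node id="n2"><data key="label">VP</data></node>
    <edge source="n0" target="n1"/>
    <edge source="n0" target="n2"/>
  </graph>
  <graph id="s2" edgedefault="directed">
    <node id="n0"><data key="label">S</data></node>
  </graph>
</graphml>
"""


def summarize_graphbank(xml_text):
    """Return {graph id: (node count, edge count)} for each top-level graph."""
    root = ET.fromstring(xml_text)
    summary = {}
    for graph in root.findall("g:graph", GRAPHML_NS):
        nodes = graph.findall("g:node", GRAPHML_NS)
        edges = graph.findall("g:edge", GRAPHML_NS)
        summary[graph.get("id")] = (len(nodes), len(edges))
    return summary


if __name__ == "__main__":
    for graph_id, (n_nodes, n_edges) in summarize_graphbank(SAMPLE).items():
        print(graph_id, n_nodes, "nodes,", n_edges, "edges")
```

To try this on one of the bundled graphbanks, read the file contents and pass
them to summarize_graphbank; whether the result is meaningful depends on how
that particular file is organized.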

Algraeph is implemented in the Python programming language using the
wxPython GUI toolkit. It has been tested on Mac OS X, GNU/Linux and MS
Windows, but should run on any platform that is supported by Python,
wxPython and Graphviz.


--------------------------------------------------------------------------------
	LICENSE & USAGE
--------------------------------------------------------------------------------

Algraeph is licensed under the GNU General Public License. For detailed
license information see the file COPYING.

Algraeph is provided free of charge. In return I would like to ask the
following. In technical or scientific publications about research in
which Algraeph was used, please refer to the most appropriate paper
from the list below:

    Erwin Marsi and Emiel Krahmer, "Annotating a parallel monolingual
    treebank with semantic similarity relations". In: Proceedings of
    the Sixth International Workshop on Treebanks and Linguistic
    Theories, December 7-8, 2007, Bergen, Norway.

    Erwin Marsi and Emiel Krahmer, "Explorations in Sentence Fusion". In:
    Proceedings of the 10th European Workshop on Natural Language
    Generation, Aberdeen, GB, 8-10 August 2005,  pages 109-117.

    Erwin Marsi and Emiel Krahmer, "Classification of semantic
    relations by humans and machines". In: Proceedings of the ACL 2005
    workshop on Empirical Modeling of Semantic Equivalence and
    Entailment, University of Michigan, Ann Arbor, USA, 2005, pages
    1-6.

In other cases of commercial or educational use, please link or refer
to the Algraeph webpage https://github.com/emsrc/algraeph


--------------------------------------------------------------------------------
	INSTALLATION
--------------------------------------------------------------------------------

For installation instructions see the file INSTALL.


--------------------------------------------------------------------------------
	USAGE
--------------------------------------------------------------------------------

For documentation see the Algraeph User Manual located under the Help
menu, or in the file doc/algraeph-user-manual.htm.


--------------------------------------------------------------------------------
	DATA
--------------------------------------------------------------------------------

The "data" directory contains a number of examples, where each subdirectory
contains a parallel graph corpus with its graphbank(s):
	
	data
	|-- alpino
	|   `-- saint
	|       |-- algraeph-saint-1_15.xml
	|       |-- saintaltena-syntax.xml
	|       `-- sainthamel-syntax.xml
	`-- graphml
	    |-- rte
	    |   |-- algraeph_RTE2_dev-dep-SUM.xml
	    |   `-- graphbank_RTE2_dev-dep-SUM.xml
	    `-- simple
	        |-- algraeph-simple-alignment.xml
	        |-- simple-dutch-graphbank.xml
	        `-- simple-english-graphbank.xml
	
The corpus algraeph-simple-alignment.xml is a minimal example, consisting of
just two pairs of English sentences and their Dutch translations, that
illustrates how similar nodes can be aligned in phrase structure trees encoded
in GraphML.

The corpus algraeph-saint-1_15.xml is a small sample from the Daeso corpus,
which is currently under development. The text material consists of the first
15 sentences from two alternative Dutch translations of the book "Le Petit
Prince" by Antoine de Saint-Exupéry
(http://en.wikipedia.org/wiki/Le_petit_prince). The parse trees were produced
by the Alpino parser (http://www.let.rug.nl/vannoord/alp/Alpino/). Nodes were
manually aligned and labeled using the set of five semantic relations
described in [Marsi & Krahmer 2005].

The corpus algraeph_RTE2_dev-dep-SUM.xml is derived from 200 text-hypothesis
pairs from the second Pascal Challenge on Recognizing Textual Entailment
(http://www.pascal-network.org/Challenges/RTE2/), in particular the
"summarization" task from the development set. Sentences were parsed with the
Minipar parser (http://www.cs.ualberta.ca/~lindek/minipar.htm) and converted
to GraphML format. The first 10 sentence pairs are aligned according to the
same five relations mentioned earlier. The idea is that of "tree inclusion"
(implicit in many approaches to RTE): the text (tree on the left) entails the
hypothesis (tree on the right), if (nearly) all nodes in the hypothesis are
aligned and labeled as "equals", "restates" or "specifies".
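
The "tree inclusion" idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function name entails, the
representation of alignments as (text node, hypothesis node, relation)
triples, the placeholder relation "other" and the threshold standing in for
"(nearly) all" are all hypothetical and not part of Algraeph.

```python
# Relation labels that count as support for entailment, as described above.
ENTAILING_RELATIONS = {"equals", "restates", "specifies"}


def entails(hypothesis_nodes, alignments, threshold=0.9):
    """Decide entailment by tree inclusion.

    hypothesis_nodes: iterable of node ids in the hypothesis graph
    alignments: iterable of (text_node, hypothesis_node, relation) triples
    threshold: hypothetical cut-off standing in for "(nearly) all"
    """
    nodes = set(hypothesis_nodes)
    if not nodes:
        return False
    # Hypothesis nodes covered by at least one entailing alignment.
    covered = {
        hyp
        for _text, hyp, relation in alignments
        if relation in ENTAILING_RELATIONS and hyp in nodes
    }
    return len(covered) / len(nodes) >= threshold


if __name__ == "__main__":
    hypothesis = ["h0", "h1", "h2", "h3"]
    alignments = [
        ("t0", "h0", "equals"),
        ("t3", "h1", "restates"),
        ("t5", "h2", "specifies"),
        ("t7", "h3", "other"),  # placeholder for a non-entailing relation
    ]
    print(entails(hypothesis, alignments))                  # 3 of 4 covered
    print(entails(hypothesis, alignments, threshold=0.75))  # looser cut-off
```

With the default threshold the example pair is rejected, because only three of
the four hypothesis nodes carry an entailing label; lowering the threshold to
0.75 accepts it. What counts as "nearly all" is a design choice left open here.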


--------------------------------------------------------------------------------
	CONTACT
--------------------------------------------------------------------------------

For questions, bug reports or feature requests, please contact 
Erwin Marsi at e.marsi@gmail.com.


--------------------------------------------------------------------------------
	ACKNOWLEDGEMENTS
--------------------------------------------------------------------------------

This software was developed within the DAESO research project
(http://daeso.uvt.nl) funded by the Stevin programme
(http://taalunieversum.org/taal/technologie/stevin/)

Algraeph would have been much harder to build without:

- the Python programming language (http://www.python.org/)
- the wxPython GUI toolkit (http://www.wxpython.org/)
- the Graphviz graph visualization software (http://www.graphviz.org/)
- the NetworkX Python package for creating and manipulating graphs and networks 
  (https://networkx.lanl.gov/)
- Wingware's Python IDE (http://www.wingware.com/)
- the packaging programs 
   py2app (http://svn.pythonmac.org/py2app/py2app/trunk/doc/index.html), 
   py2exe (http://www.py2exe.org/), and 
   Inno Setup (http://www.jrsoftware.org/isinfo.php)
- the feedback from annotators Bas Bareman, Fleur van Dongen, Nienke Eckhardt, 
  Koen van Lierop, Vera Nijveld, Hanneke Schoormans, 

 










License: GNU General Public License v3.0

