TGOWLeR

The TGOWLeR system abstracts general patterns from workflow sequences previously extracted from texts. It comprises two modules, a workflow extractor and a pattern miner, both of which rely on a specific domain ontology.

Prerequisites

Java 1.8: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Python 2.7: https://www.python.org/download/releases/2.7
Tomcat 8.0: https://tomcat.apache.org/download-80.cgi
Sesame OpenRDF 2.7.16: https://sourceforge.net/projects/sesame/files/Sesame%202/2.7.16
GATE 8.1: https://gate.ac.uk/download/

Input/output data

The phylogenetic ontology PHAGE is available at: http://bioportal.bioontology.org/ontologies/PHAGE
Texts, extracted workflows and generated patterns are available at: https://data.world/growler/datasets

Tools

WfExtractor_1.0: Workflow Extractor on Gate 8.1

The WfExtractor_1.0 tool annotates a text corpus with the phylogenetic analysis workflows it describes. Some of the features of WfExtractor_1.0 are:

Extract workflow components (programs, parameters, data and metadata) from texts
Extract data flows (relations) from texts
Create WSD (Word Sense Disambiguation) models for both components and relations
Export the corpus as Gate inline XML

Installation:

1. Unzip the WfExtractor_1.0.zip file.
2. Import all files from $WfExtractor_HOME/plugins into the $GATE_HOME/plugins directory.
3. Load the PHAGE ontology via Tomcat (see the Sesame deployment guide).
4. If Java reports an error, configure the $TOMCAT_HOME/bin/catalina.sh file to raise the XML entity expansion limit, which the JDK caps to prevent Entity Expansion Attacks:
```
JAVA_OPTS="$JAVA_OPTS -Djdk.xml.entityExpansionLimit=100000000 -Djdk.xml.FEATURE_SECURE_PROCESSING=false -Xmx6G"
```
5. Configure the Gazetteer_LKB dictionary configuration file $WfExtractor_HOME/application-resources/Dictionary_from_remote_repository/config.ttl by changing the ontology information:
```
hr:repositoryURL <YOUR_HTTP_REPOSITORY>
rep:repositoryID "[YOUR_REPOSITORY_ID]"
rdfs:label "[YOUR_REPOSITORY_LABEL]"
```
6. Open Gate and import the application file WfExtractor1.0.xgapp from $WfExtractor_HOME.
7. Run the application (see the Gate 8.1 Developer Guide).
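Steps 1 and 2 above can be scripted. The following is a minimal sketch, not part of the distribution: the function name `install_plugins` is hypothetical, and it assumes that the archive unpacks with a `plugins` folder directly under the directory you pass as `wfextractor_home`. Check the actual archive layout before relying on it.

```python
import os
import shutil
import zipfile


def install_plugins(zip_path, wfextractor_home, gate_home):
    """Unzip the WfExtractor archive and copy its plugins into Gate's plugins directory.

    Returns the sorted names of the entries that were copied.
    """
    # Step 1: unzip the archive into the chosen home directory.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(wfextractor_home)

    # Assumption: plugins sit directly under <wfextractor_home>/plugins.
    src = os.path.join(wfextractor_home, "plugins")
    dst = os.path.join(gate_home, "plugins")
    if not os.path.isdir(dst):
        os.makedirs(dst)

    # Step 2: copy every plugin file or folder into $GATE_HOME/plugins.
    copied = []
    for name in sorted(os.listdir(src)):
        source = os.path.join(src, name)
        target = os.path.join(dst, name)
        if os.path.isdir(source):
            if os.path.isdir(target):
                # Replace a previous copy of the same plugin folder.
                shutil.rmtree(target)
            shutil.copytree(source, target)
        else:
            shutil.copy2(source, target)
        copied.append(name)
    return copied
```

A call would look like `install_plugins("WfExtractor_1.0.zip", os.environ["WfExtractor_HOME"], os.environ["GATE_HOME"])`, provided both environment variables are set. Steps 3 to 7 still have to be done by hand.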

WfMiner_1.1: Workflow Pattern Miner and Rule Recommender

WfMiner_1.1 mines abstract closed patterns and generates associations from XML workflow sequence files and a specific domain ontology.
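To illustrate what an "abstract" pattern means here, the sketch below counts the support of a sequential pattern whose items are ontology concepts: a workflow step matches a pattern item if it is that concept or one of its descendants. This is only an illustration of the idea, not WfMiner's actual algorithm (it does not check closedness or generate rules). The function names and the tiny hierarchy in the usage note are hypothetical, and the hierarchy is assumed to be a tree given as a child-to-parent mapping.

```python
def ancestors(concept, parent_of):
    """Return the concept together with all of its ancestors in the hierarchy."""
    found = {concept}
    # Assumption: parent_of is a tree (no cycles), so this loop terminates.
    while concept in parent_of:
        concept = parent_of[concept]
        found.add(concept)
    return found


def matches(pattern, sequence, parent_of):
    """True if pattern occurs as a (possibly generalised) subsequence of sequence."""
    position = 0
    for item in sequence:
        if position < len(pattern) and pattern[position] in ancestors(item, parent_of):
            position += 1
    return position == len(pattern)


def support(pattern, sequences, parent_of):
    """Fraction of sequences that contain the pattern."""
    hits = sum(1 for sequence in sequences if matches(pattern, sequence, parent_of))
    return float(hits) / len(sequences)
```

For example, with the made-up hierarchy `{"ClustalW": "Alignment", "MUSCLE": "Alignment", "PhyML": "TreeInference", "RAxML": "TreeInference"}`, the abstract pattern `["Alignment", "TreeInference"]` is supported by both `["ClustalW", "PhyML"]` and `["MUSCLE", "RAxML"]`, whereas the concrete pattern `["ClustalW", "PhyML"]` is supported by only the first.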

Installation:

1. Launch the bowlUtil_0.5 tool to transform the OWL ontology into a binary (bowl) one (see the README file in $WFMINER_HOME/bowlUtil/). The bowl transformation speeds up the mining process by loading a lighter version of the ontology. Note: to skip this step, use the bowl version of the ontology from the input data (above), and don't forget to download the Gene Ontology (OWL version).
2. Launch the WfMiner miner from your shell with the following command (see the README file in $WFMINER_HOME/): java -jar [PATH_TO]/OntoPattern16.jar "[minSupp]" "[PATH_TO]/[bowl_file]" "[PATH_TO]/[train_set]" "[namespace]" "[PATH_TO]/[test_set]" "[topkItems]" "[topnRules]" "[min_ontology_level]"
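The miner takes eight positional arguments, so it is easy to pass them in the wrong order. The sketch below assembles the argument list in the order given above; the helper name `build_wfminer_command` is hypothetical, and the expected value ranges and units of each argument are not documented here, so consult the README in $WFMINER_HOME/ for them.

```python
def build_wfminer_command(jar, min_supp, bowl_file, train_set, namespace,
                          test_set, top_k_items, top_n_rules, min_ontology_level):
    """Return the WfMiner command line as a list suitable for subprocess.call."""
    return [
        "java", "-jar", jar,
        str(min_supp),            # [minSupp]
        bowl_file,                # [PATH_TO]/[bowl_file]
        train_set,                # [PATH_TO]/[train_set]
        namespace,                # [namespace]
        test_set,                 # [PATH_TO]/[test_set]
        str(top_k_items),         # [topkItems]
        str(top_n_rules),         # [topnRules]
        str(min_ontology_level),  # [min_ontology_level]
    ]
```

Passing the resulting list to `subprocess.call` avoids shell quoting problems with paths that contain spaces.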

Other TGOWLeR tools

WfTransformer_1.0:

This tool transforms the Gate inline XML workflows into sequences of events (encoded in a simple XML tree).
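As a rough illustration of this kind of transformation (not the tool's actual code), the sketch below walks an inline-annotated XML document in reading order and emits one event per workflow component. The annotation tag names (`Program`, `Parameter`, `Data`, `Metadata`) and the output element names (`sequence`, `event`) are assumptions made for the example; the real schemas may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation types, modelled on the component kinds listed above.
COMPONENT_TAGS = {"Program", "Parameter", "Data", "Metadata"}


def inline_xml_to_sequence(inline_xml):
    """Turn inline annotations into a flat <sequence> of <event> elements."""
    root = ET.fromstring(inline_xml)
    sequence = ET.Element("sequence")
    # iter() visits elements in document order, which preserves the text order.
    for element in root.iter():
        if element.tag in COMPONENT_TAGS:
            event = ET.SubElement(sequence, "event", {"type": element.tag})
            event.text = "".join(element.itertext()).strip()
    return sequence
```

Calling `ET.tostring(inline_xml_to_sequence(text))` on an annotated sentence then yields the serialized event sequence.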

WfSimulator_1.0:

This tool simulates phylogenetic workflows using instances encoded in the PHAGE ontology, with a priori abstract patterns provided by an expert guiding workflow reconstruction. The simulator is based on a Monte Carlo simulation that fixes a number of parameters in each run to generate event sequences.
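The general idea can be sketched as follows, under strong simplifying assumptions: an abstract pattern is a list of concepts, each concept maps to a list of candidate instances, some concepts are pinned to a fixed instance for the whole batch of runs, and the rest are drawn uniformly at random. The function name, the data layout and the uniform sampling are all hypothetical and are not taken from WfSimulator itself.

```python
import random


def simulate_sequences(pattern, instances, fixed, runs, seed=None):
    """Generate `runs` event sequences by instantiating an abstract pattern.

    pattern   -- ordered list of abstract concepts
    instances -- dict mapping each concept to its candidate instances
    fixed     -- dict mapping some concepts to the instance held constant
    """
    rng = random.Random(seed)  # a seed makes the runs reproducible
    sequences = []
    for _ in range(runs):
        events = []
        for concept in pattern:
            if concept in fixed:
                events.append(fixed[concept])  # parameter fixed for this batch
            else:
                events.append(rng.choice(instances[concept]))  # random draw
        sequences.append(events)
    return sequences
```

For instance, fixing `"Alignment"` to `"MUSCLE"` while sampling `"TreeInference"` from a list of programs produces sequences that share the same alignment step and vary only in the tree-building step.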

Contact

For any technical issues, please e-mail admin: halioui.ahmed@uqam.ca

http://labo.bioinfo.uqam.ca/halioui.php

About

http://labo.bioinfo.uqam.ca/tgowler
