TGOWLeR
The TGOWLeR system abstracts general patterns from workflow sequences previously extracted from texts. It comprises two modules, a workflow extractor and a pattern miner, both of which rely on a domain-specific ontology.
Prerequisites
- JAVA 1.8: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
- PYTHON 2.7: https://www.python.org/download/releases/2.7
- TOMCAT 8.0: https://tomcat.apache.org/download-80.cgi
- SESAME OpenRDF 2.7.16: https://sourceforge.net/projects/sesame/files/Sesame%202/2.7.16
- GATE 8.1: https://gate.ac.uk/download/
Input/output data
- The phylogenetic ontology PHAGE is available at: http://bioportal.bioontology.org/ontologies/PHAGE
- Texts, extracted workflows and generated patterns are available at: https://data.world/growler/datasets
Tools
WfExtractor_1.0: Workflow Extractor on GATE 8.1
The WfExtractor_1.0 tool annotates a text corpus with the phylogenetic analysis workflows it describes. Some of the features of WfExtractor_1.0 are:
- Extract workflow components (programs, parameters, data and metadata) from texts
- Extract data flows (relations) from texts
- Create WSD (Word Sense Disambiguation) models for both components and relations
- Export the corpus as GATE inline XML
Installation:
1. Unzip the WfExtractor_1.0.zip file.
2. Import all files from $WfExtractor_HOME/plugins into the $GATE_HOME/plugins directory.
3. Load the PHAGE ontology via Tomcat (see Installation: Sesame deployment guide).
4. If Java reports an error, edit the $TOMCAT_HOME/bin/catalina.sh file to raise the XML entity expansion limit and the maximum heap size. Note that this relaxes the JDK safeguard against entity expansion attacks:
   ```
   JAVA_OPTS="$JAVA_OPTS -Djdk.xml.entityExpansionLimit=100000000 -Djdk.xml.FEATURE_SECURE_PROCESSING=false -Xmx6G"
   ```
5. Edit the Gazetteer_LKB dictionary configuration file $WfExtractor_HOME/application-resources/Dictionary_from_remote_repository/config.ttl and change the ontology information:
   ```
   hr:repositoryURL <YOUR_HTTP_REPOSITORY>
   rep:repositoryID "[YOUR_REPOSITORY_ID]"
   rdfs:label "[YOUR_REPOSITORY_LABEL]"
   ```
6. Open GATE and import the application file WfExtractor1.0.xgapp from $WfExtractor_HOME.
7. Run the application (see the GATE 8.1 Developer Guide).
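As an illustration of step 5, the sketch below fills in the three repository placeholders of config.ttl. The function name `render_repository_settings` and the example values (a local Tomcat on port 8080 hosting a Sesame repository called `phage`) are assumptions made for this example and are not part of WfExtractor. The rest of config.ttl, including its Turtle punctuation, should be kept as shipped with the tool.

```python
def render_repository_settings(repository_url, repository_id, repository_label):
    """Return the three ontology-related lines of config.ttl with values filled in."""
    lines = [
        'hr:repositoryURL <{0}>'.format(repository_url),
        'rep:repositoryID "{0}"'.format(repository_id),
        'rdfs:label "{0}"'.format(repository_label),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values: adapt the host, port and repository name to your setup.
    print(render_repository_settings(
        "http://localhost:8080/openrdf-sesame/repositories/phage",
        "phage",
        "PHAGE ontology",
    ))
```

The repository URL is written between angle brackets because it is an IRI in Turtle, while the identifier and label are quoted string literals.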
WfMiner_1.1: Workflow Pattern Miner and Rule Recommender
WfMiner_1.1 mines abstract closed patterns and generates association rules from XML workflow sequence files and a domain-specific ontology.
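To give an intuition for what an abstract pattern and its support might look like, here is a minimal sketch. It is not the WfMiner algorithm: it only counts how many workflow sequences contain an ordered subsequence whose items are generalized by the pattern's ontology concepts, and it does not enumerate closed patterns. The concept names, the `PARENT` hierarchy and the reading of support as a fraction of sequences are all assumptions made for this example.

```python
# Hypothetical child -> parent links standing in for the domain ontology.
PARENT = {
    "ClustalW": "AlignmentProgram",
    "MUSCLE": "AlignmentProgram",
    "PhyML": "TreeInferenceProgram",
    "MrBayes": "TreeInferenceProgram",
    "AlignmentProgram": "Program",
    "TreeInferenceProgram": "Program",
}


def ancestors_or_self(concept, parent):
    """Return the concept together with all of its ancestors."""
    found = {concept}
    while concept in parent:
        concept = parent[concept]
        found.add(concept)
    return found


def matches(pattern, sequence, parent):
    """True if the pattern generalizes an ordered subsequence of the sequence."""
    position = 0
    for item in sequence:
        if position < len(pattern) and pattern[position] in ancestors_or_self(item, parent):
            position += 1
    return position == len(pattern)


def support(pattern, sequences, parent):
    """Fraction of sequences matched by the abstract pattern."""
    hits = sum(1 for sequence in sequences if matches(pattern, sequence, parent))
    return hits / float(len(sequences))


if __name__ == "__main__":
    workflows = [["ClustalW", "PhyML"], ["MUSCLE", "MrBayes"], ["PhyML", "ClustalW"]]
    print(support(["AlignmentProgram", "TreeInferenceProgram"], workflows, PARENT))
```

In this toy setting, a pattern would be kept only if its support reaches a chosen minimum threshold; moving a pattern item up the hierarchy can only keep or increase its support.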
Installation:
- Launch the bowlUtil_0.5 tool to transform the OWL ontology into a binary (bowl) one (see the README file in $WFMINER_HOME/bowlUtil/). The bowl transformation speeds up the mining process by loading a lighter version of the ontology. Note: to skip this step, use the bowl version of the ontology from the input data (above), and don't forget to download the Gene Ontology (OWL version).
- Launch the WfMiner miner from your shell with the following command (see the README file in $WFMINER_HOME/):
java -jar [PATH_TO]/OntoPattern16.jar "[minSupp]" "[PATH_TO]/[bowl_file]" "[PATH_TO]/[train_set]" "[namespace]" "[PATH_TO]/[test_set]" "[topkItems]" "[topnRules]" "[min_ontology_level]"
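Because the miner takes eight positional arguments, it can be convenient to assemble the command from named values. The following is a minimal sketch of such a wrapper. The function name `build_wfminer_command` and every value in the usage example are hypothetical placeholders; only the argument order is taken from the command above.

```python
import subprocess


def build_wfminer_command(jar_path, min_supp, bowl_file, train_set, namespace,
                          test_set, top_k_items, top_n_rules, min_ontology_level):
    """Assemble the argument list in the positional order of the command above."""
    return [
        "java", "-jar", jar_path,
        str(min_supp), bowl_file, train_set, namespace, test_set,
        str(top_k_items), str(top_n_rules), str(min_ontology_level),
    ]


def run_wfminer(command):
    """Launch the miner and raise an error if it exits with a non-zero status."""
    subprocess.check_call(command)


if __name__ == "__main__":
    # All paths and values below are placeholders, not recommended settings.
    command = build_wfminer_command(
        jar_path="/path/to/OntoPattern16.jar",
        min_supp=2,
        bowl_file="/path/to/ontology.bowl",
        train_set="/path/to/train.xml",
        namespace="http://example.org/ontology#",
        test_set="/path/to/test.xml",
        top_k_items=10,
        top_n_rules=5,
        min_ontology_level=1,
    )
    print(" ".join(command))  # inspect the command before calling run_wfminer(command)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.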
Other TGOWLeR tools
WfTransformer_1.0:
This tool transforms the GATE inline XML workflows into sequences of events (encoded in a simple XML tree).
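The idea of this transformation can be sketched as follows. This is not the WfTransformer code: the inline tag names (`Program`, `Parameter`, `Data`, `Metadata`), the sample sentence and the output element names (`sequence`, `event`) are assumptions chosen for this example, and the actual formats may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation tags, one per workflow component type.
COMPONENT_TAGS = ("Program", "Parameter", "Data", "Metadata")

# Hypothetical inline-annotated sentence.
SAMPLE = (
    "<doc>Sequences were aligned with <Program>ClustalW</Program> using "
    "<Parameter>default settings</Parameter>, and trees were built with "
    "<Program>PhyML</Program>.</doc>"
)


def inline_xml_to_events(inline_xml):
    """Collect annotated components, in document order, into a flat event tree."""
    root = ET.fromstring(inline_xml)
    sequence = ET.Element("sequence")
    for element in root.iter():
        if element.tag in COMPONENT_TAGS:
            event = ET.SubElement(sequence, "event", {"type": element.tag})
            event.text = "".join(element.itertext()).strip()
    return sequence


if __name__ == "__main__":
    print(ET.tostring(inline_xml_to_events(SAMPLE)).decode("utf-8"))
```

Unannotated text is dropped, so only the ordered list of components remains as input for pattern mining.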
WfSimulator_1.0:
This tool simulates phylogenetic workflows using instances encoded in the PHAGE ontology. It uses a priori abstract patterns provided by an expert to guide workflow reconstruction. The simulator is based on a Monte Carlo simulation that fixes a number of parameters on each run to generate event sequences.
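A minimal sketch of this kind of simulation is given below. It is not the WfSimulator implementation: the abstract pattern, the class names, the instance lists and the uniform sampling are assumptions for this example, and "fixing parameters" is interpreted here as holding some choices constant across runs.

```python
import random

# Hypothetical expert pattern: an ordered list of ontology classes.
ABSTRACT_PATTERN = ["AlignmentProgram", "TreeInferenceProgram", "TreeVisualizationProgram"]

# Hypothetical instances of each class.
INSTANCES = {
    "AlignmentProgram": ["ClustalW", "MUSCLE", "MAFFT"],
    "TreeInferenceProgram": ["PhyML", "MrBayes", "RAxML"],
    "TreeVisualizationProgram": ["FigTree", "TreeView"],
}


def simulate_workflow(pattern, instances, fixed, rng):
    """One run: keep the fixed choices and sample the remaining ones uniformly."""
    events = []
    for ontology_class in pattern:
        if ontology_class in fixed:
            events.append(fixed[ontology_class])
        else:
            events.append(rng.choice(instances[ontology_class]))
    return events


def simulate_many(pattern, instances, fixed, runs, seed=0):
    """Generate several event sequences with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [simulate_workflow(pattern, instances, fixed, rng) for _ in range(runs)]


if __name__ == "__main__":
    fixed_choices = {"AlignmentProgram": "MUSCLE"}
    for events in simulate_many(ABSTRACT_PATTERN, INSTANCES, fixed_choices, runs=3):
        print(" -> ".join(events))
```

Each generated sequence instantiates the abstract pattern step by step, so the simulated workflows always respect the order given by the expert.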
Contact
For any technical issues, please e-mail the admin: halioui.ahmed@uqam.ca