Giuseppe Totaro's repositories
ctakes-clinical-pipeline
Clinical Pipeline Engine using Apache cTAKES
cTAKES_test
Collection of code and scripts to run Apache cTAKES against clinical text
CTAKESContentHadler
Preliminary work to add support for Apache cTAKES to Apache Tika
lucene-ir-engine
An extremely simple IR Engine based on Apache Tika and Apache Lucene for indexing and searching heterogeneous documents.
create_seed
Bash script that creates a URL seed list from the URLs found in an arbitrary file
get_junoirtf
Automatic collection of Jupiter observations from the NASA Infrared Telescope Facility
StandardsExtractingContentHandler
ContentHandler that helps to extract standard references while parsing with Apache Tika
StringsParser
Preliminary work for the Strings Parser.
ISATabParser
Tika parsers for ISA-Tab data formats
practical-python
Practical Python Programming (course by @dabeaz)
sparkler
Spark-Crawler: Evolving Apache Nutch to run on Spark.
tor-relay-bootstrap
Script to bootstrap a Debian server to be a set-and-forget Tor relay