Tuan Tran's repositories
Event-Detection
DBSCAN algorithm expressed in MapReduce, implemented with Hadoop and MongoDB, to analyze tweets and photos and create geolocated events
JGibbLabeledLDA
Labeled LDA in Java (based on JGibbLDA)
aida
AIDA Named Entity Disambiguation by the Databases and Information Systems Group at the Max Planck Institute for Informatics.
climf
Collaborative Less-is-More Filtering
dataanalysis
Coursera data analysis course, done in Python
elasticsearch-river-wikipedia
Wikipedia River Plugin for ElasticSearch
git
Git Source Code Mirror - This is a publish-only repository and all pull requests are ignored. Please follow Documentation/SubmittingPatches procedure for any of your improvements.
j-google-trends-api
Java-based implementation of an unofficial Google Trends API
kraken
GitHub mirror of Wikimedia analytics data services platform (analytics/kraken) - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing): https://gerrit.wikimedia.org/r/#/admin/projects/analytics/kraken
latex
LaTeX resources for my publications
likelike
likelike
LuceneBoostExamplePart1
Example Code for part one of the Imaginea blog boost series
matrix-hadoop-tutorial
A set of tutorial code examples for matrix methods in Hadoop
mss
All Maximal Scoring Subsequences algorithm
nerdml
NERD Machine Learner
stats.grok.se
Code for aggregating Wikipedia traffic statistics
stream-lib
Stream summarizer and cardinality estimator.
sumtract
Second project for UW LING 572. Automatic text summarization system.
trec-kba
This project contains some Hadoop code for working with the TREC Knowledge Base Acceleration dataset. In particular, it provides classes to read/write topic files, read/write run files, and expose the documents in the Thrift files as Hadoop-readable objects.
twitter-tools
Twitter Tools
twitter_nlp
UW Twitter NLP Tools
wikientities
Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts
wikipedia-irc
Try to spot new trends based on Wikipedia live edit spikes
word2vec-query-expansion
An Apache Lucene TokenFilter that uses word2vec vectors for term expansion.