Nicola Tonellotto's repositories
bigdata
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
clue
Command-line tool for Apache Lucene
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++
deeprm
Resource Management with Deep Reinforcement Learning (HotNets '16)
DND-5e-LaTeX-Template
A Small Template For 5e D&D Material
docker-cheat-sheet
Docker Cheat Sheet
FEL
Fast Entity Linker Toolkit for training models to link entities to a knowledge base (Wikipedia) in documents and queries.
freebase-triples
A methodology to process triples data from the Freebase data dumps.
generalized-kmeans-clustering
This project generalizes the Spark MLlib Batch and Streaming K-Means clusterers in every practical way.
google-interview-university
A complete daily plan for studying to become a Google software engineer.
graph-bisection
Dhulipala, Laxman, et al. "Compressing Graphs and Indexes with Recursive Graph Bisection." arXiv preprint arXiv:1602.08820 (2016).
hadoop-docker
Hadoop docker image
homebrewery
Create authentic-looking D&D homebrews using only markdown
ircodecs
ircodecs: integer compression applied to IR (for Python)
java8-the-missing-tutorial
Java 8 for all of us
JustEnoughScalaForSpark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
LightGBM
A fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the DMTK (http://github.com/microsoft/dmtk) project of Microsoft.
MDPRank
MDP for ranking, policy gradient
modern-cpp-features
A cheatsheet of modern C++ language and library features.
python-lecture
Lecture slides for Python
RankingComplexLayouts
Repository for SIGIR'18 paper: "Ranking for Relevance and Display Preferences in Complex Presentation Layouts"
RL
A set of RL experiments. Currently includes: (1) the MDP rank experiment, based on the policy gradient algorithm
SparkMaxFlow
Spark implementation of Ford-Fulkerson algorithm
tagme
Entity Linking system by A3 lab
TextSegmenter
A text segmenter based on unigram/bigram statistics in Java, inspired by the segmenter by Peter Norvig