Amherst College Data* Mammoths' repositories
datasetgenerator
A port of the IBM Quest dataset generator to modern g++ and C++11
parallelcubesampling
Implementations of the parallel and sequential cube sampling algorithms presented in the paper "A Scalable Parallel Algorithm for Balanced Sampling" (Alexander Lee, Stefan Walzer-Goldfeld, Shukry Zablah, Matteo Riondato, AAAI'22 Student Abstract).
SPEck-code
Code for the paper "SPEck: Mining Statistically-significant Sequential Patterns Efficiently with Exact Sampling", by Steedman Jenkins, Stefan Walzer-Goldfeld, and Matteo Riondato, appearing in the Data Mining and Knowledge Discovery Special Issue for ECML PKDD'22.
.github
Special repo.
accsthesis
Amherst College Computer Science Honors Thesis LaTeX template
acdmammoths.github.io
Website for the Amherst College Data* Mammoths Research and Learning Group
Bavarian-code
Code for the paper "Bavarian: Betweenness Centrality Approximation with Variance-Aware Rademacher Averages", by Chloe Wohlgemuth, Cyrus Cousins, and Matteo Riondato, appearing in ACM KDD'21 and ACM TKDD'23
Intellij-Hadoop
Run Hadoop programs using IntelliJ
ROhAN-code
ROhAN: Row-Order Agnostic Null Models for Statistically-sound Knowledge Discovery