Stefan Grafberger's repositories
mlinspect-cidr
Inspect ML Pipelines in Python in the form of a DAG (CIDR Submission version)
csvmatch
🔎 Finds fuzzy matches between CSV spreadsheets
dedupe
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
hackathon-2021-1
Rust Rust Rust!
latex-make-action
Action for compiling LaTeX with make
learnedcardinalities
Code and workloads from the Learned Cardinalities paper (https://arxiv.org/abs/1809.00677)
jenga
Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptions (e.g., missing values, broken character encodings) on the prediction quality of their ML models.
ml-pipeline-datasets
Some datasets for ML pipelines that I want to use for some experiments
mlinspect-exploratory-user-study
The files for an initial exploratory user study. It provides the foundation for a larger user study in future work.
noworkflow
Supporting infrastructure to run scientific experiments without a scientific workflow management system.
pgbm
Probabilistic Gradient Boosting Machines
st-cytoscape
A fork that adds dagre layout support