Stefan Grafberger (stefan-grafberger)

Company: University of Amsterdam

Location: Amsterdam

Home Page: https://stefan-grafberger.com

Twitter: @SGrafberger

Stefan Grafberger's repositories

mlinspect

Inspect ML Pipelines in Python in the form of a DAG

Language: Python | License: Apache-2.0 | Stargazers: 68 | Issues: 5 | Issues: 52

mlwhatif

Data-Centric What-If Analysis for Native Machine Learning Pipelines

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 14 | Issues: 3 | Issues: 23

StreamDQ

StreamDQ is a library built on top of Apache Flink for defining "unit tests for data", which measure data quality in large data streams.

Language: Kotlin | License: Apache-2.0 | Stargazers: 10 | Issues: 0 | Issues: 0

mlinspect-cidr

Inspect ML Pipelines in Python in the form of a DAG (CIDR Submission version)

Language: Python | License: Apache-2.0 | Stargazers: 5 | Issues: 1 | Issues: 1
Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 1 | Issues: 3 | Issues: 0

csvmatch

🔎 Finds fuzzy matches between CSV spreadsheets

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

datawig

Imputation of missing values in tables.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

hackathon-2021-1

Rust Rust Rust!

Language: Rust | Stargazers: 0 | Issues: 0 | Issues: 0

latex-make-action

Action for compiling LaTeX with make

Language: Makefile | Stargazers: 0 | Issues: 0 | Issues: 0

learnedcardinalities

Code and workloads from the Learned Cardinalities paper (https://arxiv.org/abs/1809.00677)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

jenga

Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptions (e.g., missing values, broken character encodings) on the prediction quality of their ML models.

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

ml-pipeline-datasets

Some datasets for ML pipelines that I want to use for some experiments

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

mlinspect-exploratory-user-study

The files for an initial exploratory user study. It provides the foundation for a larger user study in future work.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

noworkflow

Supporting infrastructure to run scientific experiments without a scientific workflow management system.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

pgbm

Probabilistic Gradient Boosting Machines

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Kotlin | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

st-cytoscape

A fork that adds dagre layout support

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Kotlin | Stargazers: 0 | Issues: 0 | Issues: 0