Olivier Binette's repositories
er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
groupbyrule
Deduplicate data using fuzzy and deterministic matching rules.
simple-typo-tolerant-search
Efficient typo-tolerant search in 76 lines of code, with no dependencies.
streamlit-survey
Survey components for Streamlit apps
Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
deepchecks
Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
duckdb
DuckDB is an in-process SQL OLAP Database Management System
facets
Visualizations for machine learning datasets
FeatureStore-lite
A lightweight feature store for Pandas, DuckDB, or your choice of backend.
giskard
🐢 The testing framework for ML models, from tabular to LLMs
HandsOnEntityResolution
This repository accompanies the early release of Hands-On Entity Resolution.
mismo
The SQL/Ibis-powered sklearn of record linkage
PatentsView-Evaluation
Evaluation and benchmarking of PatentsView disambiguation algorithms
RMarkdown-Reproducibility-Template
Template for a reproducible RMarkdown document
seisbench
SeisBench - A toolbox for machine learning in seismology
streamlit-example
Example Streamlit app that you can fork to test out share.streamlit.io
trubrics-sdk
Validate your ML models and collect human feedback with Trubrics
TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
ul-benchmark-datasets-for-entity-resolution-archive
Unofficial archive of https://dbs.uni-leipzig.de/research/projects/benchmark-datasets-for-entity-resolution