Olivier Binette's repositories
er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
groupbyrule
Deduplicate data using fuzzy and deterministic matching rules.
simple-typo-tolerant-search
Efficient typo-tolerant search in 76 lines of code, with no dependencies.
streamlit-survey
Survey components for Streamlit apps
Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
deepchecks
Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
duckdb
DuckDB is an in-process SQL OLAP Database Management System
facets
Visualizations for machine learning datasets
FeatureStore-lite
A lightweight feature store for Pandas, DuckDB, or your choice of backend.
giskard
🐢 The testing framework for ML models, from tabular to LLMs
HandsOnEntityResolution
This repository accompanies the early release of Hands-On Entity Resolution.
mismo
The SQL/Ibis-powered sklearn of record linkage
PatentsView-Evaluation
Evaluation and benchmarking of PatentsView disambiguation algorithms
RMarkdown-Reproducibility-Template
Template for a reproducible RMarkdown document
seisbench
SeisBench - A toolbox for machine learning in seismology
streamlit-example
Example Streamlit app that you can fork to test out share.streamlit.io
trubrics-sdk
Validate your ML models and collect human feedback with Trubrics
TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
ul-benchmark-datasets-for-entity-resolution-archive
Unofficial archive of https://dbs.uni-leipzig.de/research/projects/benchmark-datasets-for-entity-resolution