Lan Li's repositories
FD_CFD_extraction
Code to extract functional dependencies (FDs) and conditional functional dependencies (CFDs) from data
sherlock-project
This repository provides data and scripts to use Sherlock, a DL-based model for semantic data type detection: https://sherlock.media.mit.edu.
2023-AAAI-KA-ER
This is the paper on knowledge augmentation for entity resolution tasks.
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
CleanML
A Benchmark for Joint Data Cleaning and Machine Learning
Data_Transformation_Algebra
This is the repo for DTA [data cleaning transformation]
google-api-python-client
🐍 The official Python client library for Google's discovery based APIs.
IDCC21-Automatic-Module-Detection
This is the paper published at IDCC 2021.
lit
The Language Interpretability Tool: Interactively analyze NLP models for model understanding in an extensible and framework agnostic interface.
logica
Logica is a logic programming language that compiles to StandardSQL and runs on Google BigQuery.
mermaid
Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner to Markdown
mlwhatif
Data-Centric What-If Analysis for Native Machine Learning Pipelines
Multi-BioNER
Cross-type Biomedical Named Entity Recognition with Deep Multi-task Learning (Bioinformatics'19)
OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
PI2
Give me your task query, and I will give you an interactive interface app!
tree-sitter
An incremental parsing system for programming tools
Try_JNotebook
This project accesses metadata [run-time execution and cell content] from Jupyter notebooks