Chris Ha's starred repositories
LibreChat
Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc. 🦖
rust-ecosystem
Rust wants & tracking for Embark 🦀
s3-connector-for-pytorch
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
fabricator
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
dkpro-c4corpus
DKPro C4CorpusTools is a collection of tools for processing the CommonCrawl corpus, including Creative Commons license detection, boilerplate removal, language detection, and near-duplicate removal.
counter-rs
Simple object to count Rust iterables
common-pile
Repo to hold code and track issues for the collection of permissively licensed data
kotlin-ktor-starter
kotlin-ktor-starter
guedou.github.io
GitHub Pages repository for https://guedou.github.io
awesome-data-deduplication
An awesome list of data deduplication use cases, papers, tools, and methods.