David Berenstein's repositories
concise-concepts
This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
classy-classification
This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Hugging Face.
fast-sentence-transformers
This repository contains code to run sentence transformers 5x faster using tools like quantization and ONNX.
crosslingual-coreference
A multilingual approach to AllenNLP coreference resolution, along with a wrapper for spaCy.
spacy-setfit
This repository contains an easy and intuitive approach to using SetFit in combination with spaCy.
paper-prowler
A versatile tool that aggregates and organizes web content, such as RSS feeds and arXiv papers, to streamline your research and information management.
davidberenstein1957
👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, 🏆 Committing
davidberenstein1957.github.io
A repo dedicated to creating a public webpage for me, myself and I.
LocalAIVoiceChat
Local AI talk with a custom voice, based on the Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with Coqui XTTS for synthesis.
argilla
✨Argilla: the open-source data curation platform for LLMs
autotrain-advanced
🤗 AutoTrain Advanced
data-is-better-together
Let's build better datasets, together!
fact-checking-rocks
Fact checking baseline combining dense retrieval and textual entailment
fastfit
FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
haystack
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
hub-docs
Docs of the Hugging Face Hub
LLMLingua
To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-cache, achieving up to 20x compression with minimal performance loss.
outlines
Generative Model Programming
setfit
Efficient few-shot learning with Sentence Transformers
spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
spacy-fastfit
This repository contains an easy and intuitive approach to using FastFit in combination with spaCy.
spacy-wrap
spaCy-wrap is a wrapper library for including fine-tuned transformers from Hugging Face in your spaCy pipeline, so you can use existing fine-tuned models within your spaCy workflow.
SpanMarkerNER
SpanMarker for Named Entity Recognition
trl
Train transformer language models with reinforcement learning.
unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
wikipedia-fact-checker
A fact checker for Wikipedia.