David Berenstein's repositories
classy-classification
This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
davidberenstein1957
Cooking, Coding, Committing
awesome-synthetic-data
A curated list of awesome synthetic data tools (open source and commercial).
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
model2vec-setfit
Model2Vec: Distill a Small Fast Model from any Sentence Transformer
political-statement-evaluator
Work for De Groene Amsterdammer, a Dutch opinion newspaper, to investigate political statements made during talk shows.
scratchpad
Random projects, concepts, and ideas.
awesome-open-source-ai-tools
Awesome Open Source AI Tools
cookbook
Open-source AI cookbook
datasets-llama-index
A public repo that contains integrations for Hugging Face datasets and LlamaIndex.
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
GLiNER.cpp
C++ inference engine for running GLiNER (Generalist and Lightweight Named Entity Recognition) models
gradio
Build and share delightful machine learning apps, all in Python. Star to support our work!
haystack
LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
huggingface.js
Use Hugging Face with JavaScript
ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
langfuse-python
🪢 Langfuse Python SDK - Instrument your LLM app with decorators or low-level SDK and get detailed tracing/observability. Works with any LLM or framework
llama_index
LlamaIndex is a data framework for your LLM applications
logits-processor-zoo
A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.
pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.
semhash
Fast Semantic Text Deduplication
smol-course
A course on aligning smol models.
smolagents
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.