William Mattingly's starred repositories
dataset-cards
Smithsonian Dataset Cards
opinionated
Opinionated provides simple, clean stylesheets for plotting with matplotlib and seaborn.
open-interpreter
A natural language interface for computers
streamlit-drawable-canvas
Do you like Quick, Draw? Well, what if you could train/predict doodles drawn inside Streamlit? Also draws lines, circles, and boxes over background images for annotation.
craft-text-detector
Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector
weaviate-filter
A package for creating GraphQL filters for Weaviate
st-weaviate-connection
A Python package that provides a custom Streamlit connection to query data from Weaviate, the AI-native vector database
bagpipes-spacy
Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.
number-spacy
Number spaCy is a custom spaCy pipeline component that enhances the identification of number entities in text and fetches the parsed numeric values using spaCy's token extensions.
keyword-spacy
Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity.
sidatasciencelab.github.io
A Quarto-based website for the Smithsonian Data Science Lab
spacy-setfit
This repository contains an easy and intuitive approach to using SetFit in combination with spaCy.
how-to-ingest-pdfs-with-unstructured
Ingest PDFs into Weaviate
ehri-data-analysis-tools
Miscellaneous data analysis tools and scripts for the EHRI project
quivr
Open-source RAG framework for building GenAI Second Brains 🧠 Build a productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using LangChain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users! Efficient retrieval-augmented generation framework