Matt Henschke's starred repositories
awesome-public-datasets
A topic-centric list of high-quality open datasets.
faster-whisper
Faster Whisper transcription with CTranslate2
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
RAGatouille
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.
auto-code-rover
A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% of tasks (pass@1) in SWE-bench Lite and 38.40% of tasks (pass@1) in SWE-bench Verified, with each task costing less than $0.7.
machine-learning-with-ruby
Curated list: Resources for machine learning in Ruby
whisper-playground
Build real-time speech-to-text web apps using OpenAI's Whisper https://openai.com/blog/whisper/
pdf-annotate.js
Annotation layer for pdf.js (no longer maintained)
fullstaq-ruby-docker
Docker image for Ruby built from Fullstaq packages, based on Debian 10, 11, and 12.
elemental_components
Simple view components for Rails 5.1+
sentry-fargate-cf-stack
AWS CloudFormation template to launch a highly-available Sentry 20 stack through ECS Fargate at the minimum cost possible
OCRmyPDF-EasyOCR
OCRmyPDF EasyOCR plugin
clinicalXLNet
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
jquery_query_builder-rails
The jQuery Query Builder Rule Evaluator and JavaScript library + Dependencies ready for the Rails Asset Pipeline
optimistic-json
Ruby implementation of `best-effort-json-parser` to parse potentially incomplete JSON in a best-effort manner.