Brian Mount's starred repositories
ScribeWizard
ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3
ml-mobileclip
This repository contains the official implementation of the research paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" (CVPR 2024)
silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
opencap-core
Main OpenCap processing pipeline
antibioticsai
Supporting code for the paper "Discovery of a structural class of antibiotics with explainable deep learning"
screenshot-to-code
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
spin-model-transformers
Physics-inspired transformer modules based on mean-field dynamics of vector-spin models in JAX
self-operating-computer
A framework to enable multimodal models to operate a computer.
AnimateAnyone
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
visual_anagrams
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
awesome-openai-vision-api-experiments
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
realtime-bakllava
llama.cpp with the BakLLaVA model, describing what it sees
LLaVA-Interactive-Demo
LLaVA-Interactive-Demo