Rohan Tondulkar's starred repositories
openrouter-runner
Inference engine powering open source models on OpenRouter
HierSpeechpp
The official implementation of HierSpeech++
vectorflow
VectorFlow is a high-volume vector embedding pipeline that ingests raw data, transforms it into vectors, and writes it to a vector DB of your choice.
awesome-foundation-and-multimodal-models
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
LibreTranslate
Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.
Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
gpt-researcher
LLM-based autonomous agent that conducts in-depth web research on any given topic
insanely-fast-whisper
Incredibly fast Whisper-large-v3
Rerender_A_Video
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.