Abhigyan Raman's starred repositories
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
open-webui
User-friendly WebUI for AI (Formerly Ollama WebUI)
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
python-mastery
Advanced Python Mastery (course by @dabeaz)
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
RealtimeTTS
Converts text to speech in realtime
ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
AVeryComfyNerd
ComfyUI related stuff and things
HierSpeechpp
The official implementation of HierSpeech++
speech-synthesis-paper
List of speech synthesis papers.
textbook_quality
Generate textbook-quality synthetic LLM pretraining data
deep-image-matching
Multiview matching with deep-learning and hand-crafted local features for COLMAP and other SfM software. Supports high-resolution formats and images with rotations. Both CLI and GUI are supported.
VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
guidelines
C++ Default Guidelines
PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
gcs-fuse-csi-driver
The Google Cloud Storage FUSE Container Storage Interface (CSI) Plugin.
redis-feast-gcp
A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.