Shuchang Zhou's starred repositories
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
Scrapegraph-ai
Python scraper based on AI
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
WhisperLive
A nearly-live implementation of OpenAI's Whisper.
soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
AEC-Challenge
AEC Challenge
lightplane
Lightplane implements a highly memory-efficient differentiable radiance field renderer, and a module for unprojecting features from images to 3D grids.
libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Assemble-Them-All
[SIGGRAPH Asia 2022] Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
pinyin-to-ipa
Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.
demucs_batch-multigpu
[Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation