Shuchang Zhou's starred repositories
libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Assemble-Them-All
[SIGGRAPH Asia 2022] Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
AEC-Challenge
AEC Challenge
improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
lightplane
Lightplane implements a highly memory-efficient differentiable radiance field renderer, and a module for unprojecting features from images to 3D grids.
soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Scrapegraph-ai
Python scraper based on AI
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
demucs_batch-multigpu
[Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation
emo-visual-data
😜 A visual dataset of memes (sticker images), annotated using the image-understanding capabilities of glm-4v and step-1v.
mujoco_menagerie
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.