Shuchang Zhou (zsc)

Location: Beijing

Home Page: https://zsc.github.io/


Organizations
megvii-research

Shuchang Zhou's starred repositories

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 28343 | Issues: 169 | Issues: 417

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 14324 | Issues: 115 | Issues: 375

Scrapegraph-ai

Python scraper based on AI

Language: Python | License: MIT | Stargazers: 13534 | Issues: 91 | Issues: 188

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 6956 | Issues: 61 | Issues: 145

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | Stargazers: 3625 | Issues: 73 | Issues: 96

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | Stargazers: 1763 | Issues: 21 | Issues: 179

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Language: Python | License: MIT | Stargazers: 1607 | Issues: 29 | Issues: 158

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 1597 | Issues: 24 | Issues: 46

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | Stargazers: 1444 | Issues: 31 | Issues: 16

Memary

The Memory Layer For Autonomous Agents

Language: Jupyter Notebook | License: MIT | Stargazers: 1173 | Issues: 13 | Issues: 28

soundstorm-pytorch

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch

Language: Python | License: MIT | Stargazers: 1148 | Issues: 51 | Issues: 15

suno-api

Use API to call the music generation AI of suno.ai, and easily integrate it into agents like GPTs.

Language: TypeScript | License: LGPL-3.0 | Stargazers: 1007 | Issues: 30 | Issues: 105

PuLID

Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Language: Python | License: Apache-2.0 | Stargazers: 1005 | Issues: 38 | Issues: 46

improved-aesthetic-predictor

CLIP+MLP Aesthetic Score Predictor

Language: Python | License: Apache-2.0 | Stargazers: 812 | Issues: 6 | Issues: 10

Language: Python | License: Apache-2.0 | Stargazers: 629 | Issues: 29 | Issues: 17

ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

CraftsMan

CraftsMan: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner

ScreenAI

Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"

Language: Python | License: MIT | Stargazers: 245 | Issues: 8 | Issues: 3

lightplane

Lightplane implements a highly memory-efficient differentiable radiance field renderer, and a module for unprojecting features from images to 3D grids.

Language: Python | License: NOASSERTION | Stargazers: 233 | Issues: 25 | Issues: 3

TokenHMR

[CVPR 2024] TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Language: Python | License: NOASSERTION | Stargazers: 177 | Issues: 14 | Issues: 10

libriheavy

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

Language: Python | License: Apache-2.0 | Stargazers: 162 | Issues: 6 | Issues: 6

Assemble-Them-All

[SIGGRAPH Asia 2022] Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly

Language: C++ | License: MIT | Stargazers: 130 | Issues: 8 | Issues: 20

Language: Python | License: NOASSERTION | Stargazers: 103 | Issues: 2 | Issues: 2

get-haized

A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.

myo_sim

Musculoskeletal Models in MuJoCo

Language: Python | License: Apache-2.0 | Stargazers: 71 | Issues: 8 | Issues: 8

Language: Python | License: Apache-2.0 | Stargazers: 64 | Issues: 1 | Issues: 0

pinyin-to-ipa

Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.

Language: Python | License: MIT | Stargazers: 25 | Issues: 3 | Issues: 3

demucs_batch-multigpu

[Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python | License: MIT | Stargazers: 18 | Issues: 0 | Issues: 0