Shuchang Zhou (zsc)



Location: Beijing

Home Page: https://zsc.github.io/



Organizations
megvii-research

Shuchang Zhou's starred repositories

stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Language: Python, License: MIT, Stargazers: 1431, Issues: 0

SenseVoice

Multilingual Voice Understanding Model

Language: Python, License: NOASSERTION, Stargazers: 1716, Issues: 0

NKF-AEC

Acoustic Echo Cancellation with Neural Kalman Filtering

Language: HTML, Stargazers: 201, Issues: 0

optimize-and-reduce

A Top-Down Approach for Image Vectorization

Language: Jupyter Notebook, License: MIT, Stargazers: 5, Issues: 0

mateo-demo

MAchine Translation Evaluation Online (MATEO)

Language: Python, License: GPL-3.0, Stargazers: 15, Issues: 0

ChartFormer

ChartFormer: A Large Vision Language Model for Converting Chart Images into Tactile Accessible SVGs

Language: Python, Stargazers: 3, Issues: 0

SketchVideo

[EG 2023] Sketch Video Synthesis

Language: Jupyter Notebook, Stargazers: 195, Issues: 0

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Language: Python, Stargazers: 418, Issues: 0

image2svg-awesome

All about image tracing and vectorization—the conversion of a raster image (jpg/png) to a vector image (svg).

License: MIT, Stargazers: 168, Issues: 0

Resemblyzer

A python package to analyze and compare voices with deep learning

Language: Python, License: Apache-2.0, Stargazers: 2689, Issues: 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook, License: BSD-2-Clause, Stargazers: 2529, Issues: 0

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

Stargazers: 168, Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python, License: MIT, Stargazers: 3497, Issues: 0

Whisper-WebUI

A web UI for easy subtitle generation using the Whisper model.

Language: Python, License: Apache-2.0, Stargazers: 830, Issues: 0

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python, License: MIT, Stargazers: 1776, Issues: 0

PyTorch-SVGRender

SVG Differentiable Rendering: Generating vector graphics using neural networks. Support: text-to-SVG, Image-to-SVG, SVG Editing.

Language: Python, License: MPL-2.0, Stargazers: 96, Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 10592, Issues: 0

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Language: Python, License: Apache-2.0, Stargazers: 380, Issues: 0

bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.

Language: Python, License: MIT, Stargazers: 619, Issues: 0

IMS-Toucan

Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.

Language: Python, License: Apache-2.0, Stargazers: 1294, Issues: 0

English-to-IPA

Converts English text to IPA notation

Language: Python, License: MIT, Stargazers: 355, Issues: 0

python-pinyin

Convert Chinese characters to pinyin (pypinyin)

Language: Python, License: MIT, Stargazers: 4776, Issues: 0

BigCiDian

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Language: Python, Stargazers: 252, Issues: 0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python, License: MIT, Stargazers: 8814, Issues: 0

pinyin-to-ipa

Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.

Language: Python, License: MIT, Stargazers: 25, Issues: 0

WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Language: Python, License: MIT, Stargazers: 1607, Issues: 0

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook, License: MIT, Stargazers: 3625, Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python, License: Apache-2.0, Stargazers: 1839, Issues: 0