pengyizhou

pengyizhou's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | MIT | 68445 5750

HowToCook

Programmer's guide to cooking at home (Simplified Chinese only).

Language: Dockerfile | Unlicense | 66624 402 665

ChatGPT

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

Language: Rust | AGPL-3.0 | 52462 441 1057

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | MIT | 35517 327 437

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C | MIT | 34775 312 1309

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Language: C++ | NOASSERTION | 20218 573 3505

CodeFormer

[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Language: Python | NOASSERTION | 15399 297 344

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | Apache-2.0 | 11692 206 2247

ShiArthur03

Language: MATLAB | GPL-3.0 | 10375 32 1357

automl

Google Brain AutoML

Language: Jupyter Notebook | Apache-2.0 | 6221 151 886

Noi

🚀 Power Your World with AI - Explore, Extend, Empower.

Language: JavaScript | 6206 82 184

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | NOASSERTION | 6171 58 1106

BiliBiliToolPro

Automated task tool for Bilibili, supporting multiple deployment methods including Docker, Qinglong, and k8s. Gentle enough even for sensitive skin.

Language: C# | MIT | 6143 35 594

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language: Python | Apache-2.0 | 5212 52 396

multinerf

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

Language: Python | Apache-2.0 | 3622 49 150

FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

Language: Python | MIT | 3366 35 88

Resemblyzer

A python package to analyze and compare voices with deep learning

Language: Python | Apache-2.0 | 2748 73 82

draw_convnet

Language: Python | 2637 42 12

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python | MIT | 1819 20 181

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | NOASSERTION | 1418 25 67

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | Apache-2.0 | 1112 77 379

torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Language: Python | MIT | 927 11 105

icefall

Language: Python | Apache-2.0 | 902 48 649

code-switching-papers

A curated list of research papers and resources on code-switching

Apache-2.0 | 291 24 6

OpenCallBlock

iOS CallKit blocking of NPA-NXX number prefix spam

Language: Swift | MPL-2.0 | 74 7 5

XenC

XenC: open-source data selection tool for NLP

Language: HTML | LGPL-3.0 | 60 8 7

PASM

Pronunciation-assisted Subword Modeling

Language: Shell | 29 50

nullscc.github.io

Language: HTML | MIT | 8 10

espnet

End-to-End Speech Processing Toolkit

Language: Python | Apache-2.0 | 600

data-selection

Language: Shell | 1 10