pengyizhou's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 68445 | Issues: 575 | Issues: 0

HowToCook

A programmer's guide to cooking at home (Simplified Chinese only).

Language: Dockerfile | License: Unlicense | Stargazers: 66624 | Issues: 402 | Issues: 665

ChatGPT

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

Language: Rust | License: AGPL-3.0 | Stargazers: 52462 | Issues: 441 | Issues: 1057

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 35517 | Issues: 327 | Issues: 437

whisper.cpp

Port of OpenAI's Whisper model in C/C++

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Language: C++ | License: NOASSERTION | Stargazers: 20218 | Issues: 573 | Issues: 3505

CodeFormer

[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Language: Python | License: NOASSERTION | Stargazers: 15399 | Issues: 297 | Issues: 344

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 11692 | Issues: 206 | Issues: 2247

automl

Google Brain AutoML

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6221 | Issues: 151 | Issues: 886

Noi

🚀 Power Your World with AI - Explore, Extend, Empower.

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 6171 | Issues: 58 | Issues: 1106

BiliBiliToolPro

Automated task tool for Bilibili, supporting multiple deployment methods such as Docker, Qinglong, and k8s. Gentle enough for "sensitive skin" too.

CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 5212 | Issues: 52 | Issues: 396

multinerf

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

Language: Python | License: Apache-2.0 | Stargazers: 3622 | Issues: 49 | Issues: 150

FunClip

Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.

Language: Python | License: MIT | Stargazers: 3366 | Issues: 35 | Issues: 88

Resemblyzer

A python package to analyze and compare voices with deep learning

Language: Python | License: Apache-2.0 | Stargazers: 2748 | Issues: 73 | Issues: 82

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python | License: MIT | Stargazers: 1819 | Issues: 20 | Issues: 181

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 1418 | Issues: 25 | Issues: 67

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | License: Apache-2.0 | Stargazers: 1112 | Issues: 77 | Issues: 379

torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Language: Python | License: MIT | Stargazers: 927 | Issues: 11 | Issues: 105
Language: Python | License: Apache-2.0 | Stargazers: 902 | Issues: 48 | Issues: 649

code-switching-papers

A curated list of research papers and resources on code-switching

OpenCallBlock

iOS CallKit blocking of NPA-NXX number prefix spam

Language: Swift | License: MPL-2.0 | Stargazers: 74 | Issues: 7 | Issues: 5

XenC

XenC: open-source data selection tool for NLP

Language: HTML | License: LGPL-3.0 | Stargazers: 60 | Issues: 8 | Issues: 7

PASM

Pronunciation-assisted Subword Modeling

Language: Shell | Stargazers: 29 | Issues: 5 | Issues: 0
Language: HTML | License: MIT | Stargazers: 8 | Issues: 1 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 0 | Issues: 0