Meixu Song (songmeixu)

0 followers, 0 following, 0 stars

Company: Speech@AI, Xiaomi

Location: Beijing, China

Home Page: https://songmeixu.github.io/blog


Organizations
open-speech

Meixu Song's starred repositories

hello-algo

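"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Python, C++, Java, C#, Go, Swift, JS, TS, Dart, Rust, C, Zig, and other languages. English edition in progress.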

Language: Java, License: NOASSERTION, Stargazers: 74834, Issues: 442, Issues: 161

ChatGPT-Next-Web

A well-designed cross-platform ChatGPT UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT app in one click.

Language: TypeScript, License: MIT, Stargazers: 53692, Issues: 325, Issues: 2004

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python, License: Apache-2.0, Stargazers: 37814, Issues: 377, Issues: 1560

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook, License: MIT, Stargazers: 32378, Issues: 304, Issues: 404

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python, License: MIT, Stargazers: 31578, Issues: 334, Issues: 280

intel-one-mono

Intel One Mono font repository

onnx-simplifier

Simplify your onnx model

Language: C++, License: Apache-2.0, Stargazers: 3542, Issues: 50, Issues: 290

Vulkan-Hpp

Open-Source Vulkan C++ API

Language: C++, License: Apache-2.0, Stargazers: 2904, Issues: 115, Issues: 537

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python, License: MIT, Stargazers: 2788, Issues: 39, Issues: 180

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Language: Python, License: MIT, Stargazers: 2236, Issues: 61, Issues: 166

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, which presents a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Language: Python, License: NOASSERTION, Stargazers: 1557, Issues: 33, Issues: 149

VulkanSamples

Vulkan Samples

Language: C++, License: NOASSERTION, Stargazers: 1354, Issues: 116, Issues: 100

mycroft-precise

A lightweight, simple-to-use, RNN wake word listener

Language: Python, License: Apache-2.0, Stargazers: 793, Issues: 33, Issues: 189

BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

Language: C++, License: Apache-2.0, Stargazers: 745, Issues: 35, Issues: 230

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python, License: MIT, Stargazers: 695, Issues: 23, Issues: 69

Native_SDK

C++ cross-platform 3D graphics SDK. Includes demos & helper code (resource loading etc.) to speed up development of Vulkan, OpenGL ES 2.0 & 3.x applications

Language: C++, License: MIT, Stargazers: 659, Issues: 96, Issues: 61

VulkanTools

Tools to aid in Vulkan development

Language: C++, License: NOASSERTION, Stargazers: 633, Issues: 45, Issues: 552

optimizer

Actively maintained ONNX Optimizer

Language: C++, License: Apache-2.0, Stargazers: 587, Issues: 28, Issues: 63

Vulkan-Loader

Vulkan Loader

Language: C, License: NOASSERTION, Stargazers: 467, Issues: 66, Issues: 452

FastASR

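A project that implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model and trained on the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.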

Language: C, License: Apache-2.0, Stargazers: 433, Issues: 22, Issues: 68

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Language: Python, License: MIT, Stargazers: 405, Issues: 29, Issues: 27

speech-denoiser

A speech denoising LV2 plugin based on the RNNoise library

Language: C, License: LGPL-3.0, Stargazers: 281, Issues: 14, Issues: 20

BigCiDian

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

Language: C, License: Apache-2.0, Stargazers: 233, Issues: 9, Issues: 27

vulkan-sdk

Github repository for the Vulkan SDK

Language: C, License: NOASSERTION, Stargazers: 220, Issues: 39, Issues: 24

vits_chinese

vits chinese, tts chinese, tts mandarin: the easiest-to-train, best-sounding speech synthesis system ever

Language: Python, Stargazers: 202, Issues: 3, Issues: 0

cobra

On-device voice activity detection (VAD) powered by deep learning

Language: Python, License: Apache-2.0, Stargazers: 140, Issues: 11, Issues: 21

3m-asr

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Language: Python, License: Apache-2.0, Stargazers: 115, Issues: 6, Issues: 5

koala

On-device noise suppression powered by deep learning

Language: Python, License: Apache-2.0, Stargazers: 51, Issues: 12, Issues: 6

sherpa-mnn

Real-time speech recognition using next-gen Kaldi with MNN without Internet connection

Language: C++, License: Apache-2.0, Stargazers: 5, Issues: 3, Issues: 0