MaisyZhang

Company: @WebPrague

Location: 31°N, 121°E

Home Page: https://zhangpeng.ai

Twitter: @Maisy_Zhang

MaisyZhang's starred repositories

coding-interview-university

A complete computer science study plan to become a software engineer.

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 68056 | Issues: 571 | Issues: 0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 11610 | Issues: 205 | Issues: 2242

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python | License: Apache-2.0 | Stargazers: 6845 | Issues: 71 | Issues: 582

lemon-cleaner

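Tencent Lemon Cleaner is a cleanup tool built specifically for macOS. Its main features include detection of duplicate files and similar photos, customized junk scanning for individual applications, visual full-disk space analysis, memory release, browser privacy cleanup, and real-time device status monitoring. It focuses on cleanup, offering customized cleanup plans for more than a hundred applications along with professional cleanup recommendations, so users can easily complete a one-click cleanup.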

Language: Objective-C | License: NOASSERTION | Stargazers: 5419 | Issues: 50 | Issues: 72

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | Stargazers: 3438 | Issues: 57 | Issues: 70

This-repo-has-1426-stars

This repo has 1426 stars. Don't believe it? Try it yourself.

Language: Python | License: MIT | Stargazers: 1428 | Issues: 2 | Issues: 20

awesome_lists

Awesome Lists for Tenure-Track Assistant Professors and PhD students. (A survival guide for assistant professors and PhD students)

Language: Python | License: MIT | Stargazers: 1421 | Issues: 33 | Issues: 1

diart

A python package to build AI-powered real-time audio applications

Language: Python | License: MIT | Stargazers: 1006 | Issues: 21 | Issues: 139

av_hubert

A self-supervised learning framework for audio-visual speech

Language: Python | License: NOASSERTION | Stargazers: 834 | Issues: 15 | Issues: 111

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

FastASR

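This project implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is the open-source wenetspeech dataset (10000+ hours) or Alibaba's private dataset (60000+ hours), so recognition quality is good and comparable to many commercial ASR products.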

Language: C | License: Apache-2.0 | Stargazers: 482 | Issues: 24 | Issues: 70

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | License: MIT | Stargazers: 460 | Issues: 13 | Issues: 50

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | Stargazers: 452 | Issues: 10 | Issues: 112

BeatNet

BeatNet is state-of-the-art (Real-Time) and Offline joint music beat, downbeat, tempo, and meter tracking system using CRNN and particle filtering. (ISMIR 2021's paper implementation).

Language: Python | License: CC-BY-4.0 | Stargazers: 318 | Issues: 9 | Issues: 27

mega

Sequence modeling with Mega.

Language: Python | License: MIT | Stargazers: 297 | Issues: 126 | Issues: 16

awesome-audiovisual-learning

A curated list of audio-visual learning methods and datasets.

Zero_Shot_Audio_Source_Separation

The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022

Language: Python | License: MIT | Stargazers: 184 | Issues: 7 | Issues: 19

Spherical-Array-Processing

A collection of MATLAB routines for acoustical array processing on spherical harmonic signals, commonly captured with a spherical microphone array.

Language: MATLAB | License: BSD-3-Clause | Stargazers: 163 | Issues: 12 | Issues: 2

pytorch-revgrad

A minimal pytorch package implementing a gradient reversal layer.

Language: Python | License: MIT | Stargazers: 154 | Issues: 3 | Issues: 4

BIRD

Big Impulse Response Dataset

Language: Python | License: GPL-3.0 | Stargazers: 138 | Issues: 9 | Issues: 2

EasyComDataset

The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated multi-sensor egocentric world view.

lcc

LLVM-based C compiler

Language: C++ | License: MIT | Stargazers: 92 | Issues: 5 | Issues: 6

torchiva

Blind source separation with the independent vector analysis family of algorithms in torch

Language: Python | License: MIT | Stargazers: 86 | Issues: 5 | Issues: 3

Loss-Gated-Learning

ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning'

Language: Python | License: MIT | Stargazers: 85 | Issues: 3 | Issues: 12

mms_msg

Multipurpose Multi Speaker Mixture Signal Generator

Language: Python | Stargazers: 43 | Issues: 6 | Issues: 0

CausalityCheck

Causality Check in Frame-online Speech Separation

Language: Python | Stargazers: 40 | Issues: 2 | Issues: 0

Speech-Simulation-Tools

A collection of data simulation tools and methods for speech enhancement (continuously updated)

Language: HTML | License: CC0-1.0 | Stargazers: 31 | Issues: 2 | Issues: 0

AVCleanse

ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'