Beast code in Giters

MaisyZhang's starred repositories

coding-interview-university

A complete computer science study plan to become a software engineer.

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 68056 5710

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | 11610 205 2242

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python | License: Apache-2.0 | 6845 71 582

lemon-cleaner

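Tencent Lemon Cleaner is a cleanup tool designed specifically for macOS. Its main features include identifying duplicate files and similar photos, customized junk scanning for applications, visual analysis of full-disk space usage, memory release, browser privacy cleanup, and real-time device status monitoring. It focuses on cleanup, offering customized cleanup plans for over a hundred applications and professional cleanup recommendations to help users complete one-click cleanup easily.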

Language: Objective-C | License: NOASSERTION | 5419 50 72

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | 3438 57 70

This-repo-has-1426-stars

This repository has 1426 stars. Try it if you don't believe it.

Language: Python | License: MIT | 1428 2 20

awesome_lists

Awesome Lists for Tenure-Track Assistant Professors and PhD students. (Survival guide for assistant professors and PhD students)

Language: Python | License: MIT | 1421 33 1

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | 1006 21 139

av_hubert

A self-supervised learning framework for audio-visual speech

Language: Python | License: NOASSERTION | 834 15 111

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

656 18 2

FastASR

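This is a project that implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is the open-source wenetspeech dataset (10000+ hours) or Alibaba's private dataset (60000+ hours), so recognition quality is good and comparable to many commercial ASR products.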

Language: C | License: Apache-2.0 | 482 24 70

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | License: MIT | 460 13 50

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | 452 10 112

BeatNet

BeatNet is a state-of-the-art real-time and offline system for joint music beat, downbeat, tempo, and meter tracking using a CRNN and particle filtering. (Implementation of an ISMIR 2021 paper.)

Language: Python | License: CC-BY-4.0 | 318 9 27

mega

Sequence modeling with Mega.

Language: Python | License: MIT | 297 126 16

awesome-audiovisual-learning

A curated list of audio-visual learning methods and datasets.

221 9 2

Zero_Shot_Audio_Source_Separation

The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022

Language: Python | License: MIT | 184 7 19

Spherical-Array-Processing

A collection of MATLAB routines for acoustical array processing on spherical harmonic signals, commonly captured with a spherical microphone array.

Language: MATLAB | License: BSD-3-Clause | 163 12 2

pytorch-revgrad

A minimal PyTorch package implementing a gradient reversal layer.

Language: Python | License: MIT | 154 3 4

BIRD

Big Impulse Response Dataset

Language: Python | License: GPL-3.0 | 138 9 2

EasyComDataset

The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor, egocentric world view.

License: NOASSERTION | 102 10 7