Beast code in Giters

Fine-tune the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Also accelerates inference and supports Web, Windows desktop, and Android deployment.

Language: C | License: Apache-2.0 | 74200

sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer

Language: C | License: NOASSERTION | 1101400

MMCSG

This repository contains the baseline system for the CHiME-8 MMCSG challenge, which focuses on transcribing both sides of a conversation where one participant is wearing smart glasses equipped with a microphone array and camera.

Language: Python | License: NOASSERTION | 2200

numpy_exercises

Numpy exercises.

Language: Python | License: MIT | 169500

RIR-Generator

Generating room impulse responses

Language: C++ | License: MIT | 40900

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | License: MIT | 1021000

stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Language: Python | License: MIT | 140400

jsalt2020_simulate

Training data simulation

Language: Python | License: Apache-2.0 | 3400

Beamforming-for-speech-enhancement

Simple delay-and-sum, MVDR, and CGMM-MVDR

Language: Python | 21800

makeMoE

From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook | License: MIT | 55300

gss

A simple package for Guided source separation (GSS)

Language: Python | License: MIT | 9800

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | 1914100

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | 1195800

machine-learning-roadmap

A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.

License: MIT | 738700

Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

Language: HTML | 1143500

GPT-SoVITS

One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | 2895500

ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Language: Python | License: MIT | 28300

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | 2019700

wxy1988

Steven Wang's starred repositories

SenseVoice

CosyVoice

SLAM-LLM

silero-vad

pytorch

speech-synthesis-paper

TeleSpeech-ASR

LoRA

Whisper-Finetune