Beast code in Giters

Whisper-Finetune

Fine-tune the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Also accelerates inference and supports Web, Windows desktop, and Android deployment.

Language: C | Apache-2.0 | 767 0 0

sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer

Language: C | NOASSERTION | 11219 0 0

MMCSG

This repository contains the baseline system for the CHiME-8 MMCSG challenge, which focuses on transcribing both sides of a conversation in which one participant wears smart glasses equipped with a microphone array and a camera.

Language: Python | NOASSERTION | 24 0 0

numpy_exercises

NumPy exercises.

Language: Python | MIT | 1695 0 0

RIR-Generator

Generating room impulse responses

Language: C++ | MIT | 412 0 0

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | MIT | 10677 0 0

stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Language: Python | MIT | 1436 0 0

jsalt2020_simulate

Training data simulation

Language: Python | Apache-2.0 | 35 0 0

Beamforming-for-speech-enhancement

Simple delay-and-sum, MVDR, and CGMM-MVDR

Language: Python | 222 0 0

makeMoE

From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook | MIT | 565 0 0

gss

A simple package for guided source separation (GSS)

Language: Python | MIT | 100 0 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | MIT | 19307 0 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | BSD-3-Clause | 12759 0 0

machine-learning-roadmap

A roadmap connecting many of the most important concepts in machine learning, how to learn them, and what tools to use to apply them.

MIT | 7425 0 0

Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

Language: HTML | 11576 0 0

GPT-SoVITS

One minute of voice data can be enough to train a good TTS model (few-shot voice cloning).

Language: Python | MIT | 30251 0 0

ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Language: Python | MIT | 311 0 0

wxy1988

Steven Wang's starred repositories

LLaMA-Factory

notepad--

phonemizer

UniSpeech

mongolian-nlp

SenseVoice

CosyVoice

SLAM-LLM

silero-vad

pytorch

speech-synthesis-paper

TeleSpeech-ASR

LoRA

Whisper-Finetune