Steven Wang's starred repositories

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 28400 | Issues: 0

notepad--

A text editor that supports Windows/Linux/Mac; the goal is to be ** people's own editor, from **.

Language: C++ | License: GPL-3.0 | Stargazers: 5395 | Issues: 0

phonemizer

Simple text to phones converter for multiple languages

Language: Python | License: GPL-3.0 | Stargazers: 1168 | Issues: 0

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Language: Python | License: NOASSERTION | Stargazers: 412 | Issues: 0

mongolian-nlp

Useful resources for Mongolian NLP

Language: Jupyter Notebook | Stargazers: 168 | Issues: 0

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION | Stargazers: 1885 | Issues: 0

CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 3368 | Issues: 0

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | Stargazers: 432 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 3555 | Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: Python | License: NOASSERTION | Stargazers: 81032 | Issues: 0

speech-synthesis-paper

List of speech synthesis papers.

License: MIT | Stargazers: 978 | Issues: 0
Language: Python | Stargazers: 420 | Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 9996 | Issues: 0

Whisper-Finetune

Fine-tune the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Also accelerates inference and supports Web deployment, Windows desktop deployment, and Android deployment.

Language: C | License: Apache-2.0 | Stargazers: 767 | Issues: 0

sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer

Language: C | License: NOASSERTION | Stargazers: 11219 | Issues: 0

MMCSG

This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one participant is wearing smart glasses equipped with a microphone array and camera.

Language: Python | License: NOASSERTION | Stargazers: 24 | Issues: 0

numpy_exercises

Numpy exercises.

Language: Python | License: MIT | Stargazers: 1695 | Issues: 0

RIR-Generator

Generating room impulse responses

Language: C++ | License: MIT | Stargazers: 412 | Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | License: MIT | Stargazers: 10677 | Issues: 0

stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Language: Python | License: MIT | Stargazers: 1436 | Issues: 0

jsalt2020_simulate

Training data simulation

Language: Python | License: Apache-2.0 | Stargazers: 35 | Issues: 0

Beamforming-for-speech-enhancement

Simple delay-and-sum, MVDR, and CGMM-MVDR

Language: Python | Stargazers: 222 | Issues: 0

makeMoE

From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook | License: MIT | Stargazers: 565 | Issues: 0

gss

A simple package for Guided source separation (GSS)

Language: Python | License: MIT | Stargazers: 100 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19307 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 12759 | Issues: 0

machine-learning-roadmap

A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.

License: MIT | Stargazers: 7425 | Issues: 0

Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

Language: HTML | Stargazers: 11576 | Issues: 0

GPT-SoVITS

Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | Stargazers: 30251 | Issues: 0

ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Language: Python | License: MIT | Stargazers: 311 | Issues: 0