Wooseok Han (hwRG)

Company: @AITRICS

Location: SEOUL, REPUBLIC OF KOREA

Home Page: https://hwrg.github.io/

Wooseok Han's starred repositories

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python, License: MIT, Stargazers: 1642, Issues: 0

bigvsan

PyTorch implementation of BigVSAN

Language: Python, License: MIT, Stargazers: 183, Issues: 0

Language: Python, License: Apache-2.0, Stargazers: 454, Issues: 0

promptbase

All things prompt engineering

Language: Python, License: MIT, Stargazers: 5158, Issues: 0

ConfidenceIntervals

Confidence interval computation for evaluation in machine learning using the bootstrapping approach

Language: Jupyter Notebook, License: MIT, Stargazers: 55, Issues: 0

resource-stream

CUDA related news and material links

License: MIT, Stargazers: 878, Issues: 0

WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Language: Python, License: MIT, Stargazers: 1334, Issues: 0

UnitSpeech

An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 121, Issues: 0

voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch

Language: Python, License: MIT, Stargazers: 523, Issues: 0

Language: Python, License: CC-BY-4.0, Stargazers: 231, Issues: 0

titanet

Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO

Language: Jupyter Notebook, License: MIT, Stargazers: 54, Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook, License: MIT, Stargazers: 5252, Issues: 0

Whispering-LLaMA

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

Language: Jupyter Notebook, License: MIT, Stargazers: 199, Issues: 0

ATST-SED

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Language: Jupyter Notebook, License: MIT, Stargazers: 50, Issues: 0

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

Stargazers: 9657, Issues: 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook, License: BSD-2-Clause, Stargazers: 2223, Issues: 0

whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 4145, Issues: 0

NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Language: Python, License: MIT, Stargazers: 593, Issues: 0

NoiseTorch

Real-time microphone noise suppression on Linux.

Language: Go, License: NOASSERTION, Stargazers: 9034, Issues: 0

whisper-onnx-tensorrt

ONNX and TensorRT implementation of Whisper

Language: Python, License: MIT, Stargazers: 47, Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python, License: MIT, Stargazers: 9420, Issues: 0

Awesome-Korean-Speech-Recognition

A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.

License: CC0-1.0, Stargazers: 275, Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python, License: MIT, Stargazers: 19870, Issues: 0

text-generation-webui-colab

A colab gradio web UI for running Large Language Models

Language: Jupyter Notebook, License: Unlicense, Stargazers: 2052, Issues: 0

CLAP

Contrastive Language-Audio Pretraining

Language: Python, License: CC0-1.0, Stargazers: 1198, Issues: 0

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Language: Python, License: MIT, Stargazers: 1215, Issues: 0

Language: Python, License: Apache-2.0, Stargazers: 44, Issues: 0

gomin

GOMIN; Gaudio Open Mel-spectrogram Inversion Network

Language: Python, License: MIT, Stargazers: 109, Issues: 0

train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

Language: Python, License: MIT, Stargazers: 619, Issues: 0

open_clip

An open source implementation of CLIP.

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 8738, Issues: 0