kang7367's repositories
whisper-timestamped
Multilingual Automatic Speech Recognition with Word-level Timestamps
aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
algorithms-book-py
My published book on python, algorithms, and data structures
AwesomeKorean_Data
Korean dataset links
Bard-API
The unofficial python package that returns response of Google Bard through cookie value.
collaboration
Practice repository for the book "모두의 깃&깃허브" ("Git & GitHub for Everyone", Gilbut)
computing-Korean-STT-error-rates
A Python function package that computes the character error rate (CER) and word error rate (WER) of output transcripts from a Korean STT sentence recognizer
espnet
End-to-End Speech Processing Toolkit
faster-whisper
Faster Whisper transcription with CTranslate2
flores
Facebook Low Resource (FLoRes) MT Benchmark
hunspell-dict-ko
Korean spellchecking dictionary for Hunspell
Korean-Streaming-ASR
Korean Streaming ASR (with Denoiser and Conformer CTC)
llama
User-friendly LLaMA: Train or Run the model using PyTorch. Nothing else.
open-apis-korea
🇰🇷 A collection of open APIs for use in services for Korean-language users
openai-cookbook
Examples and guides for using the OpenAI API
py-hanspell
A Python library for Korean spell checking (uses the Naver spell checker)
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
RSPapers
A Curated List of Must-read Papers on Recommender System.
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
sgmse
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
tensorflow
An Open Source Machine Learning Framework for Everyone
test-repo
My first github repository!
UnitSpeech
An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"
vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
whisper-asr-webservice
OpenAI Whisper ASR Webservice API
youtube-dl
Command-line program to download videos from YouTube.com and other video sites