
JungwonChang's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 133175 1118 15890

yt-dlp

A feature-rich command-line audio/video downloader

Language: Python | License: Unlicense | 84439 502 7850

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 69201 5750

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | 47061 305 663

FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git

Language: C | License: NOASSERTION | 45343 14400

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | 19127 280 2917

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | 17799 156 1274

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | 16059 109 1053

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | 8690 133 1089

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Jupyter Notebook | License: Apache-2.0 | 7623 106 291

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | 6084 71 990

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | 4129 50 230

distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Language: Python | License: MIT | 3556 65 103

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | 2229 44 397

Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

Language: Python | License: MIT | 2187 70 208

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python | License: MIT | 1829 21 181

KoAlpaca

KoAlpaca: an open-source language model that understands Korean instructions

Language: Jupyter Notebook | License: Apache-2.0 | 1537 29 99

awesome-whisper

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

License: CC0-1.0 | 1218 220

voca

This codebase demonstrates how to synthesize realistic 3D character animations given an arbitrary speech signal and a static character mesh.

Language: Python | 1142 42 118

sox

SoX, Swiss Army knife of sound processing

Language: C | License: NOASSERTION | 698 280

whispering

Streaming transcriber with whisper

Language: Python | License: MIT | 683 19 41

MonocularTotalCapture

Code for CVPR19 paper "Monocular Total Capture: Posing Face, Body and Hands in the Wild"

Language: C++ | 662 36 62

ICT-FaceKit

ICT's Vision and Graphics Lab's morphable face model and toolkit

Language: Python | License: MIT | 644 35 14

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python | License: NOASSERTION | 526 32 28

FaceMeshFaceGeometry

FaceMeshFaceGeometry for FaceMesh

Language: JavaScript | License: MIT | 401 12 9

community-events

Place where folks can contribute to 🤗 community events

Language: Jupyter Notebook | 397 52 32

open-korean-instructions

A collection of public Korean instruction datasets for training language models.

Language: Python | 343 50

ctc-segmentation

Segment an audio file and obtain utterance alignments. (Python package)

Language: Python | License: Apache-2.0 | 319 13 29

jtubespeech

Language: Python | License: Apache-2.0 | 211 10 8

Knowledge-Distillation-Toolkit

⛔ [DEPRECATED] A knowledge distillation toolkit based on PyTorch and PyTorch Lightning.

Language: Python | License: Apache-2.0 | 136 14 7