Ethan's repositories
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
actionformer_release
Code release for ActionFormer (ECCV 2022)
AMR-Benchmark
A Unified Implementation of Several Baseline Deep Learning Models for Automatic Modulation Recognition
asr
Shanghainese (Hu dialect) ASR (speech recognition) model
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
AutoX
AutoX is an efficient AutoML tool aimed mainly at data mining tasks on tabular data.
bark
🔊 Text-prompted Generative Audio Model
bisheng
Bisheng is an open LLM DevOps platform for next-generation AI applications.
ctc_decoder
A CTC decoder for both online and offline ASR models
DecryptPrompt
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
HierSpeechpp
The official implementation of HierSpeech++
HowToLiveLonger
A programmer's guide to living longer
kws
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection
LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
phkit
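Phoneme toolkit. An easy-to-use phoneme processing toolbox with modules for Chinese phonemes, English phonemes, text-to-pinyin conversion, text normalization, and more.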
Pix2Text
Pix In, Latex & Text Out. Recognize Chinese, English Texts, and Math Formulas from Images.
Rerender_A_Video
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
simple_ddp_test
Toy code for DDP testing
SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
StyleTTS
Official Implementation of StyleTTS
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit