lujiale621's starred repositories
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stable-Hair
Stable-Hair: Real-World Hair Transfer via Diffusion Model
How-to-use-Transformers
A quick-start tutorial for the Transformers library
Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
IMAGDressing
👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing
Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
fish-speech
Brand new TTS solution
SoniTranslate
Synchronized Translation for Videos. Video dubbing
StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
video-mamba-suite
The suite of modeling video with Mamba
sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter
wesubtitle
Extract hard-coded video subtitles using OCR
ShareGPT4Video
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
video-subtitle-extractor
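Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.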
BilibiliSummary
A Chrome extension that helps you summarize videos on Bilibili.
GPT-SoVITS-Inference
Inference Specialization
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
RTranslator
Open source real-time translation app for Android that runs locally