Sheng Zhao's repositories
D-Guard-NLP
Anti-fraud text classification. Aims to apply NLP technology to combat various forms of fraud, particularly phone scams.
data_server
Temporary service
lobe-chat
🤯 Lobe Chat - an open-source, modern-design LLMs/AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Perplexity / Bedrock / Azure / Mistral / Ollama), multimodal features (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT chat application.
audio-preprocess
Preprocess Audio for training
D-Guard-TTS
[IJCAI2023] [DADA2023] Track 1.1 Champion. TTS/Voice Clone
downkyi
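downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads and 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.). https://t.me/+7zeNbdkP0TEzODll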
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM), a local knowledge-base QA app built on Langchain and language models such as ChatGLM
lip-reading-model
chinese-lip-reading
MirrorSite
Collection of mirror sites
Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
SpeechAlgorithms
Speech Algorithms
speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
ssspy
A Python toolkit for sound source separation.
stable-diffusion-webui
Stable Diffusion web UI
tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages