Sheng Zhao (nuaazs)

Company: Nanjing University of Aeronautics and Astronautics

Location: Pavia, Italy

Sheng Zhao's repositories

VAF_2

Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

Language: Python | Stargazers: 406 | Issues: 5 | Issues: 0

AskLLM

AI-driven, adaptive customer service agent.

Language: Python | Stargazers: 5 | Issues: 0 | Issues: 0

D-Guard-NLP

Anti-fraud text classification. Aims to apply NLP technology to combat various forms of fraud, particularly phone scams.

Language: Python | Stargazers: 3 | Issues: 2 | Issues: 0

ScanNetAI

ScanNetAI is an advanced self-supervised deep learning model tailored for CT image analysis. It excels in processing large-scale CT data, offering superior performance in tasks like image segmentation, medical image conversion, and dosage prediction.

Language: Jupyter Notebook | Stargazers: 3 | Issues: 3 | Issues: 0

blogs

blogs and notes

nuaazs

zhaosheng@nuaa.edu.cn

wav_utils

A collection of telephony channel audio processing tools. #Voiceprint #Speaker Recognition

Language: Shell | Stargazers: 2 | Issues: 1 | Issues: 0

data_server

Temporary service

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

License: AGPL-3.0 | Stargazers: 1 | Issues: 0 | Issues: 0

lobe-chat

🤯 Lobe Chat - an open-source, modern-design LLMs/AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Perplexity / Bedrock / Azure / Mistral / Ollama), multi-modal features (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT chat application.

License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

audio-preprocess

Preprocess Audio for training

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

D-Guard-TTS

[IJCAI2023] [DADA2023] Track 1.1 Champion. TTS/Voice Clone

Stargazers: 0 | Issues: 2 | Issues: 0

downkyi

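Downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads and 8K, HDR, and Dolby Vision video, and provides a toolbox (audio/video extraction, watermark removal, etc.). https://t.me/+7zeNbdkP0TEzODll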

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local knowledge-base QA app built with Langchain and LLMs such as ChatGLM.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

lip-reading-model

chinese-lip-reading

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

MirrorSite

A collection of mirror sites

Stargazers: 0 | Issues: 0 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

SpeechAlgorithms

Speech Algorithms

Language: C | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ssspy

A Python toolkit for sound source separation.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

stable-diffusion-webui

Stable Diffusion web UI

License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

vall-e

PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Visual_Speech_Recognition_for_Multiple_Languages

Visual Speech Recognition for Multiple Languages

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0