douzi0248

douzi's starred repositories

Tianchi-LLM-QA

Alibaba Tianchi: 2023 Global Intelligent Vehicle AI Challenge, Track 1: LLM retrieval-based question answering, baseline scoring 80+

Language: Python | 7100

BetterMixture-Top1-Solution

First-place (top-1) solution for the Tianchi algorithm competition "BetterMixture: LLM Data Mixing Challenge"

Language: Python | 2200

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Jupyter Notebook | License: MIT | 681100

edgeai-tidl-tools

Edge AI TIDL Tools and Examples - This repository contains tools and examples developed for the Deep Learning Runtime (DLRT) offering provided by TI's edge AI solutions.

Language: Python | License: NOASSERTION | 13500

open-webui

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Language: Svelte | License: MIT | 4485800

qwen2-sft

Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat and Qwen_Qwen1.5-7B-Chat

Language: Python | License: Apache-2.0 | 4300

SummerTTS-python

Language: C++ | License: MIT | 300

SummerTTS

SummerTTS is a standalone, C++-based Chinese and English speech synthesis (TTS) project. It runs locally without a network connection, has no additional dependencies, and is ready for Chinese and English speech synthesis after a one-step build.

Language: C++ | 40000

tts-demo

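Supports male and female voices with a range of emotions; supports real-time and offline text-to-speech (TTS) synthesis; supports voice conversion for a single voice model, speech rate adjustment, and volume adjustment; supports custom voice models.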

Language: Java | 5700

ollama-python

Ollama Python library

Language: Python | License: MIT | 436700

MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Language: Jupyter Notebook | License: Apache-2.0 | 708700

sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust

Language: C++ | License: Apache-2.0 | 347400

OpenVoiceChat

Have a natural voice conversation with an LLM

Language: Python | License: Apache-2.0 | 22000

RealtimeSTT_LLM_TTS

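Real-time STT connected to the OpenAI API or Zhipu AI (streaming LLM) and to GPT-SOVITS or Edge-TTS, with services called across networks through a web page to achieve real-time conversation.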

Language: Python | License: MIT | 24400

FunAudioLLM-APP

Language: Python | License: MIT | 28000

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | 1243100

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | 3509600

OpenVoice

Instant voice cloning by MIT and MyShell.

Language: Python | License: MIT | 2957000

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | 1369400

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | 3492700

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | 3195700

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language: Python | License: Apache-2.0 | 597500

self-llm

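"Open-Source LLM Usage Guide": quickly deploy open-source large models in a Linux environment; a deployment tutorial better suited to ** beginners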

Language: Jupyter Notebook | License: Apache-2.0 | 898800

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION | 325900

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | 414700

faster-whisper-GUI

faster_whisper GUI with PySide6

Language: Python | License: AGPL-3.0 | 158500

WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Language: Python | License: MIT | 198800

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 7054400

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

Language: Python | License: NOASSERTION | 675400