alephpi

润心's starred repositories

multinomial_diffusion

Language: Python · 19500

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Language: Jupyter Notebook · License: AGPL-3.0 · 236000

v2rayA

A web GUI client of Project V which supports VMess, VLESS, SS, SSR, Trojan, Tuic and Juicity protocols. 🚀

Language: Go · License: AGPL-3.0 · 1070200

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language: Python · License: Apache-2.0 · 403400

vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Language: Jupyter Notebook · License: Apache-2.0 · 758700

slidev-theme-frankfurt

A theme for Slidev, inspired by the Frankfurt theme in Beamer.

Language: Vue · 1800

yutto

:ice_cube: A cute and willful Bilibili video downloader (bilili V2)

Language: Python · License: GPL-3.0 · 93800

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Language: Python · License: MIT · 189700

media-get

Get the media through the url

Language: Go · License: Apache-2.0 · 25100

JianZiPu

A font for writing Guqin music in JianZiPu.

Language: JavaScript · License: MIT · 1400

FontDiffuser

[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Language: Python · 24800

Font-diff

Language: Python · 9700

SDT

This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR23).

Language: Python · License: MIT · 95400

Python-Wrapper-for-World-Vocoder

A Python wrapper for the high-quality vocoder "World"

Language: Cython · License: MIT · 71700

World

A high-quality speech analysis, manipulation and synthesis system

Language: C++ · License: NOASSERTION · 116100

pits

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

Language: Python · License: MIT · 27300

WenetSpeech

A 10000+ hours dataset for Chinese speech recognition

Language: Shell · License: Apache-2.0 · 48600

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML · License: Apache-2.0 · 777900

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python · License: MIT · 337000

fish-speech

Brand new TTS solution

Language: Python · License: NOASSERTION · 718700

Sapphire-TTS-Collection

Language: Python · 800

eindex

Multidimensional indexing for tensors

Language: Jupyter Notebook · 10700

einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Language: Python · License: MIT · 829600

torchcrepe

Pytorch implementation of the CREPE pitch tracker

Language: Python · License: MIT · 39400

ppgs

High-Fidelity Neural Phonetic Posteriorgrams

Language: Python · License: MIT · 6900

OCR_DataSet

Collects and organizes OCR datasets and unifies their annotation format for use in experiments

Language: Python · 85800

librime-lua

Extending RIME with Lua scripts

Language: C++ · License: BSD-3-Clause · 29700

ibus-rime.AppImage

Language: Shell · 1800

css10

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

Language: HTML · License: Apache-2.0 · 45600

zm-text-tts

[IJCAI'23] Learning to Speak from Text for Low-Resource TTS

Language: Python · License: Apache-2.0 · 6300