Liumeng Xue (lmxue)

Company: Northwestern Polytechnical University

Location: Xi'an, Shaanxi

Home Page: https://lmxue.github.io/

Liumeng Xue's starred repositories

screenshot-to-code

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Language: Python | License: MIT | Stargazers: 55577 | Issues: 322 | Issues: 284

WeChatMsg

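Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.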

Language: Python | License: GPL-3.0 | Stargazers: 32330 | Issues: 170 | Issues: 395

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 21237 | Issues: 178 | Issues: 440

MoneyPrinterTurbo

Generate high-definition short videos with one click using large AI language models.

Language: Python | License: MIT | Stargazers: 15634 | Issues: 132 | Issues: 358

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 12696 | Issues: 169 | Issues: 507

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | Stargazers: 11124 | Issues: 161 | Issues: 250

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python | License: Apache-2.0 | Stargazers: 7513 | Issues: 96 | Issues: 1524

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 7343 | Issues: 89 | Issues: 120

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | Stargazers: 7122 | Issues: 61 | Issues: 302

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 7066 | Issues: 63 | Issues: 147

ComfyUI-Workflows-ZHO

My ComfyUI workflows collection

AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Language: Python | License: Apache-2.0 | Stargazers: 4414 | Issues: 62 | Issues: 177

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4377 | Issues: 57 | Issues: 146

ThinkDSP

Think DSP: Digital Signal Processing in Python, by Allen B. Downey.

Language: Jupyter Notebook | Stargazers: 3875 | Issues: 236 | Issues: 57

resemble-enhance

AI powered speech denoising and enhancement

Language: Python | License: MIT | Stargazers: 1183 | Issues: 16 | Issues: 39

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | Stargazers: 1146 | Issues: 57 | Issues: 50

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stargazers: 1074 | Issues: 26 | Issues: 72

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | License: Apache-2.0 | Stargazers: 914 | Issues: 44 | Issues: 406

WavCraft

Official repo for WavCraft, an AI agent for audio creation and editing

Language: Python | License: NOASSERTION | Stargazers: 648 | Issues: 71 | Issues: 1

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

spear-tts-pytorch

Implementation of Spear-TTS, a multi-speaker text-to-speech attention network, in PyTorch

Language: Python | License: MIT | Stargazers: 249 | Issues: 28 | Issues: 6

frechet-audio-distance

A lightweight library for Frechet Audio Distance calculation.

Language: Python | License: MIT | Stargazers: 224 | Issues: 2 | Issues: 12

Codec-SUPERB

Audio Codec Speech processing Universal PERformance Benchmark

openai_trtllm

OpenAI compatible API for TensorRT LLM triton backend

Language: Rust | License: MIT | Stargazers: 131 | Issues: 6 | Issues: 14

SpeechTasks

This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.

VoicePAT

VoicePAT is a modular and efficient toolkit for voice privacy research, with a main focus on speaker anonymization.

Language: Shell | License: Apache-2.0 | Stargazers: 46 | Issues: 5 | Issues: 5

Open-Suno

Trying to reproduce Suno v3

License: MIT | Stargazers: 23 | Issues: 3 | Issues: 0

tarzan

High-level API for tar-based dataset

Language: Python | Stargazers: 10 | Issues: 3 | Issues: 0