xcarson's starred repositories

VideoLingo

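Netflix-level subtitle segmentation and translation, precise alignment, and personalized dubbing; one-click, fully automated video porting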

Language: Python | License: MIT | Stargazers: 787 | Issues: 0

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | Stargazers: 7129 | Issues: 0

License: MIT | Stargazers: 45 | Issues: 0

ReHiFace-S

Real Time High-Fidelity Faceswap

Language: Python | License: NOASSERTION | Stargazers: 220 | Issues: 0

fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

Language: Go | Stargazers: 21461 | Issues: 0

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23885 | Issues: 0

Deep-Live-Cam

Real-time face swap and one-click video deepfake using only a single image

Language: Python | License: AGPL-3.0 | Stargazers: 29753 | Issues: 0

EGamePlay

A flexible, generic, easy-to-extend, lightweight combat (skill) framework based on the Entity-Component pattern. Configuration can use either ScriptableObject or Excel tables.

Language: C# | License: MIT | Stargazers: 1901 | Issues: 0

flux

Official inference repo for FLUX.1 models

Language: Python | License: Apache-2.0 | Stargazers: 11870 | Issues: 0

AutoLOD

Automatic LOD generation + scene optimization

Language: C# | License: NOASSERTION | Stargazers: 1794 | Issues: 0

stack

Open-source Clerk/Auth0 alternative

Language: TypeScript | License: NOASSERTION | Stargazers: 2701 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 3953 | Issues: 0

EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Language: Python | License: Apache-2.0 | Stargazers: 2101 | Issues: 0

wiseflow

Wiseflow is an agile information mining tool that extracts concise messages from various sources such as websites, WeChat official accounts, social platforms, etc. It automatically categorizes and uploads them to the database.

Language: JavaScript | License: NOASSERTION | Stargazers: 3295 | Issues: 0

exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Language: Python | License: GPL-3.0 | Stargazers: 6030 | Issues: 0

mem0

The memory layer for Personalized AI

Language: Python | License: Apache-2.0 | Stargazers: 20160 | Issues: 0

CosyVoice

Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 4170 | Issues: 0

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | Stargazers: 933 | Issues: 0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 7102 | Issues: 0

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 2381 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 66555 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10628 | Issues: 0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stargazers: 10818 | Issues: 0

pipecat

Open Source framework for voice and multimodal conversational AI

Language: Python | License: BSD-2-Clause | Stargazers: 2894 | Issues: 0

pinus

A fast, scalable, distributed game and application server framework for Node.js, written in TypeScript (prototype based on pomelo).

Language: JavaScript | License: MIT | Stargazers: 1796 | Issues: 0

whatsapp-web.js

A WhatsApp client library for NodeJS that connects through the WhatsApp Web browser app

Language: JavaScript | License: Apache-2.0 | Stargazers: 14945 | Issues: 0

GPT-SoVITS

Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | Stargazers: 31588 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 3973 | Issues: 0

Streamer-Sales

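Streamer-Sales (销冠, "Sales Champion"): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation workflow❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️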

Language: Python | License: Apache-2.0 | Stargazers: 2196 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 25440 | Issues: 0