
jasonwongw's starred repositories

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 418200

HDTF

The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"

Language: Python | License: GPL-3.0 | 8000

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

Language: Python | License: Apache-2.0 | 291600

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python | License: NOASSERTION | 61300

jetson_avatar

AI-Powered Photorealistic Talking Avatar

Language: Python | 1000

ChatWaifu_Mobile

Mobile version of the anime-style AI waifu chat app

Language: C++ | License: MIT | 119800

StyleFlow

StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)

Language: Python | 241100

VideoReTalking-HQ

VideoReTalking-HQ is a high-quality video retalking tool for enhancing and synchronizing video frames with audio inputs using advanced face enhancement techniques, including GFPGAN, and expression control.

Language: Python | 1300

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | 603800

PerVFI

Official code base of "Perception-Oriented Video Frame Interpolation via Asymmetric Blending" (CVPR 2024), also denoted as "PerVFI".

Language: Python | License: Apache-2.0 | 2600

Video-Infinity

Video-Infinity generates long videos quickly using multiple GPUs without extra training.

Language: Python | 10600

blsp-emo

BLSP-Emo: Towards Empathetic Large Speech-Language Models

Language: Python | License: Apache-2.0 | 2600

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.

Language: Python | 972700

Awesome-ChatTTS

Officially recommended ChatTTS resource collection project, gathering related resources from across the web along with frequently asked questions.

61400

PantoMatrix

PantoMatrix: Co-Speech Talking Head and Gestures Generation

Language: Python | License: NOASSERTION | 87200

hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Language: Python | License: MIT | 600400

TaleCrafter

[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters

24700

Yulan-GARDEN

Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"

Language: Python | 3800

SocialBook-AnimateAnyone

Language: Python | 7200

ToonCrafter

A research paper on generative cartoon interpolation

Language: Python | License: Apache-2.0 | 472500

ChatTTS-ui

A simple local web interface that uses ChatTTS to synthesize text into speech, with support for exposing an external API.

Language: Python | License: NOASSERTION | 508700

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: NOASSERTION | 2706100

FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.

Language: TypeScript | License: NOASSERTION | 1524600
