jasonwongw

jasonwongw's starred repositories

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python, License: MIT, Stargazers: 4182, Issues: 0

HDTF

The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"

Language: Python, License: GPL-3.0, Stargazers: 80, Issues: 0

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

Language: Python, License: Apache-2.0, Stargazers: 2916, Issues: 0

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python, License: NOASSERTION, Stargazers: 613, Issues: 0

jetson_avatar

AI-Powered Photorealistic Talking Avatar

Language: Python, Stargazers: 10, Issues: 0

ChatWaifu_Mobile

Mobile version of an anime-style AI waifu chat app

Language: C++, License: MIT, Stargazers: 1198, Issues: 0

StyleFlow

StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)

Language: Python, Stargazers: 2411, Issues: 0

VideoReTalking-HQ

VideoReTalking-HQ is a high-quality video retalking tool for enhancing and synchronizing video frames with audio inputs using advanced face enhancement techniques, including GFPGAN, and expression control.

Language: Python, Stargazers: 13, Issues: 0

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python, License: Apache-2.0, Stargazers: 6038, Issues: 0

PerVFI

Official code base of "Perception-Oriented Video Frame Interpolation via Asymmetric Blending" (CVPR 2024), also denoted as "PerVFI".

Language: Python, License: Apache-2.0, Stargazers: 26, Issues: 0

Video-Infinity

Video-Infinity generates long videos quickly using multiple GPUs without extra training.

Language: Python, Stargazers: 106, Issues: 0

blsp-emo

BLSP-Emo: Towards Empathetic Large Speech-Language Models

Language: Python, License: Apache-2.0, Stargazers: 26, Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Language: Python, Stargazers: 9727, Issues: 0

Awesome-ChatTTS

An officially recommended ChatTTS resource collection project that gathers related resources from across the web along with frequently asked questions.

Stargazers: 614, Issues: 0

PantoMatrix

PantoMatrix: Co-Speech Talking Head and Gestures Generation

Language: Python, License: NOASSERTION, Stargazers: 872, Issues: 0

hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Language: Python, License: MIT, Stargazers: 6004, Issues: 0

TaleCrafter

[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters

Stargazers: 247, Issues: 0

Yulan-GARDEN

Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"

Language: Python, Stargazers: 38, Issues: 0
Language: Python, Stargazers: 72, Issues: 0

ToonCrafter

A research paper for generative cartoon interpolation

Language: Python, License: Apache-2.0, Stargazers: 4725, Issues: 0

ChatTTS-ui

A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.

Language: Python, License: NOASSERTION, Stargazers: 5087, Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python, License: NOASSERTION, Stargazers: 27061, Issues: 0

FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without the need for extensive setup or configuration.

Language: TypeScript, License: NOASSERTION, Stargazers: 15246, Issues: 0

Steerable-Motion

A ComfyUI node for driving videos using batches of images.

Language: Python, License: NOASSERTION, Stargazers: 750, Issues: 0

MagicTime

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Language: Python, License: Apache-2.0, Stargazers: 1213, Issues: 0

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python, License: NOASSERTION, Stargazers: 1846, Issues: 0

srs

SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.

Language: C++, License: MIT, Stargazers: 24756, Issues: 0

RWKV-Runner

An RWKV management and startup tool: fully automated and only 8 MB. It also provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.

Language: TypeScript, License: MIT, Stargazers: 4775, Issues: 0

AI-Writer

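AI novel writing that generates fantasy (xuanhuan), romance, and other web fiction. A Chinese pretrained generative model that uses my RWKV model, which is similar to GPT-2. RWKV for Chinese novel generation.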

Language: Python, License: Apache-2.0, Stargazers: 2800, Issues: 0

style2paints

sketch + style = paints (TOG 2018 / SIGGRAPH Asia 2018)

Language: JavaScript, License: Apache-2.0, Stargazers: 17886, Issues: 0