vincentliuheyang

Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with an online demo and local deployment; all code and models are open source; supports Windows, macOS, and Linux)

Language: Python | 499300

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, try Sync Labs.

Language: Python | 963900

SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Language: Python | License: NOASSERTION | 1101300

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | 595200

OpenDelta

A plug-and-play library for parameter-efficient-tuning (Delta Tuning)

Language: Python | License: Apache-2.0 | 95100

backgroundremover

Background Remover lets you remove the background from images and video using AI through a simple command-line interface. It is free and open source.

Language: Python | License: MIT | 640700

rembg

Rembg is a tool to remove image backgrounds.

Language: Python | License: MIT | 1520000

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | 455200

ActionRoguelike

Third-person Action Roguelike made in Unreal Engine C++. Project for Unreal Engine C++ Course & Stanford University

Language: C++ | 332700

multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and VAE optimization, licensed under CC BY-NC-SA 4.0

Language: Python | License: NOASSERTION | 455800

ComfyUI

The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | 3947700

FateZero

[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"

Language: Jupyter Notebook | License: MIT | 106900

CoDeF

[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

Language: Python | License: NOASSERTION | 478200

IF

Language: Python | License: NOASSERTION | 755500

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | 2963500

PALM-E

Implementation of "PaLM-E: An Embodied Multimodal Language Model"

Language: Python | License: Apache-2.0 | 22400

VIMA

Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"

Language: Python | License: MIT | 70600

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | 3348500

GLIGEN

Open-Set Grounded Text-to-Image Generation

Language: Python | License: MIT | 187900

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language: Python | License: MIT | 181100

CogVideo

Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

Language: Python | License: Apache-2.0 | 353700

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | 254900