vvictoryuki

Jiwen Yu's starred repositories

Voyager

An Open-Ended Embodied Agent with Large Language Models

Language: JavaScript · MIT · 541100

DCCM

Compressive Confocal Microscopy Imaging at the Single-Photon Level with Ultra-Low Sampling Ratios (Communications Engineering 2024) [PyTorch]

Language: Python · 700

Make-A-Protagonist

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts

Language: Python · Apache-2.0 · 31600

Prompt-Free-Diffusion

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arXiv 2023 / CVPR 2024

Language: Python · MIT · 71600

Fantasia3D

(ICCV2023) official repository for "Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation"

Language: Python · Apache-2.0 · 71800

Video, image, and GIF upscaling/enlargement (super-resolution) and video frame interpolation. Achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IFRNet, CAIN, DAIN, and ACNet.

Language: C++ · NOASSERTION · 1256500

DragGAN

Official Code for DragGAN (SIGGRAPH 2023)

Language: Python · NOASSERTION · 3561200

MasaCtrl

[ICCV 2023] Consistent Image Synthesis and Editing

Language: Python · Apache-2.0 · 68700

layered-neural-atlases

Language: Python · MIT · 58200

All-In-One-Deflicker

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Language: Python · 67000

StableSR

[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution

Language: Python · NOASSERTION · 201800

ImageBind

ImageBind: One Embedding Space to Bind Them All

Language: Python · NOASSERTION · 812400

learning_research

My personal research experience

505300

Personalize-SAM

Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds

Language: Python · MIT · 147800

threestudio

A unified framework for 3D content generation.

Language: Python · Apache-2.0 · 600300

stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

Language: Python · Apache-2.0 · 806500

SD-CN-Animation

This script automates video stylization tasks using Stable Diffusion and ControlNet.

Language: Python · MIT · 80600

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook · BSD-3-Clause · 932300

dolphin

General video interaction platform based on LLMs, including Video ChatGPT

Language: Python · MIT · 24800

IJCAI2023-CoNR

IJCAI2023 - Collaborative Neural Rendering using Anime Character Sheets

Language: Jupyter Notebook · MIT · 79100

matting_human_datasets

Portrait matting dataset containing 34,427 images and their corresponding matting results.

NOASSERTION · 59900

IF

Language: Python · NOASSERTION · 759200

mmagic

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.

Language: Jupyter Notebook · Apache-2.0 · 677600

GPT4Tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

Language: Python · NOASSERTION · 74600

Text2Performer

Code for Text2Performer. Paper: Text2Performer: Text-Driven Human Video Generation

Language: Python · NOASSERTION · 31200

lama

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Language: Jupyter Notebook · Apache-2.0 · 764200

SadTalker-Video-Lip-Sync

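This project implements Wav2lip-style video lip synthesis based on SadTalkers. It takes a video file as input and generates lip shapes driven by speech, and it applies a configurable enhancement to the facial region to sharpen the synthesized lip (face) area and improve the clarity of the generated lips. It uses DAIN, a deep-learning frame interpolation algorithm, to interpolate frames in the generated video, filling in the motion transitions between synthesized lip frames so that the result looks smoother, more realistic, and more natural.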

Language: Python · 175000

FlowFormer-Official

Language: Python · Apache-2.0 · 39100

stylegan-t

[ICML'23] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

Language: Python · NOASSERTION · 114100