vincentliuheyang's starred repositories

Language: JavaScript | License: MIT | Stargazers: 108 | Issues: 0

kitti360Scripts

This repository contains utility scripts for the KITTI-360 dataset.

Language: Python | License: MIT | Stargazers: 364 | Issues: 0

Wuerstchen

Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models

Language: Jupyter Notebook | License: MIT | Stargazers: 510 | Issues: 0

YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 5597 | Issues: 0

facechain

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Language: Python | License: Apache-2.0 | Stargazers: 8594 | Issues: 0
Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 938 | Issues: 0

FaceFormer

[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers

Language: Python | License: MIT | Stargazers: 749 | Issues: 0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 31428 | Issues: 0

DragGAN

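Unofficial implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with an online demo and local deployment for trial use; code and models are fully open source; supports Windows, macOS, Linux)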

Language: Python | Stargazers: 4993 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Language: Python | Stargazers: 9639 | Issues: 0

SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Language: Python | License: NOASSERTION | Stargazers: 11013 | Issues: 0

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | Stargazers: 5952 | Issues: 0

OpenDelta

A plug-and-play library for parameter-efficient-tuning (Delta Tuning)

Language: Python | License: Apache-2.0 | Stargazers: 951 | Issues: 0

backgroundremover

Background Remover lets you remove the background from images and video using AI, through a simple command-line interface that is free and open source.

Language: Python | License: MIT | Stargazers: 6407 | Issues: 0

rembg

Rembg is a tool to remove image backgrounds.

Language: Python | License: MIT | Stargazers: 15200 | Issues: 0

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4552 | Issues: 0

ActionRoguelike

Third-person Action Roguelike made in Unreal Engine C++. Project for Unreal Engine C++ Course & Stanford University

Language: C++ | Stargazers: 3327 | Issues: 0

multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0

Language: Python | License: NOASSERTION | Stargazers: 4558 | Issues: 0

ComfyUI

The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | Stargazers: 39477 | Issues: 0

FateZero

[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"

Language: Jupyter Notebook | License: MIT | Stargazers: 1069 | Issues: 0

CoDeF

[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

Language: Python | License: NOASSERTION | Stargazers: 4782 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 7555 | Issues: 0

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 29635 | Issues: 0

PALM-E

Implementation of "PaLM-E: An Embodied Multimodal Language Model"

Language: Python | License: Apache-2.0 | Stargazers: 224 | Issues: 0

VIMA

Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"

Language: Python | License: MIT | Stargazers: 706 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33485 | Issues: 0

GLIGEN

Open-Set Grounded Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 1879 | Issues: 0

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language: Python | License: MIT | Stargazers: 1811 | Issues: 0

CogVideo

Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

Language: Python | License: Apache-2.0 | Stargazers: 3537 | Issues: 0

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2549 | Issues: 0