yamand16

Dogucan Yaman's starred repositories

CREMA-D

Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)

Language: R | License: NOASSERTION | 34400

HDTF

The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"

Language: Python | License: GPL-3.0 | 34300

yt-dlp

A feature-rich command-line audio/video downloader

Language: Python | License: Unlicense | 8331800

av_hubert

A self-supervised learning framework for audio-visual speech

Language: Python | License: NOASSERTION | 83300

MDTVSFA

[official] Unified Quality Assessment of In-the-Wild Videos with Mixed Datasets Training (IJCV 2021)

Language: Python | License: MIT | 8200

StyleSync

Official code of CVPR '23 paper "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator"

Language: Python | 29000

Deep3DFaceRecon_pytorch

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.

Language: Python | License: MIT | 165600

awesome_talking_face_generation

81200

DINet

The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."

Language: Python | 96100

Palette-Image-to-Image-Diffusion-Models

Unofficial implementation of Palette: Image-to-Image Diffusion Models in PyTorch

Language: Python | License: MIT | 149600

PyTorch-VAE

A Collection of Variational Autoencoders (VAE) in PyTorch.

Language: Python | License: Apache-2.0 | 649600

GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Language: Python | License: NOASSERTION | 3560400

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | 641800

DeepLip

Deep-learning-based audio-visual lip biometrics

Language: Python | 1400

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Jupyter Notebook | License: Apache-2.0 | 886000

VL-T5

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)

Language: Python | License: MIT | 35700

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | 4686000

Awesome-Novel-Class-Discovery

A list of papers that study Novel Class Discovery

42600

gans-in-action

Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks

Language: Jupyter Notebook | 100800

muavic

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Language: Python | License: NOASSERTION | 35300

Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

Language: Python | License: BSD-3-Clause | 2784000

BasicSR

Open Source Image and Video Restoration Toolbox for Super-resolution, Denoising, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. It also supports StyleGAN2 and DFDNet.

Language: Python | License: Apache-2.0 | 669600

VQFR

ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Language: Python | License: NOASSERTION | 32500

facexlib

FaceXlib aims at providing ready-to-use face-related functions based on current SOTA open-source methods.

Language: Python | License: MIT | 82000

YouRefIt_ERU

Language: Python | License: MIT | 1900

AAAI22-one-shot-talking-face

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

Language: Python | 35200