alvin zheng's repositories
3D-art-gallery
This is an interactive 3D art gallery made with Three.js, perfect for artists or designers to exhibit their portfolio of artworks and projects.
AI-generated-characters
AI-generated-character
chinese-poetry
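The most comprehensive database of classical Chinese poetry 🧶: nearly 14,000 poets from the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Song period.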
first-order-model
This repository contains the source code for the paper First Order Motion Model for Image Animation
ICON
[CVPR'22] ICON: Implicit Clothed humans Obtained from Normals
Imatch-P
A demo that uses SuperGlue and SuperPoint for image matching, based on PaddlePaddle.
KAIR
Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR
LightGlue
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
python-qrcode
Python QR Code image generator
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
SimSwap
An arbitrary face-swapping framework on images and videos with one single trained model!
stylegan2
StyleGAN2 - Official TensorFlow Implementation
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Swin-Transformer-Semantic-Segmentation
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.
SwinTransformer
torch implementation of SwinTransformer
Text2Video
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
undetected-chromedriver
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
VGen
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
yolov5_obb
yolov5 + csl_label (Oriented Object Detection) (Rotation Detection) (Rotated BBox): rotated object detection based on yolov5