alvin zheng's repositories

EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

peclr

This is the pretraining code for PeCLR, an equivariant contrastive learning framework for 3D hand pose estimation. The paper was presented at ICCV 2021.

Stargazers: 1 | Issues: 0

yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Language: Python | License: GPL-3.0 | Stargazers: 1 | Issues: 0

3D-art-gallery

This is an interactive 3D art gallery made with Three.js, perfect for artists or designers to exhibit their portfolio of artworks and projects.

Stargazers: 0 | Issues: 0

AI-generated-characters

AI-generated-character

Stargazers: 0 | Issues: 0

chinese-poetry

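The most comprehensive database of classical Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Northern and Southern Song.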

License: MIT | Stargazers: 0 | Issues: 0

first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation

License: MIT | Stargazers: 0 | Issues: 0

ICON

[CVPR'22] ICON: Implicit Clothed humans Obtained from Normals

License: NOASSERTION | Stargazers: 0 | Issues: 0

Imatch-P

A demo that uses SuperGlue and SuperPoint for image matching, based on PaddlePaddle.

Stargazers: 0 | Issues: 0

KAIR

Image Restoration Toolbox (PyTorch). Training and testing code for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR

License: MIT | Stargazers: 0 | Issues: 0

LightGlue

LightGlue: Local Feature Matching at Light Speed (ICCV 2023)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

License: Apache-2.0 | Stargazers: 0 | Issues: 0

python-qrcode

Python QR Code image generator

License: NOASSERTION | Stargazers: 0 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

SimSwap

An arbitrary face-swapping framework on images and videos with one single trained model!

License: NOASSERTION | Stargazers: 0 | Issues: 0

stylegan2

StyleGAN2 - Official TensorFlow Implementation

License: NOASSERTION | Stargazers: 0 | Issues: 0

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

License: MIT | Stargazers: 0 | Issues: 0

Swin-Transformer-Semantic-Segmentation

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

SwinTransformer

Torch implementation of SwinTransformer

License: MIT | Stargazers: 0 | Issues: 0

Text2Video

ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".

Stargazers: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

License: MPL-2.0 | Stargazers: 0 | Issues: 0

undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

License: GPL-3.0 | Stargazers: 0 | Issues: 0

VGen

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Stargazers: 0 | Issues: 0

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

License: MIT | Stargazers: 0 | Issues: 0

yolov5_obb

yolov5 + csl_label. (Oriented Object Detection) (Rotation Detection) (Rotated BBox) Rotated object detection based on yolov5.

License: GPL-3.0 | Stargazers: 0 | Issues: 0