shaohua.zhang's repositories
AI_Tutorial
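A curated collection of learning materials on machine learning, NLP, image recognition, deep learning, and other AI fields, plus technical resources on search, recommendation, and advertising system architectures and algorithms.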
AlphX-Code-For-DAR
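Source code from team Alphx for the Ancient Document Image Recognition and Analysis track of the Guangdong-Hong Kong-Macao Greater Bay Area (Huangpu) International Algorithm Case Competition.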
awesome-digital-human
A collection of resources on digital humans, including clothed people digitalization, virtual try-on, and other related directions.
cat-catch
Cat Catch (猫抓): a resource-sniffing extension for Chrome.
CenterNet
Object detection, 3D detection, and pose estimation using center point detection.
char-detection
🔥 Character (single-character) detection based on CRNN.
Code-LMs
Guide to using pre-trained large language models of source code
CVprojects
Computer vision projects | fun AI projects related to computer vision (Python, C++)
danbooru-diffusion-prompt-builder
A tag supermarket for Danbooru / NovelAI
DAVAR-Lab-OCR
Implementations of some works from Davar-Lab. Currently we have the code of Text Perceptron (AAAI 2020). Code for other works will be published soon, including YORO (ACMMM 2019), TRIE (ACMMM 2020), FREE (TIP 2020), SPIN (AAAI 2021), MANGO (AAAI 2021), etc.
DeepFaceLive
Real-time face swap for PC streaming or video calls
gpt-researcher
GPT-based autonomous agent that does comprehensive online research on any given topic
inst-inpaint
A novel inpainting framework that can remove objects from images based on the instructions given as text prompts.
llm_babyCare
Parenting handbook
OMML
Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.
paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Pix2Text
Pix In, LaTeX & Text Out. Recognize Chinese and English text and math formulas from images.
RingRWKV
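Fixes adaptation issues with RWKV in the official Transformers library, so that all RWKV model series, once converted, can be deployed and fine-tuned through the RingRWKV library as simply and conveniently as other transformer models.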
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
sd-webui-EasyPhoto
📷 EasyPhoto | Your Smart AI Photo Generator.
TableGeneration
Generate table images by rendering them in a browser.
Token-Path-Prediction
This is an unofficial re-implementation of the EMNLP 2023 paper: Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction.
Topic-on-Table-Recognition
This is a survey on the topic of table recognition
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
WenxinWorkshop-Python-SDK
A third-party Python SDK for the Wenxin Qianfan (WenxinWorkshop) platform.