Bin Zhu's starred repositories
paper-reading
Paragraph-by-paragraph close readings of classic and new deep learning papers
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
AnimateDiff
Official implementation of AnimateDiff.
PhotoMaker
PhotoMaker [CVPR 2024]
VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
lang-segment-anything
SAM with text prompt
DiffSynth-Studio
Enjoy the magic of Diffusion models!
Multimodal-AND-Large-Language-Models
A paper list on multimodal and large language models, kept for personal use to record papers I read from the daily arXiv listings.
Machine-Mindset
An MBTI Exploration of Large Language Models
Mini-DALLE3
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models
repaint123
Official implementation of Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting (ECCV 2024)
Progressive3D
Official implementation of "Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts" [ICLR 2024]
Envision3D
Envision3D: One Image to 3D with Anchor Views Interpolation
web_gpt-on-wechat
With a ChatGPT account, you can use the WeChat bot for free without paying API fees, and custom prompts make it easy to set the bot's role attributes and positioning.
fid-metrics
A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics.