JiahaoTian (JiahaoTian-sjtu)

Location: Shanghai

JiahaoTian's starred repositories

flux

Official inference repo for FLUX.1 models

Language: Python | License: Apache-2.0 | Stargazers: 15661 | Issues: 0

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices)

Language: Python | License: Apache-2.0 | Stargazers: 44010 | Issues: 0

GlyphDraw2

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Language: Python | License: MIT | Stargazers: 46 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 47486 | Issues: 0

OCR-SAM

Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize, and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting.

Language: Python | Stargazers: 531 | Issues: 0

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: Apache-2.0 | Stargazers: 2782 | Issues: 0

scepter

SCEPTER is an open-source framework used for training, fine-tuning, and inference with generative models.

Language: Python | License: Apache-2.0 | Stargazers: 419 | Issues: 0

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7043 | Issues: 0

Glyph-ByT5

[ECCV 2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 504 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 134501 | Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 15000 | Issues: 0

Diffusion-Tryon-Trainer

Diffusion-Tryon-Trainer

Language: Python | License: NOASSERTION | Stargazers: 122 | Issues: 0

VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | Stargazers: 4214 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 12510 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 390 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 22137 | Issues: 0

Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Language: Python | License: MIT | Stargazers: 4499 | Issues: 0

AnimateAnyone-reproduction

Reproduction of AnimateAnyone

Language: Python | Stargazers: 166 | Issues: 0

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python | License: Apache-2.0 | Stargazers: 6982 | Issues: 0

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python | License: NOASSERTION | Stargazers: 698 | Issues: 0

Open-AnimateAnyone

Unofficial Implementation of Animate Anyone

Language: Python | Stargazers: 2932 | Issues: 0

VisorGPT

[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT

Language: Python | License: MIT | Stargazers: 131 | Issues: 0

LLM-groundedDiffusion

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD, TMLR 2024)

Language: Python | Stargazers: 431 | Issues: 0

SLD

🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models" (SLD)

Language: Python | License: MIT | Stargazers: 154 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 1107 | Issues: 0

DS-Fusion

Code for project DS-Fusion

Language: Python | License: MIT | Stargazers: 146 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 20109 | Issues: 0

MIGC

[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)

Language: Python | License: NOASSERTION | Stargazers: 535 | Issues: 0

InstanceDiffusion

[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"

Language: Python | License: Apache-2.0 | Stargazers: 502 | Issues: 0

GLIGEN

Open-Set Grounded Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 2006 | Issues: 0