AIGText / Glyph-ByT5

[ECCV2024] This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering".

Home Page: https://glyph-byt5.github.io/

🚀🚀🚀 🔥🔥🔥 Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

This is the official implementation of Glyph-ByT5 and Glyph-ByT5-v2, introduced in "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering".

News

⛽⛽⛽ Contact: yuhui.yuan@microsoft.com

2024.06.28: We have removed, for now, the weights and code that may have used potentially unauthorized datasets. We will update the checkpoints after the Microsoft RAI process.

🔆 Highlights

  • We identify two crucial requirements of text encoders for achieving accurate visual text rendering: character awareness and alignment with glyphs. To this end, we propose a customized text encoder, Glyph-ByT5, by fine-tuning the character-aware ByT5 encoder using a meticulously curated paired glyph-text dataset.

  • We present an effective method for integrating Glyph-ByT5 with SDXL, resulting in the Glyph-SDXL model for design image generation. This improves text rendering accuracy from less than 20% to nearly 90% on our design image benchmark. Notably, Glyph-SDXL can also render text paragraphs, achieving high spelling accuracy for tens to hundreds of characters with automated multi-line layouts.

  • We deliver a powerful customized multilingual text encoder, Glyph-ByT5-v2, and a strong aesthetic graphic generation model, Glyph-SDXL-v2, that support accurate spelling in $\sim10$ different languages.
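
The "character awareness" mentioned in the first highlight comes from ByT5 operating on raw UTF-8 bytes rather than subword pieces. The snippet below is a minimal, illustrative sketch of that byte-level tokenization idea only. It is not code from this repository, and the helper names `byte_level_encode` and `byte_level_decode` are hypothetical. It assumes the standard ByT5 vocabulary layout, in which three special tokens (pad, eos, unk) occupy ids 0 to 2 and every byte value is shifted by 3.

```python
from typing import List

# Assumption: ByT5 reserves ids 0, 1, 2 for pad, eos, unk,
# so each UTF-8 byte value b is mapped to token id b + 3.
NUM_SPECIAL_TOKENS = 3
EOS_ID = 1


def byte_level_encode(text: str) -> List[int]:
    """Map every UTF-8 byte of `text` to one token id, then append EOS."""
    return [b + NUM_SPECIAL_TOKENS for b in text.encode("utf-8")] + [EOS_ID]


def byte_level_decode(ids: List[int]) -> str:
    """Invert byte_level_encode, skipping special-token ids."""
    payload = bytes(i - NUM_SPECIAL_TOKENS for i in ids if i >= NUM_SPECIAL_TOKENS)
    return payload.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # Each Latin letter becomes exactly one token, so spelling is explicit.
    print(byte_level_encode("Glyph"))  # [74, 111, 124, 115, 107, 1]
    # A CJK character spans three UTF-8 bytes, hence three tokens plus EOS.
    print(len(byte_level_encode("字")))  # 4
```

Because every character is visible to the encoder as its own byte sequence, the model can in principle attend to individual letters, which a subword tokenizer would merge into opaque pieces. The same scheme covers any script that UTF-8 can represent, which is consistent with the multilingual extension described in the last highlight.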

[Example images: paragraph examples 1-4, design examples 1-4, scene examples 1-4, and multilingual examples]

🔧 Usage

For a detailed guide on Glyph-SDXL and Glyph-SDXL-v2 inference, see this folder.

📬 Citation

If you find this code useful in your research, please consider citing:

@misc{liu2024glyphbyt5,
    title={Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering},
    author={Zeyu Liu and Weicong Liang and Zhanhao Liang and Chong Luo and Ji Li and Gao Huang and Yuhui Yuan},
    year={2024},
    eprint={2403.09622},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

and

@misc{liu2024glyphbyt5v2,
    title={Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering}, 
    author={Zeyu Liu and Weicong Liang and Yiming Zhao and Bohan Chen and Ji Li and Yuhui Yuan},
    year={2024},
    eprint={2406.10208},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

License: Apache License 2.0