There are 0 repositories under the text-image topic.
An editor for ASCII graphics that combines a graphical editor with an image-to-text converter. Decorate your text and surprise your readers with an original social media post or blog post using ASCII graphics. The tool does not require an internet connection and can work offline in a browser.
A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super-resolution, text editing, handwritten text generation, scene text recognition, and scene text detection.
Text image can "textify" text, images, and videos, and can be used with simple configuration
[BMVC 2023] Zero-shot Composed Text-Image Retrieval
This repository features three demos that can be effortlessly integrated into your AWS environment. They serve as a practical guide to leveraging AWS services for crafting a sophisticated Large Language Model (LLM) Generative AI, geared towards creating a responsive Question and Answer Bot and localizing content generation.
Official implementation of the work "Text-Driven Image Editing via Learnable Regions" (CVPR 2024)
A version of the Stable Diffusion model fine-tuned on a self-translated 10k diffusiondb Chinese corpus, with an "extended" variant
Image-and-text editing for WeChat Mini Programs: apply simple style adjustments to the text of a single input box, and insert or delete images within the text
iOS rich text editing: native mixed image-and-text layout, NSAttributedString to HTML and HTML to NSAttributedString conversion, base64 image upload, Rich Text Editor
Line and Word Segmentation for Bangla Handwritten Text Recognition
A Light Neural Network To Control Stable Diffusion Spatial Information tuned by Chinese
This repository is based on the work done for the Bangla Handwritten Line Segmentation
Paging menu controller with text and an image view in each tab
lmmtoolkit is a toolkit for Multi-Modal Learning
Software tool that compresses text binary images (lossless compression) to less than 0.002% of their original size on average.
Replication Code for: Making Text-Image Connection Formal and Practical
Text-Image-Text is a bidirectional system that enables seamless retrieval of images based on text descriptions, and vice versa. It leverages state-of-the-art language and vision models to bridge the gap between textual and visual representations.
11000-Image-Video-caption-data-of-human-action
20011--Image-Caption-Data-Of-OCR-In-Natural-Scenes
To Fuse Semantic and Positional Clues with Cross-Attention for Scene Text Recognition
A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".
Tracks the latest papers on embeddings (including text, text-code, and text-image embeddings)