LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

MAR for Image-to-Image Generation

Bili-Sakura opened this issue

I am new to MAR; so far I have only explored LDMs for image generation. Can MAR be used for image-to-image generation, especially image editing (e.g. temporal generation)? Also, can a MAR model (with diffusion loss) trained on a large-scale dataset be fine-tuned on a domain-specific dataset for temporal generation?

Again, awesome work!

Thanks for your interest! Similar to MaskGIT and MAGE, MAR can naturally perform mask-based image editing tasks such as image inpainting, outpainting, and uncropping. If the temporal generation you mentioned refers to video, then it would be hard to fine-tune a model trained to generate images so that it generates video instead.
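To make the mask-based editing idea concrete, here is a minimal sketch of how a pixel-space edit region could be turned into a token-level mask, so that only the tokens covering that region are regenerated while the rest are kept. The helper `region_to_token_mask`, the 256-pixel image size, and the 16-pixel token stride are illustrative assumptions, not code or settings taken from this repo.

```python
def region_to_token_mask(img_size, patch_size, box):
    """Return a token grid (list of rows); True marks a token to regenerate.

    Hypothetical helper for illustration only.
    box = (top, left, bottom, right) in pixels, with bottom/right exclusive.
    """
    if img_size % patch_size != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    top, left, bottom, right = box
    grid = img_size // patch_size
    mask = []
    for row in range(grid):
        y0, y1 = row * patch_size, (row + 1) * patch_size
        mask_row = []
        for col in range(grid):
            x0, x1 = col * patch_size, (col + 1) * patch_size
            # A token is masked if its pixel footprint overlaps the edit box.
            overlaps = y0 < bottom and y1 > top and x0 < right and x1 > left
            mask_row.append(overlaps)
        mask.append(mask_row)
    return mask


# Assumed example: 256x256 image, 16-pixel stride, edit box 64 px tall and 96 px wide.
mask = region_to_token_mask(img_size=256, patch_size=16, box=(64, 64, 128, 160))
num_masked = sum(sum(row) for row in mask)
print(num_masked, "of", len(mask) * len(mask[0]), "tokens would be regenerated")
```

For inpainting the box sits inside the image; for outpainting or uncropping the same kind of mask would instead cover the border tokens that have no source pixels.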

Thank you!
I am working on satellite image generation, where each image captures the same region at a different timestamp (this is what I meant by "temporal", and I am not sure whether this task can be categorized as inpainting).
It has been shown that temporal layers used in LDMs for video synthesis can be incorporated into ControlNet so that LDMs are able to generate temporal images. I wonder if there are similar techniques for MAR.

The MAR framework in this repo is designed to generate images and cannot be directly used to generate videos. However, it should be easy to adapt the code a bit for the task you described, similar to LDM for video synthesis.
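One possible way to frame the paired-timestamp task in a masked-generation setting, offered only as a sketch and not as something this repo implements, is to concatenate the tokens of the earlier image (fully observed) with the tokens of the later image (to be generated) and treat the result as one masked sequence. The helper name `paired_sequence_mask` and its parameters are hypothetical, and the model itself would still need changes (for example, positional embeddings for the longer sequence) before such a mask could be used.

```python
def paired_sequence_mask(tokens_per_image, known_target_positions=()):
    """Mask over [reference tokens | target tokens]; True = to be generated.

    Hypothetical helper for illustration only.
    known_target_positions: indices (within the target image) that are
    already observed and should be kept rather than generated.
    """
    known = set(known_target_positions)
    # Earlier timestamp: every token is observed, so nothing is masked.
    reference_part = [False] * tokens_per_image
    # Later timestamp: generate everything except explicitly known tokens.
    target_part = [i not in known for i in range(tokens_per_image)]
    return reference_part + target_part


# Tiny example: 4 tokens per image, target token 1 is already known.
seq_mask = paired_sequence_mask(4, known_target_positions=[1])
print(seq_mask)
```

Under this framing the task resembles inpainting over a doubled sequence, where the reference half plays the role of the unmasked context.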