DeepSVG for text-conditioned vector generation.
nd7141 opened this issue
I wonder if it's possible to adapt DeepSVG to replace the VAE block in Stable Diffusion to generate vector graphics?
I see a couple of problems.
- The latent embedding size in DeepSVG (256) does not match the latent size of SD (64, or more precisely a 4×64×64 tensor for a 512×512 image).
- The diffusers library expects a .bin file instead of a .pth file. There is a script to convert it to the diffusers format, but it seems to use `AutoencoderKL`, which I'm not sure is the right architecture.
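To make the size mismatch in the first point concrete, here is a minimal sketch of the arithmetic. It assumes the Stable Diffusion v1 VAE layout (4 latent channels, 8× spatial downsampling) and takes the DeepSVG latent size of 256 from the point above. The helper name `sd_latent_shape` is hypothetical and not part of either library.

```python
from math import prod

# Latent vector size reported for DeepSVG in this issue.
DEEPSVG_LATENT_DIM = 256


def sd_latent_shape(height, width, channels=4, downsample=8):
    """Shape of the latent an SD-style VAE produces for an image.

    Assumption: 4 latent channels and 8x downsampling, as in the
    Stable Diffusion v1 autoencoder. Hypothetical helper for illustration.
    """
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsample factor")
    return (channels, height // downsample, width // downsample)


shape = sd_latent_shape(512, 512)
n_values = prod(shape)

print("SD latent shape:", shape)
print("SD latent values:", n_values)
print("DeepSVG latent values:", DEEPSVG_LATENT_DIM)
```

So the gap is not only a different number but a different kind of object: the diffusion U-Net operates on a spatial grid, while DeepSVG produces a flat vector. Any adapter between the two would need to bridge that, for example with a learned projection, and the denoiser would likely need retraining on the new latent space.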
I wonder if you know an easy way to adapt DeepSVG for the diffusers library?
That's a great idea, although I wonder whether training a text-based LM on a dataset of SVG source code would be a better way to go about this. I don't know.
Edit: I managed to find a project called VectorFusion, which generates SVGs from text descriptions using a diffusion model. The authors have a paper on arXiv, but unfortunately they have not published their code. The main author has an older GitHub repository that does something similar, but I haven't tried it yet.