williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

Is it possible to provide a single frame as input and get its style transferred to a video?

GeorvityLabs opened this issue

Currently, there are a few pre-trained style models available: Pixar, cartoon, etc.

But is it possible for us to upload our own style image (a single image) and transfer its style onto a video?

Is that possible with VToonify? If so, could someone make a Colab notebook where users can input a single stylised frame and transfer its style to a target video?

This would allow fully customisable style transfer like EbSynth.

Thank you for your interest.
I think it is possible to train VToonify with a single style image.
But in that case, after the user provides the single frame, the models need to be fully trained before any results can be produced.

Our work extends StyleGAN-based image toonification models to videos.
So we need a pre-trained StyleGAN-based image toonification model such as Toonify or DualStyleGAN.
These models are usually trained on 100~300 style images.
However, I have also read two papers that train the model with only 1~10 style images: JoJoGAN and FS-Ada.
You could refer to these two papers to train your own StyleGAN with one style image, and then train VToonify on top of it.
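For reference, here is a rough, untested sketch of the JoJoGAN-style one-shot fine-tuning idea in PyTorch. The generator interface (`mapping`/`synthesis`), the inversion encoder, the layer split index, and all hyperparameters are assumptions for illustration; this is not VToonify's or JoJoGAN's actual code.

```python
# Rough sketch of JoJoGAN-style one-shot StyleGAN fine-tuning (illustrative only;
# the generator/encoder interfaces below are placeholders, not a real API).
import torch

def finetune_stylegan_one_shot(G, encoder, lpips, style_img,
                               steps=300, lr=2e-3, mix_layer=7, n_latents=16):
    """Fine-tune generator G so random faces take on the look of `style_img`.

    G         -- pre-trained StyleGAN generator with `mapping`/`synthesis` calls (assumed)
    encoder   -- GAN-inversion encoder (e.g. e4e/pSp) returning W+ codes (assumed)
    lpips     -- perceptual loss module
    style_img -- the single style image, shape (1, 3, H, W), range [-1, 1]
    """
    device = style_img.device
    with torch.no_grad():
        w_style = encoder(style_img)            # invert the style image to W+, (1, L, 512)
        z = torch.randn(n_latents, 512, device=device)
        w_rand = G.mapping(z)                   # random face latents, (n_latents, L, 512)

    # Style mixing: keep the coarse (structure) layers of the random faces,
    # copy the fine (texture/appearance) layers from the style latent.
    w_mix = w_rand.clone()
    w_mix[:, mix_layer:] = w_style[:, mix_layer:]

    G.train()
    opt = torch.optim.Adam(G.parameters(), lr=lr)
    target = style_img.expand(n_latents, -1, -1, -1)
    for _ in range(steps):
        fake = G.synthesis(w_mix)               # generator should reproduce the style image
        loss = lpips(fake, target).mean()       # perceptual similarity to the style image
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G  # serves as the image toonification model that VToonify is then trained on
```

The fine-tuned generator would then play the role that Toonify or DualStyleGAN plays in the normal pipeline, and VToonify training proceeds on top of it as usual.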

If you mean one-shot style transfer at test time without any training, you may look into image animation methods like FOM and DaGAN.
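The basic idea of that training-free route is to treat the single style image as the source and the input video as the driving motion. A minimal sketch of that loop is below; the `animate` call stands in for whichever model you pick (FOM, DaGAN, ...) and is not a real API.

```python
# Minimal sketch of the training-free, image-animation route: animate the single
# style image with the motion of the driving video. `animate` is a placeholder
# for the chosen model's inference call, with an assumed signature.
import cv2

def stylize_video_by_animation(animate, style_img, video_in, video_out, fps=25):
    cap = cv2.VideoCapture(video_in)
    writer = None
    while True:
        ok, frame = cap.read()                           # driving frame from the target video
        if not ok:
            break
        out = animate(source=style_img, driving=frame)   # assumed signature, returns a BGR frame
        if writer is None:
            h, w = out.shape[:2]
            writer = cv2.VideoWriter(video_out, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
        writer.write(out)
    cap.release()
    if writer is not None:
        writer.release()
```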

Thanks a lot for the resources @williamyang1991, will definitely have a look at them.
It would be amazing if there were a notebook for single-image-to-video style transfer (maybe a training notebook and an inference notebook).

Will definitely take a closer look at the links you listed!