ProGamerGov / Neural-Tile

A better tiling script for Neural-Style

style_scale scale preservation

moofin2017 opened this issue · comments

To preserve the initial image style at a higher resolution, assuming an initial style_scale=1, a 2x2 tiling run (with the single-scale pass done at the same resolution, giving a roughly 2x higher-resolution output) would need the Neural-Tile style_scale set to 2. That would require roughly 2x2 = 4 times as much GPU RAM, which defeats the purpose of using tiling to increase the resolution. How else could we use Neural-Tile to increase the resolution while preserving the initial image style?
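To make the arithmetic concrete: activation memory for the style image's forward pass grows roughly with its pixel count, so doubling style_scale roughly quadruples it. The snippet below is purely illustrative, with a made-up function name; it is not part of Neural-Tile or Neural-Style.

```python
# Illustrative sketch only; not part of Neural-Tile or Neural-Style.
# Assumption: memory for the style forward pass scales with the style
# image's pixel count, i.e. with style_scale squared.

def relative_style_pass_memory(style_scale, base_scale=1.0):
    """Approximate memory of the style forward pass relative to base_scale."""
    return (style_scale / base_scale) ** 2

for scale in (1.0, 1.5, 2.0):
    print(f"style_scale={scale}: ~{relative_style_pass_memory(scale):.2f}x memory")
```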

@moofin2017 At the moment there isn't really a way to use the style scale trick to create larger images, so you have to make do with adding in more detail.

You can try using only a few hundred iterations, and experimenting with parameters that don't produce a lot of visible change. I've had some good success using Neural-Tile with this strategy and the NIN model.

On the subject of -style_scale: according to the creator of Neural-Style, the way the -style_scale parameter works could be modified to save resources, which would make the -style_scale manipulation technique more viable.

Yes, this is expected. The -style_scale flag sets the size of the style image, and in order to compute the Gram matrix style targets we need to run the (now high-resolution) style image forward through the network.
This means that (roughly speaking) -style_scale X -image_size Y will require about the same amount of memory as -style_scale 1.0 -image_size X * Y.
As an interesting side note, computing the Gram style targets only requires a forward pass through the network; in principle this could require very low memory usage if you overwrite the activations of each layer after they are no longer needed. Unfortunately this sort of advanced memory management is not easy to do in Torch; however other frameworks like PyTorch would make this easy.

Source: https://www.reddit.com/r/deepdream/comments/5tcp7z/has_anyone_else_experienced_high_memory/ddplopz/
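For what it's worth, here is a minimal PyTorch sketch of the low-memory idea in that quote: the Gram-matrix style targets are computed in a single forward pass under torch.no_grad(), and only the small C x C Gram matrices are kept, so the full high-resolution feature maps never all have to be held at once. The VGG-16 network, layer indices, and image size are illustrative assumptions, not what Neural-Style actually uses.

```python
# A minimal sketch, assuming torchvision's VGG-16 as a stand-in for the
# VGG network Neural-Style uses; layer indices and sizes are illustrative.
import torch
import torchvision.models as models

def gram_matrix(feat):
    # feat: (1, C, H, W) -> (C, C), normalized by the number of elements
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def style_gram_targets(style_img, layer_ids=(3, 8, 15, 22)):
    """Collect Gram matrices at a few VGG-16 feature indices in one forward pass."""
    vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
    grams = {}
    x = style_img
    with torch.no_grad():                # no autograd graph is kept, so each
        for i, layer in enumerate(vgg):  # intermediate activation is freed as
            x = layer(x)                 # soon as x is rebound to the next one
            if i in layer_ids:
                grams[i] = gram_matrix(x)  # keep only the small C x C matrix
            if i >= max(layer_ids):
                break
    return grams

# Even a large style image only costs a single forward pass:
targets = style_gram_targets(torch.rand(1, 3, 1024, 1024))
print({i: tuple(g.shape) for i, g in targets.items()})
```

Peak memory here is bounded by the largest single activation rather than by the sum of all of them, which is the saving the quote is pointing at.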

Thank you for your prompt response. I will give NIN a spin. Was style_transfer (https://github.com/crowsonkb) useful? It seems to use less memory and can create images of arbitrary size with better tiled color matching, but the resulting styles are not as good as, or the same as, what neural-style produces.

@moofin2017 For a while, I found that with style_transfer, you could simply convert parameters from the Neural-Style format to the style_transfer format: crowsonkb/style_transfer#8

Though that might have changed since then. This probably answers your issue here.

One of the issues I found with style_transfer was that it became hard to predict how long it would take to create images; I also hadn't experimented enough with controlling the changes caused by the author's tiling solution. Sadly, it seems @crowsonkb got bored of the project and hasn't responded to any issues in a long time. It would be nice to experiment with interesting new ideas like DeepDream channel manipulation (https://github.com/ProGamerGov/Protobuf-Dreamer), seeing as style_transfer supports everything from obscure ResNet models to GoogleNet and VGG models.