CLIP is all you need (c) crumb

Greatly inspired by this tweet and all the CLIP-guided approaches.

What if we directly optimize the raw image tensor using CLIP, instead of tuning a generator network or its inputs? Just like style transfer algorithms were doing 5 years ago :D
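
To make the idea concrete, here is a minimal sketch of optimizing pixels directly. It is not the notebook's actual code: `encoder` is a tiny frozen random network standing in for CLIP's image encoder, and `target` is a random vector standing in for a CLIP text embedding, both assumptions made so the snippet runs without downloading any weights. The image size, learning rate, and step count are arbitrary too.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# ASSUMPTION: a small frozen random network stands in for CLIP's image encoder.
encoder = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(4),
    torch.nn.Flatten(),
    torch.nn.Linear(8 * 4 * 4, 16),
)
for p in encoder.parameters():
    p.requires_grad_(False)  # the "model" is never trained

# ASSUMPTION: a random unit vector stands in for the CLIP text embedding of a prompt.
target = F.normalize(torch.randn(1, 16), dim=-1)

# The raw image tensor is the only thing being optimized.
image = torch.rand(1, 3, 32, 32, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.02)


def similarity(img):
    """Cosine similarity between the image embedding and the target embedding."""
    emb = F.normalize(encoder(img), dim=-1)
    return (emb * target).sum()


initial_similarity = similarity(image).item()

for step in range(200):
    optimizer.zero_grad()
    loss = -similarity(image)  # maximize similarity by minimizing its negative
    loss.backward()            # gradients flow all the way back to the pixels
    optimizer.step()
    with torch.no_grad():
        image.clamp_(0.0, 1.0)  # keep pixel values in a valid range

final_similarity = similarity(image).item()
print(f"similarity: {initial_similarity:.3f} -> {final_similarity:.3f}")
```

With a real CLIP model, `encoder` would be the image encoder and `target` the encoded text prompt. The loop itself stays the same: no generator, just gradient steps on the pixels.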

This is a place for all my hacks on this topic. Feel free to open this notebook in Colab and mess around :D