prompt-eng

notes for prompt engineering

Motivational Use Cases

images
- https://mpost.io/best-100-stable-diffusion-prompts-the-most-beautiful-ai-text-to-image-prompts/
video
- img2img of famous movie scenes (lalaland)
- virtual fashion (karenxcheng)
- evolution of scenes (xander)
- outpainting https://twitter.com/orbamsterdam/status/1568200010747068417?s=21&t=rliacnWOIjJMiS37s8qCCw
- webUI img2img collaboration https://twitter.com/_akhaliq/status/1563582621757898752
- image to video with rotation https://twitter.com/TomLikesRobots/status/1571096804539912192
- "prompt paint" https://twitter.com/1littlecoder/status/1572573152974372864
- music videos (video, colab)
- direct text2video project
text-to-3d https://twitter.com/_akhaliq/status/1575541930905243652
- https://dreamfusion3d.github.io/
gpt3 applications
- text to graphviz https://twitter.com/goodside/status/1561549768987496449?s=21&t=rliacnWOIjJMiS37s8qCCw
- suspending to Python for math
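A minimal sketch of the "suspend to Python" idea: instead of trusting the model's own arithmetic, the host program evaluates an expression the model emitted. The linked examples may work differently; the function name `eval_arithmetic` and the assumption that the model returns a bare arithmetic expression are illustrative only.

```python
import ast
import operator

# Whitelisted arithmetic operators; every other node type is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def eval_arithmetic(expression):
    """Evaluate a plain arithmetic expression without calling eval().

    Hypothetical helper: assumes the language model was prompted to answer
    with an expression such as "12 * (3 + 4)" rather than a final number.
    """

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            value = node.value
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression element: " + type(node).__name__)

    return _eval(ast.parse(expression, mode="eval"))
```

Walking the syntax tree with a whitelist means model output that contains function calls or names is refused rather than executed.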

Tooling

Prompt Generator: https://huggingface.co/succinctly/text2image-prompt-generator
- This is a GPT-2 model fine-tuned on the succinctly/midjourney-prompts dataset, which contains 250k text prompts that users issued to the Midjourney text-to-image service over a one-month period. This prompt generator can be used to auto-complete prompts for any text-to-image model (including the DALL·E family).
Prompt Parrot https://colab.research.google.com/drive/1GtyVgVCwnDfRvfsHbeU0AlG-SgQn1p8e?usp=sharing
- This notebook is designed to train a language model on a list of your prompts, generate prompts in your style, and synthesize wonderful surreal images! ✨
https://twitter.com/stuhlmueller/status/1575187860063285248
- The Interactive Composition Explorer (ICE), a Python library for writing and debugging compositional language model programs https://github.com/oughtinc/ice
- The Factored Cognition Primer, a tutorial that shows, using examples, how to write such programs https://primer.ought.org
Prompt Explorer
- https://twitter.com/fabianstelzer/status/1575088140234428416
- https://docs.google.com/spreadsheets/d/1oi0fwTNuJu5EYM2DIndyk0KeAY8tL6-Qd1BozFb9Zls/edit#gid=1567267935
Prompt generator https://www.aiprompt.io/
GUIs
Deforum Diffusion https://colab.research.google.com/github/deforum/stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb
Disco Diffusion https://news.ycombinator.com/item?id=32660138 "A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations."
Waifu Diffusion "A model trained on danbooru (anime/manga drawing site with also lewds and nsfw on it) over 56k images. Produces FAR BETTER results if you're interested in getting manga and anime stuff out of stable diffusion."
EbSynth and DAIN for coherence
FILM: Frame Interpolation for Large Motion (github)
Depth Mapping
- examples: https://twitter.com/TomLikesRobots/status/1566152352117161990
Art program plugins
Papers
- 2015: Deep Unsupervised Learning using Nonequilibrium Thermodynamics (the founding paper of diffusion models)
- Textual Inversion: https://arxiv.org/abs/2208.01618
- 2017: Attention is all you need
- https://dreambooth.github.io/
  - productized as dreambooth https://twitter.com/psuraj28/status/1575123562435956740
  - https://github.com/JoePenna/Dreambooth-Stable-Diffusion
- very good BLOOM model overview

Live updated list: https://www.reddit.com/r/StableDiffusion/comments/wqaizj/list_of_stable_diffusion_systems/

Communities

StableDiffusion Discord https://discord.com/invite/stablediffusion
Deforum Discord https://discord.gg/upmXXsrwZc
Lexica Discord https://discord.com/invite/bMHBjJ9wRh
Midjourney
https://promptbase.com/ Selling prompts that produce desirable results
Prompt Galleries
- https://arthub.ai/
- 🌟 https://lexica.art/
- https://pagebrain.ai/promptsearch/
- https://avyn.com/
- https://dallery.gallery/
- The Ai Art: gallery for modifiers.
- urania.ai: Top 500 Artists gallery, sorted by image count. With modifiers/styles.
- Generrated: DALL•E 2 table gallery sorted by visual arts media.
- Artist Studies by @remi_durant: gallery and Search.
- CLIP Ranked Artists: gallery sorted by weight/strength.
- https://publicprompts.art/ very basic/limited but some good prompts

Stable Diffusion

stable diffusion specific notes

Main: https://github.com/CompVis/stable-diffusion

Required reading:

param intuition https://www.reddit.com/r/StableDiffusion/comments/x41n87/how_to_get_images_that_dont_suck_a/
CLI commands https://www.assemblyai.com/blog/how-to-run-stable-diffusion-locally-to-generate-images/#script-options
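As a rough illustration of the script options covered in the CLI link above, the sketch below assembles a command line for `scripts/txt2img.py` from the CompVis repo. The helper name `build_txt2img_command` and the default values are assumptions for illustration; check the script's `--help` output for the options your checkout actually supports.

```python
import shlex


def build_txt2img_command(prompt, steps=50, width=512, height=512,
                          scale=7.5, seed=42, n_samples=1):
    """Return an argv list for the CompVis txt2img script (hypothetical helper)."""
    return [
        "python", "scripts/txt2img.py",
        "--prompt", prompt,
        "--plms",                      # use the PLMS sampler
        "--ddim_steps", str(steps),    # number of sampling steps
        "--W", str(width),
        "--H", str(height),
        "--scale", str(scale),         # classifier-free guidance scale (CFG)
        "--seed", str(seed),           # fixed seed for reproducible output
        "--n_samples", str(n_samples),
    ]


def as_shell_line(argv):
    """Quote the argv list so it can be pasted into a shell."""
    return shlex.join(argv)
```

Calling `as_shell_line(build_txt2img_command("a castle at dusk"))` gives a single quoted line to run from the repo root; pass the list form directly to `subprocess.run` to avoid shell quoting altogether.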

Distros

Bundled Distros
Web Distros
- https://www.mage.space/
- https://dreamlike.art/ has img2img
- https://inpainter.vercel.app/paint for inpainting
- https://promptart.labml.ai/feed
- https://www.strmr.com/ dreambooth tuning for $3
Twitter Bots
- https://twitter.com/diffusionbot
- https://twitter.com/m1guelpf/status/1569487042345861121
Windows "retard guides"
- https://rentry.org/voldy
- https://rentry.org/GUItard

more: https://np.reddit.com/r/StableDiffusion/comments/xcrm4d/useful_prompt_engineering_tools_and_resources/

Prompt galleries and search engines:

Lexica: Content-based search powered by OpenAI's CLIP model. Seed, CFG, Dimensions.
OpenArt: Content-based search powered by OpenAI's CLIP model. Favorites.
PromptHero: Random wall. Seed, CFG, Dimensions, Steps. Favorites.
Libraire: Seed, CFG, Dimensions, Steps.
Krea: modifier-focused UI. Favorites. Gives prompt suggestions and lets you create prompts for Stable Diffusion, Waifu Diffusion and Disco Diffusion. Really quick and useful.
Avyn: Search engine and generator.
Pinegraph: discover, create and edit with Stable/Disco/Waifu diffusion models.
Phraser: text and image search.

Visual search:

Lexica: enter an image URL in the search bar. Or next to q=. Example
Phraser: image icon at the right.
same.energy
Yandex, Bing, Google, Tineye, iqdb: reverse and similar image search engines.
Pinterest
dessant/search-by-image: Open-source browser extension for reverse image search.

Prompt generators:

promptoMANIA: Visual modifiers. Great selection. With weight setting.
Phase.art: Visual modifiers. SD Generator and share.
Phraser: Visual modifiers.
AI Text Prompt Generator
Dynamic Prompt generator
succinctly/text2image: GPT-2 Midjourney trained text completion.
Prompt Parrot colab: Train and generate prompts.
cmdr2: 1-click SD installation with image modifiers selection.
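For the succinctly/text2image entry above, a minimal sketch of prompt auto-completion with the Hugging Face `transformers` text-generation pipeline. The helper names `load_generator` and `complete_prompt` and the sampling settings are assumptions, not part of the model card; loading the model needs `transformers` installed and a network connection.

```python
def load_generator(model_name="succinctly/text2image-prompt-generator"):
    """Load the prompt generator as a text-generation pipeline."""
    from transformers import pipeline  # imported lazily; pip install transformers

    return pipeline("text-generation", model=model_name)


def complete_prompt(generator, seed_text, num_variants=3, max_length=60):
    """Ask the generator for a few sampled completions of a partial prompt.

    `generator` is anything that behaves like a text-generation pipeline:
    called with a string, it returns a list of {"generated_text": ...} dicts.
    """
    outputs = generator(
        seed_text,
        max_length=max_length,
        num_return_sequences=num_variants,
        do_sample=True,  # sampling is required to get distinct variants
    )
    return [item["generated_text"].strip() for item in outputs]
```

Typical use would be `complete_prompt(load_generator(), "a beautiful painting of")`, then pasting the completions you like into any text-to-image tool.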

Img2prompt:

img2prompt Replicate by methexis-inc: Optimized for SD (clip ViT-L/14).
CLIP Interrogator by @pharmapsychotic: select ViTL14 CLIP model.
CLIP Artist Evaluator colab
BLIP

Explore Artists, styles, and modifiers:

Artist Style Studies & Modifier Studies by parrot zone: Gallery, Style, Spreadsheet
Clip retrieval: search laion-5b dataset.
Datasette: image search; image-count sort by artist, celebrities, characters, domain
Visual arts: media list, related; Artists list by genre, medium; Portal

Guides and studies:

Prompt Tools directories and guides:

Other SD directories:

SD Major forks

https://www.reddit.com/r/StableDiffusion/comments/wqaizj/list_of_stable_diffusion_systems/

Forks

https://github.com/lkwq007/stablediffusion-infinity Outpainting with Stable Diffusion on an infinite canvas.
https://github.com/basujindal/stable-diffusion This repo is a modified version of the Stable Diffusion repo, optimized to use less VRAM than the original by sacrificing inference speed.
https://github.com/hlky/stable-diffusion (here is another fork that might be better)
- adds a bunch of features - GUI/webui, textual inversion, upscalers, mask and crop, img2img editor, word seeds, prompt weighting
  - doesn't work on Mac (Sygil-Dev/stable-diffusion#173)
- How to Fine-tune Stable Diffusion using Textual Inversion https://towardsdatascience.com/how-to-fine-tune-stable-diffusion-using-textual-inversion-b995d7ecc095
- https://github.com/AbdBarho/stable-diffusion-webui-docker
  - Run Stable Diffusion on your machine with a nice UI without any hassle! This repository provides the WebUI as a docker image for easy setup and deployment. Please note that the WebUI is experimental and evolving quickly, so expect some bugs.
  - doesn't work on M1 Mac yet (AbdBarho/stable-diffusion-webui-docker#31)
https://github.com/invoke-ai/InvokeAI (previously https://github.com/lstein/stable-diffusion) and https://github.com/magnusviri/stable-diffusion
- An interactive command-line interface that accepts the same prompt and switches as the Discord bot.
- A basic Web interface that allows you to run a local web server for generating images in your browser.
- A notebook for running the code on Google Colab.
- Support for img2img in which you provide a seed image to guide the image creation. (inpainting & masking coming soon)
- Upscaling and face fixing using the optional ESRGAN (standalone: https://news.ycombinator.com/item?id=32628761) and GFPGAN packages.
- Weighted subprompts for prompt tuning.
- Textual inversion for customization of the prompt language and images.
- fuller feature list https://www.reddit.com/r/StableDiffusion/comments/xcclmf/comment/io6u03s/?utm_source=reddit&utm_medium=web2x&context=3
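To make the "weighted subprompts" feature above concrete, here is a small sketch that splits a prompt into `(text, weight)` pairs and normalizes the weights. It assumes a `text:weight` syntax such as `tabby cat:0.25 white duck:0.75`; the fork's real parser and the way it blends the resulting pieces may differ, so treat `parse_weighted_subprompts` as a hypothetical helper.

```python
import re

# A subprompt is some text followed by ":<number>"; the number must be
# followed by whitespace or the end of the string.
_SUBPROMPT = re.compile(r"\s*(.+?)\s*:\s*(\d+(?:\.\d+)?)(?=\s|$)")


def parse_weighted_subprompts(prompt):
    """Split "text:weight" segments and normalize weights to sum to 1."""
    parts = []
    last_end = 0
    for match in _SUBPROMPT.finditer(prompt):
        parts.append((match.group(1).strip(), float(match.group(2))))
        last_end = match.end()

    # Any trailing text without an explicit weight gets weight 1.0.
    remainder = prompt[last_end:].strip()
    if remainder:
        parts.append((remainder, 1.0))

    total = sum(weight for _, weight in parts)
    if total == 0:
        raise ValueError("subprompt weights must not sum to zero")
    return [(text, weight / total) for text, weight in parts]
```

A prompt with no weights at all comes back as a single subprompt with weight 1.0, so the same code path handles plain and weighted prompts.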
https://github.com/bfirsh/stable-diffusion
- works on M1 Macs - blog, tweet
- can also look at environment-mac.yaml from https://github.com/fragmede/stable-diffusion/blob/mps_consistent_seed/environment-mac.yaml
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Installation-on-Apple-Silicon
- another m1 mac compatible fork - only 2 samplers, Euler and DPM2, with real-ESRGAN upscaling
- https://colab.research.google.com/drive/1kw3egmSn-KgWsikYvOMjJkVDsPLjEMzl
https://github.com/harubaru/waifu-diffusion
- nicer GUI for img2img
fast-stable-diffusion colabs, +25% speed increase + memory efficient. https://github.com/TheLastBen/fast-stable-diffusion (be careful on gdrive security)

SD Tooling

SD's DreamStudio https://beta.dreamstudio.ai/dream
Stable Worlds: colab for 3d stitched worlds via StableDiffusion https://twitter.com/NaxAlpha/status/1578685845099290624
Midjourney + SD: https://twitter.com/EMostaque/status/1561917541743841280
Nightcafe Studio
misc
- (super super raw, don't try yet) https://github.com/breadthe/sd-buddy

Other languages

SD Model values

SD Results

Img2Img

A black and white photo of a young woman, studio lighting, realistic, Ilford HP5 400
- https://twitter.com/TomLikesRobots/status/1566027217892671488

Hardware requirements

https://news.ycombinator.com/item?id=32642255#32646761
- For something like this, you ideally would want a powerful GPU with 12-24gb VRAM.
- A $500 RTX 3070 with 8GB of VRAM can generate 512x512 images with 50 steps in 7 seconds.
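The quoted RTX 3070 figure works out to roughly 7 sampling steps per second, or a little over 500 images per hour. A short sketch of that arithmetic, assuming the 7 seconds covers only sampling (no model load or file saving) and that time scales linearly with step count:

```python
def throughput(steps_per_image, seconds_per_image):
    """Return (steps per second, images per hour) for a quoted timing."""
    steps_per_second = steps_per_image / seconds_per_image
    images_per_hour = 3600 / seconds_per_image
    return steps_per_second, images_per_hour


def estimated_seconds(steps, steps_per_second):
    """Estimate time for a different step count, assuming linear scaling."""
    return steps / steps_per_second


if __name__ == "__main__":
    sps, iph = throughput(steps_per_image=50, seconds_per_image=7)
    print(f"{sps:.1f} steps/s, about {iph:.0f} images/hour")
    print(f"20 steps would take about {estimated_seconds(20, sps):.1f} s")
```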

SD vs DallE vs MJ

DallE banned so SD https://twitter.com/almost_digital/status/1556216820788609025?s=20&t=GCU5prherJvKebRrv9urdw

Misc

Whisper
- https://huggingface.co/spaces/sensahin/YouWhisper YouWhisper converts Youtube videos to text using openai/whisper.
- https://twitter.com/jeffistyping/status/1573145140205846528 YouTube whisperer
- multilingual subtitles https://twitter.com/1littlecoder/status/1573030143848722433
- video subtitles https://twitter.com/m1guelpf/status/1574929980207034375
- you can join whisper to stable diffusion for reasons https://twitter.com/fffiloni/status/1573733520765247488/photo/1
- known problems https://twitter.com/lunixbochs/status/1574848899897884672 (edge case with catastrophic failures)
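The subtitle use cases above boil down to transcribing audio and writing the timed segments out in a subtitle format. A minimal sketch, assuming the `openai-whisper` package and its `load_model` / `transcribe` calls; the helper names and the choice of the SRT format are illustrative, and the linked projects may do this differently.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn Whisper-style segments (start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_name="base"):
    """Transcribe an audio file and return SRT subtitles."""
    import whisper  # imported lazily; pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_srt(result["segments"])
```

Keeping the formatting separate from the model call makes it easy to check the subtitle output without downloading any weights.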
textually guided audio https://twitter.com/FelixKreuk/status/1575846953333579776
Codegen
- CodegeeX https://twitter.com/thukeg/status/1572218413694726144
- https://github.com/salesforce/CodeGen https://joel.tools/codegen/
pdf to structured data https://www.impira.com/blog/hey-machine-whats-my-invoice-total
text to Human Motion diffusion https://twitter.com/GuyTvt/status/1577947409551851520
- abs: https://arxiv.org/abs/2209.14916
- project page: https://guytevet.github.io/mdm-page/
