docker-diffusers-api ("banana-sd-base")

Diffusers / Stable Diffusion in docker with a REST API, supporting various models, pipelines & schedulers. Used by kiri.art, perfect for local, server & serverless.

Copyright (c) Gadi Cohen, 2022. MIT Licensed. Please give credit and link back to this repo if you use it in a public project.

Features

  • Models: stable-diffusion, waifu-diffusion, and easy to add others (e.g. jp-sd)

  • Pipelines: txt2img, img2img and inpainting in a single container

    (all diffusers official and community pipelines are wrapped, but untested)

  • All model inputs supported, including setting nsfw filter per request

  • Permute base config to multiple forks based on yaml config with vars

  • Optionally send signed event logs / performance data to a REST endpoint

  • Can automatically download a checkpoint file and convert to diffusers.

  • S3 support, dreambooth training.

Note: This image was created for kiri.art. Everything is open source but there may be certain request / response assumptions. If anything is unclear, please open an issue.

Important Notices

Official help is available in our dedicated forum: https://forums.kiri.art/c/docker-diffusers-api/16.

This README refers to the in-development dev branch and may reference features and fixes not yet in the published releases.

v1 has not been officially released yet, but has been running well in production on kiri.art for almost a month. We'd be grateful for any feedback from early adopters to help make this official. For more details, see Upgrading from v0 to v1. Previous releases are available on the dev-v0-final and main-v0-final branches.

Currently only NVIDIA / CUDA devices are supported. Tracking Apple / M1 support in issue #20.

Installation & Setup:

Setup varies depending on your use case.

  1. To run locally or on a server, with runtime downloads:

    docker run --gpus all -p 8000:8000 -e HF_AUTH_TOKEN=$HF_AUTH_TOKEN gadicc/diffusers-api

    See the guides for various cloud providers.

  2. To run serverless, include the model at build time:

    1. docker-diffusers-api-build-download (banana, others)
    2. docker-diffusers-api-runpod, see the guide
  3. Building from source.

    1. Fork / clone this repo.
    2. docker build -t gadicc/diffusers-api .
    3. See CONTRIBUTING.md for more helpful hints.

Other configurations are possible, but these are the most common cases.

Everything is set via docker build-args or environment variables.
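
For example (a sketch: the image tag my-diffusers-api is illustrative; MODEL_ID is the build-arg described under "Adding other Models" below):

# Bake a model in at build time with a build-arg:
$ docker build -t my-diffusers-api --build-arg MODEL_ID="runwayml/stable-diffusion-v1-5" .

# ...and/or configure behaviour at runtime with environment variables:
$ docker run --gpus all -p 8000:8000 -e HF_AUTH_TOKEN=$HF_AUTH_TOKEN my-diffusers-api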

Usage:

See also Testing below.

The container expects an HTTP POST request with the following JSON body:

{
  "modelInputs": {
    "prompt": "Super dog",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "width": 512,
    "height": 512,
    "seed": 3239022079
  },
  "callInputs": {
    // You can leave these out to use the defaults
    "MODEL_ID": "runwayml/stable-diffusion-v1-5",
    "PIPELINE": "StableDiffusionPipeline",
    "SCHEDULER": "LMSDiscreteScheduler",
    "safety_checker": true
  }
}
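
To send that request to a locally running container, a minimal sketch using curl (request.json is an illustrative filename; drop the // comment line first, since it isn't valid JSON):

$ curl -X POST http://localhost:8000/ \
    -H "Content-Type: application/json" \
    -d @request.json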

Schedulers: docker-diffusers-api is simply a wrapper around diffusers; literally any scheduler included in diffusers will work out of the box, provided it can be loaded with its default config and without requiring any other explicit arguments at init time. In any event, the following schedulers are the most common and best tested: DPMSolverMultistepScheduler (fast! only needs 20 steps!), LMSDiscreteScheduler, DDIMScheduler, PNDMScheduler, EulerAncestralDiscreteScheduler, EulerDiscreteScheduler.
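
For example, switching to the fast scheduler above is just a matter of changing the SCHEDULER call input (a sketch; prompt and step count are illustrative):

$ curl -X POST http://localhost:8000/ -H "Content-Type: application/json" -d '{
    "modelInputs": { "prompt": "Super dog", "num_inference_steps": 20 },
    "callInputs": { "SCHEDULER": "DPMSolverMultistepScheduler" }
  }'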

Pipelines: txt2img, img2img and inpainting all run in a single container, selected per request via the PIPELINE call input (e.g. "StableDiffusionPipeline"). All diffusers official and community pipelines are wrapped, but untested.

Examples and testing

There are also very basic examples in test.py, which you can view and run with python test.py if the container is already running on port 8000. You can also specify a specific test, change some options, and run against a deployed banana image:

$ python test.py
Usage: python3 test.py [--banana] [--xmfe=1/0] [--scheduler=SomeScheduler] [all / test1] [test2] [etc]

# Run against http://localhost:8000/ (Nvidia Quadro RTX 5000)
$ python test.py txt2img
Running test: txt2img
Request took 5.9s (init: 3.2s, inference: 5.9s)
Saved /home/dragon/www/banana/banana-sd-base/tests/output/txt2img.png

# Run against deployed banana image (Nvidia A100)
$ export BANANA_API_KEY=XXX
$ BANANA_MODEL_KEY=XXX python3 test.py --banana txt2img
Running test: txt2img
Request took 19.4s (init: 2.5s, inference: 3.5s)
Saved /home/dragon/www/banana/banana-sd-base/tests/output/txt2img.png

# Note that 2nd runs are much faster (ignore init, that isn't run again)
Request took 3.0s (init: 2.4s, inference: 2.1s)

The best example, of course, is https://kiri.art/ and its source code.

Help is available on the official forums.

Adding other Models

You have two options.

  1. For a diffusers model, set the MODEL_ID build-var / call-arg to the name of the model hosted on HuggingFace, and it will be downloaded automatically at build time.

  2. For a non-diffusers model, set the CHECKPOINT_URL build-var / call-arg to the URL of a .ckpt file, which will be downloaded and converted to the diffusers format automatically at build time. CHECKPOINT_CONFIG_URL can also be set. See the sketch after this list.
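
A sketch of option 2 (the checkpoint URL and image tag are placeholders):

$ docker build -t my-model-api \
    --build-arg CHECKPOINT_URL="https://example.com/path/to/model.ckpt" .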

Troubleshooting

  • 403 Client Error: Forbidden for url

    Make sure you've accepted the license on the model card of the HuggingFace model specified in MODEL_ID, and that you correctly passed HF_AUTH_TOKEN to the container (a quick token check follows below).
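
    A quick way to verify the token outside the container (a sketch, assuming HuggingFace's whoami-v2 identity endpoint):

    # Should print your account info, not a 401 error:
    $ curl -H "Authorization: Bearer $HF_AUTH_TOKEN" https://huggingface.co/api/whoami-v2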

Event logs / performance data

Set the CALL_URL and SIGN_KEY environment variables to send timing data on init and inference start and end. You'll need to check the source code here and in sd-mui, as the format is in flux.

This info is now logged regardless, and init() and inference() times are sent back via { $timings: { init: timeInMs, inference: timeInMs } }.
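
A sketch of enabling this at runtime (the endpoint URL and key are placeholders):

$ docker run --gpus all -p 8000:8000 \
    -e HF_AUTH_TOKEN=$HF_AUTH_TOKEN \
    -e CALL_URL="https://example.com/diffusers-events" \
    -e SIGN_KEY="my-secret-key" \
    gadicc/diffusers-api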

Acknowledgements

Originally based on https://github.com/bananaml/serverless-template-stable-diffusion.
