kamalkraj / stable-diffusion-tritonserver

Deploy the Stable Diffusion model with ONNX/TensorRT + Triton Server


failed to load 'stable_diffusion' version 1:

whatsondoc opened this issue · comments

Hello,

I'm following the instructions to deploy this project, and Triton is unable to load the stable_diffusion model.

This is seen in the Triton Server logs printed to stdout:

1028 08:21:03.012132 581 pb_stub.cc:309] Failed to initialize Python stub: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'

At:
  /models/stable_diffusion/1/model.py(58): initialize

I1028 08:21:03.465850 1 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: encoder (GPU device 1)
E1028 08:21:03.470367 1 model_lifecycle.cc:596] failed to load 'stable_diffusion' version 1: Internal: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'

At:
  /models/stable_diffusion/1/model.py(58): initialize

The specific function referenced is in model.py (line 58 is marked with an arrow below):

    def initialize(self, args: Dict[str, str]) -> None:
        """
        Initialize the tokenization process
        :param args: arguments from Triton config file
        """
        current_name: str = str(Path(args["model_repository"]).parent.absolute())
        self.device = "cpu" if args["model_instance_kind"] == "CPU" else "cuda"
        self.tokenizer = CLIPTokenizer.from_pretrained(current_name + "/stable_diffusion/1/")
        self.scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
        self.scheduler = self.scheduler.set_format("pt")   # <-- line 58
        self.height = 512
        self.width = 512
        self.num_inference_steps = 50
        self.guidance_scale = 7.5
        self.eta = 0.0

I tried commenting this line out, so self.scheduler is only defined on the previous line; Triton Server then starts, and all models (including stable_diffusion) load successfully and are reported by Triton as online and ready.

Leaving that change in place, an error is raised (somewhat expectedly) when subsequently working through the Jupyter notebook:

InferenceServerException: Failed to process the request(s) for model instance 'stable_diffusion', message: Stub process is not healthy.

So I'm forced back to the original issue: have you seen this before, or do you have any idea of a fix?

Same issue on my side. Testing on a V100 using pip install --upgrade diffusers (0.7.2)
"failed to load 'stable_diffusion' version 1: Internal: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'"
Which version of Diffusers library are you using for this demo?

EDIT: with a previous version of diffusers (0.3.0), it loads the model without error.
But then there is still the other error:
E1107 15:43:31.899870 116 python_be.cc:1818] Stub process is unhealthy and it will be restarted.

What do you think is causing this?
Thanks for your help.
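For reference, the temporary workaround described above amounts to pinning the older diffusers release inside the container (version numbers as reported in this thread):

```shell
# diffusers 0.3.x still has Scheduler.set_format(); 0.7.x does not
pip install diffusers==0.3.0
```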

Unfortunately, it is still not working when launching an inference.
The server is, however, fixed with this new diffusers version.
E1108 12:53:16.125181 95 python_be.cc:1818] Stub process is unhealthy and it will be restarted.
ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.
What hardware are you testing on?
Thanks for the support.
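The second message in that log ("ftfy or spacy is not installed") is a separate tokenizer warning, not the stub crash; installing ftfy in the container should silence it. This is a side note, not a documented requirement of this repo:

```shell
# silences the CLIP tokenizer's "using BERT BasicTokenizer" fallback warning
pip install ftfy
```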

I'm seeing the same, I'm afraid.

The Triton container build & server launch run smoothly; however, when whizzing through the Inference notebook I get to stage #7, which produces the InferenceServerException: Failed to process the request(s) for model stable_diffusion, message: stub process is not healthy (which also appears in the Triton Server log).

For reference, I'm trying this on a DGX-2 with 16 x V100-SXM3-32GB.

Hardware tested 1080Ti

@whatsondoc @lolagiscard
Could you please share screenshot/logs ?

Server part (before inference):
[screenshot]

Inference part:
[screenshot]

Server after inference:
[screenshot]

Sure, here are a few screenshots (let me know if you'd like the full logs; it would take a bit to get them out of the environment, but it's doable).

The TritonServer logs were screenshotted after the Inference call was made.

[screenshot]
[screenshot]

Try running Docker with the command below:

docker run -it --rm --gpus device=0 -p8000:8000 -p8001:8001 -p8002:8002 --shm-size 16384m   \
-v $PWD/stable-diffusion-v1-4-onnx/models:/models tritonserver \
tritonserver --model-repository /models/

I'm already testing on one GPU only, if this is the change you want us to try.
There might be something wrong in the NVIDIA Triton Docker image itself; it might not work with some GPU architectures.

Okay.
I will test it on v100 and let you know.

@lolagiscard are you running on a V100?

Great, thanks, let us know.
Yes, I'm also on a V100; sorry, I should have said that above :)

I was able to reproduce the issue with torch version 1.13

I have pinned torch 1.12.1 in docker and fixed the issue. 666e148
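For anyone building the image manually, the fix amounts to pinning torch in the image's pip install step. This is a sketch of that step only; the actual Dockerfile change is in commit 666e148:

```shell
# torch 1.13 breaks the Triton Python-backend stub; pin 1.12.1 as in the fix
pip install torch==1.12.1
```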

All good now, indeed!
Thanks a lot :)

Nice - thanks kamalkraj! Works like a charm.

One observation: I needed to reduce this to a single GPU; when using more than one (originally I tried with 4), I get the following (screenshot attached).

As mentioned though, with a single GPU it works great, appreciate the support.

[screenshot]

I will check the multi-gpu issue.

Please checkout v2

let me know of any issues

I also encountered the same issue when using multiple GPUs with the v3 branch checked out; hope this screenshot helps in any way.
[screenshot]

@whozwhat
Multi-GPU not yet fixed
Running Docker with the command below should fix the issue for now:

docker run -it --rm --gpus device=0 -p8000:8000 -p8001:8001 -p8002:8002 --shm-size 16384m   \
-v $PWD/models:/models sd_trt bash

Thanks for the reply, this command works.
It would be awesome if the multi-GPU issue could be fixed.