haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Home Page: https://llava.hliu.cc


[Question] Convert saved LLaVA checkpoint to SGLang

lukashelff opened this issue · comments

Question

Hey everyone, has anyone figured out how to transfer trained LLaVA models to SGLang? I have already made some changes but wasn't able to get it working completely. By now I can successfully launch a server; however, I receive `CUDA error: device-side assert triggered`.
Until now I have made the following changes:

  1. Update "model_type" in the config.json file from "llava_llama" to "llava".
  2. Update the folder name to contain "llava", otherwise the is_multimodal flag is set incorrectly and the server does not launch.
  3. The preprocessor_config.json is not saved by the LLaVA training script (I used the one found in the HF repo llava-hf/llava-1.5-13b-hf).
  4. I added the dictionary entry for the image token to "added_tokens_decoder" in tokenizer_config.json, as it was missing.
  5. The processor_class was also missing from tokenizer_config.json; I added "processor_class": "LlavaProcessor".
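
The edits in steps 1, 4, and 5 can be scripted. Below is a minimal sketch; the function name is mine, and the image-token id 32000 is an assumption (use whatever id your checkpoint's tokenizer actually assigned to `<image>`):

```python
import json
from pathlib import Path

def patch_checkpoint_for_sglang(ckpt_dir: str) -> None:
    """Apply the config.json / tokenizer_config.json edits from the steps above."""
    ckpt = Path(ckpt_dir)

    # Step 1: change model_type from "llava_llama" to "llava".
    cfg_path = ckpt / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["model_type"] = "llava"
    cfg_path.write_text(json.dumps(cfg, indent=2))

    # Steps 4-5: add the <image> token entry and processor_class to
    # tokenizer_config.json. NOTE: id "32000" is an illustrative assumption.
    tok_path = ckpt / "tokenizer_config.json"
    tok = json.loads(tok_path.read_text())
    tok.setdefault("added_tokens_decoder", {})["32000"] = {
        "content": "<image>",
        "lstrip": False,
        "normalized": False,
        "rstrip": False,
        "single_word": False,
        "special": True,
    }
    tok["processor_class"] = "LlavaProcessor"
    tok_path.write_text(json.dumps(tok, indent=2))
```

Step 2 (renaming the checkpoint folder so it contains "llava") still has to be done by hand or with a `Path.rename` call.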

(screenshot: config.json `model_type`, "llava" vs. "llava_llama")

Hey, thanks for the reply. Did you manage to use the model with SGLang? Unfortunately, changing only the model type did not help. I also made the changes above. Having looked more closely into it, I receive multiple out-of-bounds errors, e.g.:

../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [350,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
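
That assert (`srcIndex < srcSelectDimSize`) typically means an embedding lookup received a token id that is at least as large as the number of rows in the embedding table, e.g. a tokenizer that emits an added `<image>` token id while the model's vocab was never resized. A pure-Python sketch of that sanity check (the ids and vocab size here are illustrative assumptions, not values from the thread):

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return (position, id) pairs that would trip the CUDA indexSelect
    assert, i.e. ids outside [0, vocab_size)."""
    return [(pos, tid) for pos, tid in enumerate(token_ids)
            if tid < 0 or tid >= vocab_size]

# Example: an embedding table with 32000 rows, but the prompt contains
# the added image token id 32000 -> out of range.
print(find_out_of_range_ids([1, 319, 32000, 2], 32000))  # → [(2, 32000)]
```

Running the model on CPU usually turns the opaque device-side assert into a plain `IndexError` that names the offending index.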

I was able to load local v1.6 checkpoints as follows:

  1. Don't use the local tokenizer. Use the HF tokenizer provided by sglang.
  2. Update the folder name to contain "llava", otherwise the is_multimodal flag is set incorrectly and the server does not launch.
  3. Update the model's config.json:
  • model_type: "llava_llama" -> "llava"
  • image_aspect_ratio: "pad" -> "anyres"
  • mm_patch_merge_type: "flat" -> "spatial_unpad"
  • _name_or_path: must contain the exact model name for sglang to select the correct chat template, e.g. "llava-v1.6-34b"
    For 1.5 checkpoints, proceed with steps 1 and 2 from above. In the model's config.json, adjust the model_type and remove the entries mm_patch_merge_type and mm_projector_lr.
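
The config edits above can be sketched as a small script. This is not an official converter; the function name and the `v16` flag are mine, and the key names simply follow the list above:

```python
import json
from pathlib import Path

def patch_config(ckpt_dir: str, model_name: str, v16: bool = True) -> None:
    """Apply the config.json edits listed above to a local LLaVA checkpoint."""
    cfg_path = Path(ckpt_dir) / "config.json"
    cfg = json.loads(cfg_path.read_text())

    cfg["model_type"] = "llava"  # "llava_llama" -> "llava"
    # sglang selects the chat template from the model name,
    # so this must match exactly, e.g. "llava-v1.6-34b".
    cfg["_name_or_path"] = model_name

    if v16:
        cfg["image_aspect_ratio"] = "anyres"          # "pad" -> "anyres"
        cfg["mm_patch_merge_type"] = "spatial_unpad"  # "flat" -> "spatial_unpad"
    else:
        # 1.5 checkpoints: drop the entries the thread says to remove.
        cfg.pop("mm_patch_merge_type", None)
        cfg.pop("mm_projector_lr", None)

    cfg_path.write_text(json.dumps(cfg, indent=2))
```

Steps 1 and 2 (tokenizer choice and folder naming) are unchanged and still apply before running this.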

Thanks, I'll try it.