haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Home Page: https://llava.hliu.cc


[Question] Convert saved LLaVA checkpoint to SGLang

lukashelff opened this issue · comments

Question

Hey everyone, has anyone figured out how to transfer trained LLaVA models to SGLang? I have already made some changes but wasn't able to get it working completely. By now I can successfully launch a server; however, I receive `CUDA error: device-side assert triggered`.
Until now I have made the following changes:

  1. Update "model_type" in the config.json file from "llava_llama" to "llava".
  2. Update the folder name to contain "llava", otherwise the is_multimodal flag is set incorrectly and the server does not launch.
  3. The preprocessor_config.json is not saved by the LLaVA training script (I used the one found in the HF repo llava-hf/llava-1.5-13b-hf).
  4. I added the dictionary entry for the image token to "added_tokens_decoder" in tokenizer_config.json, as it was missing.
  5. The processor_class was also missing from tokenizer_config.json; I added "processor_class": "LlavaProcessor".
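
The edits in steps 1, 4, and 5 can be scripted. Below is a minimal sketch; the function name is mine, and the image-token id 32000 is an assumption (use whatever id your checkpoint's tokenizer actually assigned to `<image>`):

```python
import json
from pathlib import Path

def patch_checkpoint_for_sglang(ckpt_dir: str) -> None:
    """Apply the config.json / tokenizer_config.json edits from the steps above."""
    ckpt = Path(ckpt_dir)

    # Step 1: change model_type from "llava_llama" to "llava".
    cfg_path = ckpt / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["model_type"] = "llava"
    cfg_path.write_text(json.dumps(cfg, indent=2))

    # Steps 4-5: add the <image> token entry and processor_class to
    # tokenizer_config.json. NOTE: id "32000" is an illustrative assumption.
    tok_path = ckpt / "tokenizer_config.json"
    tok = json.loads(tok_path.read_text())
    tok.setdefault("added_tokens_decoder", {})["32000"] = {
        "content": "<image>",
        "lstrip": False,
        "normalized": False,
        "rstrip": False,
        "single_word": False,
        "special": True,
    }
    tok["processor_class"] = "LlavaProcessor"
    tok_path.write_text(json.dumps(tok, indent=2))
```

Step 2 (renaming the checkpoint folder so it contains "llava") still has to be done by hand or with a `Path.rename` call.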

(screenshot: config.json `model_type`, "llava" vs. "llava_llama")

Hey, thanks for the reply. Did you manage to use the model with SGLang? Unfortunately, changing only the model type did not help. I also made the changes above. Having looked more closely into it, I receive multiple out-of-bounds errors, e.g.:

../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [350,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
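
That assert (`srcIndex < srcSelectDimSize`) typically means an embedding lookup received a token id that is at least as large as the number of rows in the embedding table, e.g. a tokenizer that emits an added `<image>` token id while the model's vocab was never resized. A pure-Python sketch of that sanity check (the ids and vocab size here are illustrative assumptions, not values from the thread):

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return (position, id) pairs that would trip the CUDA indexSelect
    assert, i.e. ids outside [0, vocab_size)."""
    return [(pos, tid) for pos, tid in enumerate(token_ids)
            if tid < 0 or tid >= vocab_size]

# Example: an embedding table with 32000 rows, but the prompt contains
# the added image token id 32000 -> out of range.
print(find_out_of_range_ids([1, 319, 32000, 2], 32000))  # → [(2, 32000)]
```

Running the model on CPU usually turns the opaque device-side assert into a plain `IndexError` that names the offending index.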

I was able to load local v1.6 checkpoints as follows:

  1. Don't use the local tokenizer. Use the HF tokenizer provided by sglang.
  2. Update the folder name to contain "llava", otherwise the is_multimodal flag is set incorrectly and the server does not launch.
  3. Update the model's config.json:
  • model_type: "llava_llama" -> "llava"
  • image_aspect_ratio: "pad" -> "anyres"
  • mm_patch_merge_type: "flat" -> "spatial_unpad"
  • _name_or_path: must contain the exact model name for sglang to select the correct chat template, e.g. "llava-v1.6-34b"
    For 1.5 checkpoints, proceed with steps 1 and 2 from above. In the model's config.json, adjust the model_type and remove the entries mm_patch_merge_type and mm_projector_lr.
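
The config edits above can be sketched as a small script. This is not an official converter; the function name and the `v16` flag are mine, and the key names simply follow the list above:

```python
import json
from pathlib import Path

def patch_config(ckpt_dir: str, model_name: str, v16: bool = True) -> None:
    """Apply the config.json edits listed above to a local LLaVA checkpoint."""
    cfg_path = Path(ckpt_dir) / "config.json"
    cfg = json.loads(cfg_path.read_text())

    cfg["model_type"] = "llava"  # "llava_llama" -> "llava"
    # sglang selects the chat template from the model name,
    # so this must match exactly, e.g. "llava-v1.6-34b".
    cfg["_name_or_path"] = model_name

    if v16:
        cfg["image_aspect_ratio"] = "anyres"          # "pad" -> "anyres"
        cfg["mm_patch_merge_type"] = "spatial_unpad"  # "flat" -> "spatial_unpad"
    else:
        # 1.5 checkpoints: drop the entries the thread says to remove.
        cfg.pop("mm_patch_merge_type", None)
        cfg.pop("mm_projector_lr", None)

    cfg_path.write_text(json.dumps(cfg, indent=2))
```

Steps 1 and 2 (tokenizer choice and folder naming) are unchanged and still apply before running this.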

Thanks, I'll try it.