You are using a model of type llava to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.

Question

Get-David opened this issue 2 months ago

I downloaded liuhaotian/llava-v1.6-34b,
then changed the corresponding path in caption_llava.py to the local path and ran the following command:

(cuda118) zdw@ai-gpu-server149:~/cuda118/Open-Sora$ CUDA_VISIBLE_DEVICES=0,1 python -m tools.caption.caption_llava /home/zdw/video_datasets/clips output.csv
Dataset contains 26 videos.
Processing 26 new videos.
Prompt: A video is given by providing three frames in chronological order. Describe this video and its style to generate a description. Pay attention to all objects in the video. Do not describe each frame individually. Do not reply with words like 'first frame'. The description should be useful for AI to re-generate the video. The description should be less than six sentences. Here are some examples of good descriptions: 1. A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about. 2. Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field. 3. Drone view of waves crashing against the rugged cliffs along Big Sur's garay point beach. The crashing blue waters create white-tipped waves, while the golden light of the setting sun illuminates the rocky shore. A small island with a lighthouse sits in the distance, and green shrubbery covers the cliff's edge. The steep drop from the road down to the beach is a dramatic feat, with the cliff’s edges jutting out over the sea. This is a view that captures the raw beauty of the coast and the rugged landscape of the Pacific Coast Highway.
You are using a model of type llava to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.
preprocessor_config.json: 100%|███████████████| 316/316 [00:00<00:00, 831kB/s]
config.json: 100%|███████████████████████| 4.76k/4.76k [00:00<00:00, 11.0MB/s]
pytorch_model.bin:   5%|▉                 
pytorch_model.bin:  50%|█████████▍         | 849M/1.71G [23:16<24:02, 598kB/s]

Why did it start downloading again?

GetDavid commented 2 months ago

@JThh

Jiatong (Julius) Han · Answer 1 · Wed Apr 10 2024 21:30:17 GMT+0800 (China Standard Time)

May I know the contents of the local model path?

GetDavid · Answer 2 · Thu Apr 11 2024 19:30:28 GMT+0800 (China Standard Time)

> May I know the contents of the local model path?

Sure, it is /home/zdw/Open-Sora/pre_training/llava-v1.6-34b
I waited a long time, and once the download completed it worked normally. What was it downloading?

Jiatong (Julius) Han · Answer 3 · Thu Apr 11 2024 21:44:39 GMT+0800 (China Standard Time)

Can you please print the contents of /home/zdw/Open-Sora/pre_training/llava-v1.6-34b? It should hold the .bin file that contains the model weights. Otherwise, HF will download the weights to its default cache location, which is controlled by the HF_HOME environment variable and is normally ~/.cache/huggingface/hub/....
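
If it helps, here is a minimal sketch of that check using only the Python standard library. It lists the files in the local model directory that look like weights and prints where the Hugging Face hub cache would normally be. The helper names `find_weight_files` and `default_hub_cache` are hypothetical, treating `.safetensors` as a weight suffix alongside `.bin` is an assumption, and the cache lookup ignores `XDG_CACHE_HOME` for simplicity.

```python
import os
from pathlib import Path

# Assumption: weight files end in .bin or .safetensors.
WEIGHT_SUFFIXES = (".bin", ".safetensors")


def find_weight_files(model_dir):
    """Return the sorted names of files in model_dir that look like weights."""
    root = Path(model_dir).expanduser()
    if not root.is_dir():
        # A missing directory simply yields no weight files.
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.name.endswith(WEIGHT_SUFFIXES)
    )


def default_hub_cache(env=None):
    """Resolve the hub cache directory from HF_HUB_CACHE / HF_HOME.

    Simplified: falls back to ~/.cache/huggingface and ignores XDG_CACHE_HOME.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME") or "~/.cache/huggingface"
    return Path(hf_home).expanduser() / "hub"


if __name__ == "__main__":
    # Path taken from this thread; adjust to your own setup.
    model_dir = "/home/zdw/Open-Sora/pre_training/llava-v1.6-34b"
    print("weight files:", find_weight_files(model_dir))
    print("hub cache:", default_hub_cache())
```

An empty list for the model directory would suggest the weights are not where the script expects them, and anything fetched at run time should then show up under the printed cache directory.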

github-actions · Answer 4 · Fri Apr 19 2024 09:45:25 GMT+0800 (China Standard Time)

This issue is stale because it has been open for 7 days with no activity.