BAAI-DCAI / Bunny

A family of lightweight multimodal models.

LLaMA3 or LLaMA3-Instruct

berry-ding opened this issue · comments

Great work! I want to know whether your pre-training used LLaMA 3 or LLaMA 3-Instruct.

LLaMA 3 Base Model. Thanks!

Hey, @BAAI-DCAI team,

Any experience to share on the difference between these two versions? Why choose the LLaMA 3 base model rather than the instruct-tuned one?

Our primary experiments were based on Llama-3-8B. We later found that using the instruct-tuned model works better, so we have updated the weights to be based on Llama-3-8B-Instruct.

Great news! Looking forward to your release of the fine-tuning strategies.

@Isaachhh

Do you have plans to go further with Phi-3?

@GewelsJI

Please refer to https://huggingface.co/BAAI/Bunny-v1_0-4B.

The GitHub repo will be updated soon, and we are still working on improving the performance of Bunny-Llama-3-8B-V and Bunny-v1.0-4B. Stay tuned!

That's awesome. I'll keep an eye on your updates. Thanks.

@Isaachhh

A further question: do you plan to support Gemma models in your codebase?

@GewelsJI
Hi, we conducted some experiments about Bunny-Gemma on mid March and I uploaded the related codes into gemma_temp branch. Note that the version and conv_mode should be gemma.

But we can't guarantee that it works well at the moment, and we may not release the model weights in the near future.

Hope this can help you. Feel free to comment if you have further questions.
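To illustrate why version and conv_mode must match the backbone: each LLM family wraps conversation turns in its own special tokens, so feeding a Gemma model a prompt built with another family's template corrupts the input. The sketch below is not Bunny's actual code; the template registry and `build_prompt` helper are hypothetical, simplified stand-ins for what a conv_mode setting selects.

```python
# Hypothetical, simplified conversation-template registry for illustration.
# Real chat formats: Llama 3 uses <|start_header_id|>/<|eot_id|> markers,
# Gemma uses <start_of_turn>/<end_of_turn> markers.
TEMPLATES = {
    "llama": "<|start_header_id|>user<|end_header_id|>\n\n{msg}<|eot_id|>",
    "gemma": "<start_of_turn>user\n{msg}<end_of_turn>",
}

def build_prompt(conv_mode: str, msg: str) -> str:
    """Wrap a user message in the special tokens of the chosen template."""
    try:
        template = TEMPLATES[conv_mode]
    except KeyError:
        raise ValueError(f"unknown conv_mode: {conv_mode!r}")
    return template.format(msg=msg)

print(build_prompt("gemma", "Describe the image."))
```

With a Gemma backbone, only the "gemma" template produces tokens the model was trained on, which is why both settings must be switched together.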

@Isaachhh
I would like to fine tune Bunny-Llama-3-8B-V on some of my data. Can I use the existing train.py file or should I wait for better VIT strategy you mentioned in the README.md

thanks for your work

They actually unfroze the ViT in both pre-training and SFT but didn't open-source the recipe.

The strategy only differs in the visual instruction tuning stage; the vision tower was frozen during the pre-training stage.
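The two-stage recipe described above can be sketched in PyTorch: keep the vision tower frozen while pre-training the connector, then unfreeze it for visual instruction tuning. This is a minimal illustration with tiny stand-in modules, not Bunny's actual model classes; the names `TinyVLM` and `set_stage` are hypothetical.

```python
from torch import nn

class TinyVLM(nn.Module):
    """Toy stand-in for a VLM: vision tower + projector + LLM."""
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(8, 8)  # stand-in for the ViT encoder
        self.projector = nn.Linear(8, 8)     # cross-modal connector
        self.llm = nn.Linear(8, 8)           # stand-in for the language model

def set_stage(model: TinyVLM, stage: str) -> None:
    """Freeze the vision tower in pre-training; unfreeze it for tuning."""
    train_vit = stage == "finetune"
    for p in model.vision_tower.parameters():
        p.requires_grad = train_vit

model = TinyVLM()
set_stage(model, "pretrain")   # vision tower frozen
set_stage(model, "finetune")   # vision tower trainable again
```

Frozen parameters still run in the forward pass; `requires_grad = False` simply stops the optimizer from updating them, so only the connector (and LLM, per the chosen recipe) learns during pre-training.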

I think I was able to fine-tune the adapter starting from the Phi-2 pre-training weights. Any plans to release those weights for Phi-3 and Llama 3?

@galleon

We have released them.

@galleon

Training details of model zoo
[Screenshot of the model-zoo training details table, 2024-05-08]

@Isaachhh Hi, do you have any plans to release a high-resolution LLaMA3-based model?

@berry-ding Thanks for your interest. Coming in the following weeks, stay tuned!

@berry-ding Hi, we have released Bunny-v1.1-Llama-3-8B-V, which supports 1152x1152 resolution.

Closing the issue for now as there's no further discussion. Feel free to reopen it if there are any other questions.