ggerganov / llama.cpp

LLM inference in C/C++


support long context llama 3 models

bachittle opened this issue · comments

Any plans to support a model like this? https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k

Since the original Llama 3 only has an 8192-token context window, I'm wondering how feasible it would be to run these models?

This model is a fine-tune of Llama 3, so it is already supported (there is no change in architecture). They just increased the RoPE theta and trained with that.
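To illustrate what changing the RoPE theta does (a rough sketch, not llama.cpp code; the specific theta values below are illustrative, not the ones Gradient used): raising theta lowers the per-dimension rotation frequencies, so positional rotations vary more slowly and remain distinguishable over a much longer sequence.

```python
# Sketch of RoPE frequency computation. Each pair of head dimensions i
# rotates at frequency theta^(-2i/d); a larger theta shrinks every
# frequency, stretching the effective wavelengths for long contexts.
def rope_frequencies(theta: float, head_dim: int) -> list[float]:
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

base = rope_frequencies(500_000.0, 128)    # stock Llama 3 theta
long = rope_frequencies(4_000_000.0, 128)  # illustrative larger theta

# Every frequency is smaller (or equal, for i = 0) with the larger theta.
assert all(l <= b for b, l in zip(base, long))
```

In llama.cpp the converted GGUF carries this value in its metadata, so a fine-tune like this runs without code changes.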

Some people were having issues, described here: https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k/discussions/13, but I expect those are the same issues the base Llama 3 models had until they were fixed.

You should have followed the enhancement template you were given. Did you try running the model? If so, what happened?