meta-llama / codellama

Inference code for CodeLlama models

How to enable sending 100K input tokens to CodeLlama

harlequen opened this issue · comments

Hi, I'm testing CodeLlama and would like guidance on how to enable it to accept 100K input tokens. From what I understand, this is done by adjusting the max_seq_len and max_batch_size parameters, but I couldn't find more information on this. Could someone please point me in the right direction?
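
In this repo's inference code, both parameters are passed to Llama.build, the entry point used by the example scripts (e.g. example_completion.py). Below is a minimal sketch of raising the context window; the checkpoint paths are placeholders for your local files, and whether a 100K-token context actually fits depends on your GPU memory:

```python
# Minimal sketch, assuming the Llama.build entry point from this repo's
# example scripts. ckpt_dir and tokenizer_path are placeholder paths.
from llama import Llama

generator = Llama.build(
    ckpt_dir="CodeLlama-7b/",                       # placeholder: your local checkpoint dir
    tokenizer_path="CodeLlama-7b/tokenizer.model",  # placeholder: your tokenizer path
    max_seq_len=100_000,   # raise the context window toward 100K tokens
    max_batch_size=1,      # keep the batch small: the KV cache scales with seq_len * batch_size
)

results = generator.text_completion(
    ["def fibonacci(n):"],
    max_gen_len=256,
)
print(results[0]["generation"])
```

Note that the KV cache is pre-allocated at build time in proportion to max_seq_len * max_batch_size, so a 100K-token context can require a very large amount of GPU memory even at batch size 1. The example scripts in this repo are launched with torchrun, e.g. `torchrun --nproc_per_node 1 example_completion.py ...`.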

As a follow-up question: do you know what the default maximum sequence length is?

Thank you; any help is appreciated.