karpathy / llama2.c

Inference Llama 2 in one file of pure C

Everyone, I have implemented InfiniAttention and Meta's multi-token prediction.

win10ogod opened this issue