karpathy / llama2.c

Inference Llama 2 in one file of pure C

New Visual Walkthrough of Llama2.c

ZoroDerVonCodier opened this issue

@karpathy - thank you for the great software. I wrote up a detailed visual walkthrough of how it all works. I think I got it all right. I am currently using the software along with the llama2.cu fork by @rogerallen and the 4-bit GPU fork by @ankan-ban. On a Windows PC I am seeing 7 t/s on the CPU alone, 20+ t/s with fp32 on the GPU, and 140+ t/s with the 4-bit GPU fork. Really fun stuff! I hope this visual walkthrough is helpful.

https://www.signalpop.com/2024/02/10/understanding-llama2-c-and-chatgpt-a-visual-design-walkthrough/

Cheers!