karpathy / llama2.c

Inference Llama 2 in one file of pure C

karpathy/llama2.c Issues

Questions about converting models and tokenizers downloaded from huggingface
Updated 8 months ago
tok512.model adding a token at start.
Updated 8 months ago
About runq.c
Updated 8 months ago
I added bidirectional attention, and those who need it can study it.
Updated 3 months ago · 5 comments
"Can I increase or decrease the size of individual model layers separately?"
Updated 7 months ago · 2 comments
Is this project still active?
Updated 8 months ago · 7 comments
Question: Sliding window attention
Updated 7 months ago · 3 comments
Questions about the matmul function in run.c
Updated 8 months ago
I found that the dim parameter affects the learning loss and n_layers affects the training speed.
Updated 8 months ago
How to convert to huggingface model format?
Closed 6 months ago · 1 comment
The trained model will not be saved.
Closed 8 months ago
Q: How to finetune?
Updated 8 months ago · 2 comments
Please double check formula for hidden dimension
Closed 8 months ago
Export tokenizers to huggingface (eg: Tinystories260K)
Updated 8 months ago
Is it possible to increase or decrease the size of only some of the layers of the model structure?
Updated 8 months ago
How to save checkpoints at each step?
Updated 8 months ago · 1 comment
-
Closed 8 months ago
How to convert the huggingface model with GQA to bin?
Updated 8 months ago · 4 comments
How does this part of the Train code work?
Updated 9 months ago · 1 comment
Mojo version?
Updated 9 months ago · 2 comments
What is a good pretrain dataset for llama2c?
Updated 5 months ago · 3 comments
Evolution of tinystories. Open sourced.
Updated 9 months ago · 3 comments
[Feature Request] Support InternLM Deploy
Updated 9 months ago
Is it possible to adapt this code from DDP to FSDP? If yes, what are the potential issues to look out for?
Updated 9 months ago
Pure JavaScript port of llama2.c
Closed 8 months ago
llama2_7b_chat gives no response
Closed 9 months ago · 1 comment
Incorrect parameter counts for 15M, 42M, 110M models?
Updated 5 months ago · 3 comments
Optimized code for matmul() works 3.5 faster (for Mac M1 Max with ARM NEON) ... and even more...
Updated 9 months ago · 4 comments
Interpretability of models
Updated 9 months ago · 4 comments
Trained and LoRA fine-tuned the models to follow instructions for writing tiny stories
Updated 9 months ago · 8 comments
HF candle
Updated 9 months ago
llama2.c text generation Inference server in c
Updated 9 months ago
Running export.py with chat-llama-2-7b-chat-hf runs out of memory.
Updated 9 months ago · 1 comment
how to generate word embeddings when doing custom tokenizers
Updated 9 months ago
Chat functionality requires big 7B model
Updated 9 months ago · 5 comments
Code Llama rope_theta parameter
Updated 9 months ago · 2 comments
260K Model Parameter count not right?
Updated 9 months ago · 1 comment
How should I calculate the parameter count of a model?
Updated 9 months ago · 1 comment
why not use key and value caches in model.py?
Updated 9 months ago · 2 comments
`.bin` vs `.pt` size discrepancy
Closed 9 months ago · 1 comment
New export code OOM with 7B model
Updated 9 months ago · 5 comments
Possible issue in decode()
Closed 9 months ago · 7 comments
Reasoning behind version1_export logic
Updated 9 months ago · 2 comments
-n 0 makes no tokens (21/08/2023 pulls)
Closed 9 months ago · 3 comments
Suggestion: Is it possible to reorganize the file structure
Updated 9 months ago · 5 comments
File names discussion
Updated 9 months ago · 3 comments
convert ckpt.pt to huggingface model
Updated 2 months ago · 15 comments
Variable Name Clarification to Improve Code Accessibility
Closed 9 months ago
Error in converting huggingface models
Updated 9 months ago · 3 comments
Stuck on training: Created a PretokDataset with rng seed 42
Updated 7 months ago · 22 comments