jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

jzhang38/TinyLlama Issues

model.py
Updated 2 months ago
Model structure
Updated 2 months ago
Training Run - New Tokenizer
Updated 2 months ago, 1 comment
Llama 3
Updated 2 months ago, 1 comment
On which will it run better
Updated 2 months ago
Would it be possible to provide help with evaluation?
Updated 2 months ago
A potential bug in multi-GPU training
Closed 2 months ago, 1 comment
Encountered an issue while loading the model using transformers
Updated 3 months ago, 1 comment
Loss logs
Closed 7 months ago, 2 comments
Where is the pretraining example of llama-1.1b-chat
Updated 3 months ago
You are welcome to publish the model and code on the wisemodel.cn open-source community
Updated 3 months ago
The results under the FastChat framework are quite bizarre?
Updated 3 months ago
Convert weights to original llama weights.
Closed 6 months ago, 2 comments
On the visualization of Wandb in fine-tuning
Closed 4 months ago
Is there any simple demo of fine-tuning TinyLlama
Closed 4 months ago, 4 comments
More intermediate checkpoints in < 240k steps
Updated 4 months ago
How to evaluate checkpoints during pretraining?
Closed 4 months ago, 2 comments
Why FSDP and not DDP?
Updated 4 months ago
All 'rotary_emb.inv_freq' are not able to load with Automodel or Llamamodel.
Closed 4 months ago, 2 comments
A question on learning rate decay schedule
Closed 4 months ago, 1 comment
Pretraining failing on IndexError: list index out of range in file packed_dataset.py
Closed 4 months ago, 1 comment
Should this line use args.seed instead of seed=42?
Updated 4 months ago
Help me pls
Updated 4 months ago, 2 comments
proper way to pad prompts
Closed 4 months ago, 2 comments
How to compute metrics like ROUGE, BLEU in sft script?
Updated 4 months ago, 1 comment
Reference for pretraining other small language models
Updated 4 months ago, 1 comment
Encountered an issue while loading the model using transformers
Updated 4 months ago, 1 comment
Pre training Continue from TinyLlama-1.1B-intermediate-step-1431k-3T
Closed 5 months ago, 1 comment
Unable to pretrain: tokenizer raises NotImplementedError
Closed 6 months ago, 3 comments
Please provide prompt guide for tinyllama-1.1b-chat-v1.0 ?
Closed 5 months ago, 1 comment
Taking a few days to complete SlimPajama "Train" data
Closed 6 months ago, 2 comments
Unexpected behavior - Incomplete responses and nonsense outputs
Closed 5 months ago, 1 comment
The pre-training process crashed after a few iter
Closed 5 months ago, 1 comment
Is there any function calling model for tinyllama?
Updated 5 months ago, 1 comment
Hi, how can I finetune tinyllama with a custom dataset as follows?
Updated 5 months ago, 2 comments
Does it support spanish language?
Updated 5 months ago, 7 comments
The roadmap has been sitting for a while.
Updated 5 months ago, 1 comment
No module named 'xentropy_cuda_lib'
Closed 6 months ago, 1 comment
[Question] Is pre-training with FP32 possible?
Closed 6 months ago, 2 comments
How do I use these data sets to train new models?
Closed 6 months ago, 2 comments
How to determine reasonable max steps?
Updated 6 months ago, 1 comment
Chat v1.0 training recipe
Closed 6 months ago, 1 comment
about the speed
Updated 6 months ago, 1 comment
Hello, is this tokenizer using LLaMA’s Tokenizer or did you train it yourself?
Closed 6 months ago, 1 comment
Activation Checkpointing
Closed 6 months ago, 2 comments
fp16 finetune will loss=0
Updated 6 months ago
TPU Pretraining
Updated 6 months ago
Usage documentation
Closed 6 months ago, 1 comment
Can checkpoints in the lit_gpt configuration format be open sourced?
Closed 6 months ago
Saturation / epoch-accuracy plot
Closed 7 months ago, 6 comments