jzhang38/TinyLlama Issues
model.py (Updated)
model structure (Updated)
Training Run - New Tokenizer (Updated 1)
Llama 3 (Updated 1)
On which will it run better (Updated)
Loss logs (Closed 2)
Models and code are welcome to be published on the wisemodel.cn open-source community (Updated)
Why FSDP not DPP? (Updated)
Help me pls (Updated 2)
proper way to pad prompts (Closed 2)
Does it support spanish language? (Updated 7)
Chat v1.0 training recipe (Closed 1)
about the speed (Updated 1)
Activation Checkpointing (Closed 2)
fp16 finetune will loss=0 (Updated)
TPU Pretraining (Updated)
Usage documentation (Closed 1)
Saturation / epoch-accuracy plot (Closed 6)