7B config
zankner opened this issue
I'm trying to replicate the training results for the 7B head. Could you please share the training config used in main.py?
The training configuration is as follows.
```python
train_config = {
    "lr": 3e-5,
    "bs": 4,
    "gradient_accumulation_steps": 1,
    "is_warmup": True,
    "num_epochs": 200,
    "num_warmup_steps": 2000,
    "total_steps": 800000,
    "p_w": 0.1,
    "v_w": 1.0,
    "head_w": 0.1,
    "num_workers": 2,
    "embeding": True,
    "act": "No",
    "data_noise": True,
    "noise": "uniform",
    "mean": 0.0,
    "std": 0.2,
    "residual": "true,norm",
    "max_len": 2048,
    "config_path": "config.json",
    "b1": 0.9,
    "b2": 0.95,
    "grad_clip": 0.5,
}
```
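For readers reconstructing the schedule from these values: a minimal sketch of how `is_warmup`, `num_warmup_steps`, and `total_steps` might be consumed, assuming linear warmup to the peak `lr` followed by linear decay to zero. The exact decay shape used in main.py is an assumption here, not confirmed by this thread.

```python
def lr_at_step(step, peak_lr=3e-5, num_warmup_steps=2000, total_steps=800_000):
    """Linear warmup then linear decay -- an assumed shape, not
    necessarily the exact schedule in main.py."""
    if step < num_warmup_steps:
        # ramp linearly from 0 to peak_lr over the warmup steps
        return peak_lr * step / num_warmup_steps
    # decay linearly from peak_lr down to 0 over the remaining steps
    remaining = total_steps - num_warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / remaining)
```

With the config values above, the learning rate reaches 3e-5 at step 2000 and decays toward 0 by step 800000.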
Is the global batch size 128 in the end?
The global batch size is 16.
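For reference, under standard data-parallel accounting the global batch size is the per-device batch size times gradient accumulation steps times the number of data-parallel workers. With `bs=4` and no accumulation, a global batch of 16 would be consistent with 4 GPUs; the GPU count is an inference from the numbers above, not something stated in this thread.

```python
def global_batch_size(per_device_bs, grad_accum_steps, num_workers):
    # standard data-parallel accounting:
    # effective batch = per-device batch x accumulation x workers
    return per_device_bs * grad_accum_steps * num_workers

# with the config above (4 data-parallel workers is an assumption)
print(global_batch_size(4, 1, 4))  # -> 16
```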