Memory and Time for pretraining
wjun0830 opened this issue · comments
WonJun Moon commented
Hello Kevin!
Could you share how much GPU memory and how much time pretraining (PT) takes?
Thanks
Kevin commented
Hi @wjun0830 ,
For pretraining, we run on 8 GPUs, using less than 24 GB per GPU with bsz = 32;
10 epochs typically take 3-4 days;
You can flexibly decrease the bsz or the transformer projection dimension for lower memory usage and higher efficiency.
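As a rough rule of thumb (not from this repo), activation memory scales approximately linearly with batch size, so halving bsz roughly halves the activation footprint, while parameter and optimizer memory stay fixed. A minimal sketch of that estimate, assuming the reported ~24 GB figure is dominated by activations:

```python
def estimate_activation_memory_gb(base_mem_gb: float, base_bsz: int, new_bsz: int) -> float:
    """Rough linear estimate of activation memory after changing batch size.

    Assumes the measured memory is dominated by activations (which scale
    linearly with batch size); parameter/optimizer memory is ignored, so
    treat the result as a lower bound on actual usage.
    """
    return base_mem_gb * new_bsz / base_bsz


# e.g. scaling down from the reported bsz = 32 / <24 GB per-GPU setup
print(estimate_activation_memory_gb(24, 32, 16))  # → 12.0
```

This is only a first-order estimate; in practice, measure actual usage (e.g. with `nvidia-smi`) after changing the batch size or projection dimension.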
WonJun Moon commented
Thank you!