How to train on librispeech
zhaojingxin123 opened this issue · comments
Hello, roughly how many epochs of training on LibriTTS 960h does the model need before it is usable? Can you share a config or some training details?
Maybe around 20 epochs? You can check the HF model.
Thank you for your answer. My training is very slow. With batch_size=2, log_interval: 100, tensor_cut: 240000, num_workers: 10, each 100-batch log interval takes about 1 min 30 s; with batch_size=6 it takes about 4 min 30 s instead. I'm curious why increasing the batch size didn't make it faster. How can I speed training up? Thanks again.
Why is my training so slow?
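A quick back-of-the-envelope check on those two timings (plain Python, using only the numbers quoted above) shows they correspond to the same per-sample throughput:

```python
# Samples processed per second over one logging interval:
# batch_size * batches / elapsed_seconds.
def samples_per_sec(batch_size, batches, seconds):
    return batch_size * batches / seconds

bs2 = samples_per_sec(2, 100, 90)    # batch_size=2, 100 batches in ~1 min 30 s
bs6 = samples_per_sec(6, 100, 270)   # batch_size=6, 100 batches in ~4 min 30 s
print(bs2, bs6)  # both ~2.22 samples/s
```

Identical samples/s at both batch sizes suggests the GPU (or the dataloader) is already saturated at batch_size=2, so a larger batch just takes proportionally longer per step; with tensor_cut=240000 each sample is 10 s of 24 kHz audio, which is a lot of compute per item.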
```yaml
optimization:
  lr: 3e-4
  disc_lr: 3e-4
lr_scheduler:
  warmup_epoch: 5
model:
  target_bandwidths: [1.5, 3., 6., 12., 24.]
  sample_rate: 24_000
  channels: 1
  train_discriminator: True
  audio_normalize: True
  filters: 32
  ratios: [8, 5, 4, 2]
  disc_win_lengths: [1024, 2048, 512]
  disc_hop_lengths: [256, 512, 128]
  disc_n_ffts: [1024, 2048, 512]
distributed:
  data_parallel: False
  world_size: 4
  find_unused_parameters: False
  torch_distributed_debug: False
  init_method: tmp
```
My config:

```yaml
common:
  save_interval: 2
  test_interval: 5
  log_interval: 100
  max_epoch: 30
  seed: 3401
  amp: False
datasets:
  # train/test data locations
  train_csv_path: '/home/zjx/home/zjx/encodec-pytorch-main/datasets/librispeech_train100h_train.csv'
  test_csv_path: '/home/zjx/home/zjx/encodec-pytorch-main/datasets/librispeech_train100h_test.csv'
  batch_size: 2
  # 320000
  tensor_cut: 240000
  num_workers: 10
  fixed_length: 0
  pin_memory: True
checkpoint:
  # whether to resume training from a checkpoint
  resume: True
  checkpoint_path: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints/bs2_cut240000_length0_epoch14_lr0.0003.pt'
  disc_checkpoint_path: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints/bs2_cut240000_length0_epoch14_disc_lr0.0003.pt'
  # save locations
  save_folder: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints'
  save_location: '${checkpoint.save_folder}/bs${datasets.batch_size}_cut${datasets.tensor_cut}_length${datasets.fixed_length}'
```
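The `save_location` value above uses `${section.key}` interpolation (the markdown rendering of the original post garbled it by eating the underscores); resolved against this config it reproduces the checkpoint filename pattern `bs2_cut240000_length0_...`. A plain-Python sketch of the resolution, with a hypothetical `resolve_save_location` helper rather than the config library's actual resolver:

```python
# Hypothetical resolver mimicking the config's ${section.key} interpolation;
# in the real training code this would be handled by the config framework.
def resolve_save_location(cfg):
    d, c = cfg["datasets"], cfg["checkpoint"]
    return (f"{c['save_folder']}/bs{d['batch_size']}"
            f"_cut{d['tensor_cut']}_length{d['fixed_length']}")

cfg = {
    "datasets": {"batch_size": 2, "tensor_cut": 240000, "fixed_length": 0},
    "checkpoint": {"save_folder": "checkpoints"},
}
print(resolve_save_location(cfg))  # checkpoints/bs2_cut240000_length0
```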
You can reduce tensor_cut to 1 s ≈ 24000 samples and then increase the batch size. Also, don't set num_workers that high.
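To see why this helps: the per-step cost scales roughly with `batch_size * tensor_cut` samples, so shrinking the crop lets the batch grow for the same cost per step (a rough sketch that ignores padding and fixed model overhead):

```python
SAMPLE_RATE = 24_000  # from the config's sample_rate

def audio_seconds_per_batch(batch_size, tensor_cut):
    # total seconds of audio the model sees in one training step
    return batch_size * tensor_cut / SAMPLE_RATE

print(audio_seconds_per_batch(2, 240_000))  # current setting: 20.0 s per step
print(audio_seconds_per_batch(20, 24_000))  # 1 s crops: same 20.0 s, 10x the batch
```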
Thanks for the pointers, I'll give that a try right away.