ZhikangNiu / encodec-pytorch

Unofficial implementation of "High Fidelity Neural Audio Compression"


How to train on LibriSpeech

zhaojingxin123 opened this issue

Hello, roughly how many epochs of training on LibriTTS960 does it take before the model is usable? Can you share some config or training information?


Maybe around 20 epochs? You can check the HF model.
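If it helps, one way to sanity-check what a trained checkpoint should sound like is to round-trip audio through a public EnCodec checkpoint on the Hugging Face Hub. A minimal sketch, assuming the `transformers` library and the official `facebook/encodec_24khz` weights as a reference point (the maintainer's own upload may differ):

```python
import torch
from transformers import EncodecModel

# Round-trip 1 s of dummy mono audio through a public 24 kHz EnCodec
# checkpoint as a reference point for reconstruction quality.
model = EncodecModel.from_pretrained("facebook/encodec_24khz")
wav = torch.randn(1, 1, 24_000)                 # (batch, channels, samples)
with torch.no_grad():
    encoded = model.encode(wav, bandwidth=6.0)  # quantize at 6 kbps
    decoded = model.decode(encoded.audio_codes, encoded.audio_scales)[0]
print(decoded.shape)  # reconstructed waveform, roughly (1, 1, 24000)
```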

Thank you for your answer. My training is very slow: with batch_size=2, log_interval: 100, tensor_cut: 240000, and num_workers: 10, each 100-batch logging interval takes about 1 min 30 s; with batch_size=6 it takes about 4 min 30 s. I'm curious why increasing the batch size didn't shorten the time. How can I make training faster? Thanks again.
Why is my training so slow?

(screenshot: Snipaste_2024-07-02_14-58-07)

My config:

```yaml
common:
  save_interval: 2
  test_interval: 5
  log_interval: 100
  max_epoch: 30
  seed: 3401
  amp: False

datasets: # train/test data locations
  train_csv_path: '/home/zjx/home/zjx/encodec-pytorch-main/datasets/librispeech_train100h_train.csv'
  test_csv_path: '/home/zjx/home/zjx/encodec-pytorch-main/datasets/librispeech_train100h_test.csv'
  batch_size: 2
  # 320000
  tensor_cut: 240000
  num_workers: 10
  fixed_length: 0
  pin_memory: True

checkpoint: # whether to resume training from a checkpoint
  resume: True
  checkpoint_path: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints/bs2_cut240000_length0_epoch14_lr0.0003.pt'
  disc_checkpoint_path: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints/bs2_cut240000_length0_epoch14_disc_lr0.0003.pt'
  # save location
  save_folder: '/home/zjx/home/zjx/encodec-pytorch-main/checkpoints'
  save_location: '${checkpoint.save_folder}/bs${datasets.batch_size}_cut${datasets.tensor_cut}_length${datasets.fixed_length}_'

optimization:
  lr: 3e-4
  disc_lr: 3e-4

lr_scheduler:
  warmup_epoch: 5

model:
  target_bandwidths: [1.5, 3., 6., 12., 24.]
  sample_rate: 24_000
  channels: 1
  train_discriminator: True
  audio_normalize: True
  filters: 32
  ratios: [8, 5, 4, 2]
  disc_win_lengths: [1024, 2048, 512]
  disc_hop_lengths: [256, 512, 128]
  disc_n_ffts: [1024, 2048, 512]

distributed:
  data_parallel: False
  world_size: 4
  find_unused_parameters: False
  torch_distributed_debug: False
  init_method: tmp
```
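A quick back-of-the-envelope check (my own sketch, not from the thread) suggests the timings are actually consistent: with tensor_cut: 240000, each clip is 10 s of 24 kHz audio, so the time per 100 logged batches scales with the total audio per step, not with the number of steps.

```python
# Why 100 batches at batch_size=6 takes ~3x as long as at batch_size=2 when
# tensor_cut (samples per clip) stays fixed: each step processes 3x the audio.
SAMPLE_RATE = 24_000
TENSOR_CUT = 240_000  # samples per clip -> 10 s of audio at 24 kHz

for batch_size, observed in [(2, "1 min 30 s"), (6, "4 min 30 s")]:
    audio_per_step = batch_size * TENSOR_CUT / SAMPLE_RATE
    print(f"batch_size={batch_size}: {audio_per_step:.0f} s of audio per step "
          f"(observed time per 100 batches: {observed})")
# batch_size=2: 20 s of audio per step
# batch_size=6: 60 s of audio per step -> 3x the work, matching the 3x slowdown
```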


You can reduce tensor_cut to 1 s ≈ 24000 samples, then increase the batch size for training. Also, don't set num_workers that high.
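For concreteness, a minimal sketch of what that change looks like on the data side (illustrative names only, not the repo's actual dataset code): random 1 s crops let the batch size grow while keeping the audio per step, and hence the step time, modest.

```python
import torch
from torch.utils.data import DataLoader, Dataset

SAMPLE_RATE = 24_000
TENSOR_CUT = 24_000  # 1 s crops, as suggested above (was 240_000 = 10 s)

class RandomCropSpeech(Dataset):
    """Yields fixed-length random crops from variable-length waveforms."""
    def __init__(self, waveforms, cut=TENSOR_CUT):
        self.waveforms, self.cut = waveforms, cut

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, i):
        wav = self.waveforms[i]
        if wav.shape[-1] > self.cut:
            start = torch.randint(0, wav.shape[-1] - self.cut + 1, (1,)).item()
            wav = wav[..., start:start + self.cut]
        return wav

if __name__ == "__main__":
    # Dummy 15 s mono clips stand in for LibriSpeech utterances.
    data = [torch.randn(1, 15 * SAMPLE_RATE) for _ in range(32)]
    loader = DataLoader(RandomCropSpeech(data), batch_size=16, num_workers=2)
    batch = next(iter(loader))
    print(batch.shape)  # torch.Size([16, 1, 24000]) -> 16 s of audio per step
```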

Thanks for the pointer, I'll try it right away.

![Snipaste_2024-07-02_16-17-31](https://github.com/ZhikangNiu/encodec-pytorch/assets/110162364/cb03d35a-b8f9-4615-b52b-a2626d6ccd56)

Thanks, thank you so much!!!


But I haven't yet explored whether this has any impact on audio quality.
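One cheap way to probe that question: round-trip some held-out clips through checkpoints trained with short vs. long tensor_cut and compare a simple waveform metric. A sketch with dummy tensors; SI-SNR is my choice of metric here, not something the repo prescribes.

```python
import torch

def si_snr(est: torch.Tensor, ref: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant SNR (dB) between an estimate and a reference waveform."""
    ref = ref - ref.mean(dim=-1, keepdim=True)
    est = est - est.mean(dim=-1, keepdim=True)
    # Project the estimate onto the reference, then measure residual energy.
    proj = (est * ref).sum(-1, keepdim=True) * ref / (ref.pow(2).sum(-1, keepdim=True) + eps)
    noise = est - proj
    return 10 * torch.log10(proj.pow(2).sum(-1) / (noise.pow(2).sum(-1) + eps))

ref = torch.randn(1, 24_000)              # stand-in for an original clip
est = ref + 0.05 * torch.randn_like(ref)  # stand-in for a model reconstruction
print(f"SI-SNR: {si_snr(est, ref).item():.1f} dB")  # higher = closer to the reference
```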