RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
theAayushbajaj opened this issue
My dataset is ~6 hours long, so I trained in MOL mode for ~1000k iterations on LJ and another 375k on my own dataset, but the generated voice is just babbling.
Switching to RAW mode gives me this error: "RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)"
hparams:
# DSP --------------------------------------------------------------------------------------------------------------#
# Settings for all models
sample_rate = 22050
n_fft = 2048
fft_bins = n_fft // 2 + 1
num_mels = 80
hop_length = 276 # 12.5ms - in line with Tacotron 2 paper
win_length = 1102 # 50ms - same reason as above
fmin = 40
min_level_db = -100
ref_level_db = 20
bits = 9 # bit depth of signal
mu_law = True # Recommended to suppress noise if using raw bits in hp.voc_mode below
peak_norm = False # Normalise to the peak of each wav file
# WAVERNN / VOCODER ------------------------------------------------------------------------------------------------#
# Model Hparams
voc_mode = 'RAW' # either 'RAW' (softmax on raw bits) or 'MOL' (sample from mixture of logistics)
voc_upsample_factors = (2, 6, 23) # NB - this needs to correctly factorise hop_length NOTE:changed
voc_rnn_dims = 512
voc_fc_dims = 512
voc_compute_dims = 128
voc_res_out_dims = 128
voc_res_blocks = 10
# Training
voc_batch_size = 32
voc_lr = 1e-4
voc_checkpoint_every = 25_000
voc_gen_at_checkpoint = 5 # number of samples to generate at each checkpoint
voc_total_steps = 1_000_000 # Total number of training steps
voc_test_samples = 50 # How many unseen samples to put aside for testing
voc_pad = 2 # this will pad the input so that the resnet can 'see' wider than input length
voc_seq_len = hop_length * 4 # must be a multiple of hop_length NOTE:changed
voc_clip_grad_norm = 4 # set to None if no gradient clipping needed
# Generating / Synthesizing
voc_gen_batched = True
voc_target = 11_000
voc_overlap = 550
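The two lines marked `NOTE:changed` above are coupled: per the comments, `voc_upsample_factors` has to factorise `hop_length`, and `voc_seq_len` has to be a multiple of it. A minimal sketch of checking those constraints with the values from this issue follows. The helper name `check_vocoder_hparams` is made up for illustration and is not part of the repo.

```python
from functools import reduce
from operator import mul


def check_vocoder_hparams(hop_length, upsample_factors, seq_len):
    """Return a list of problems; an empty list means the values are consistent.

    Hypothetical helper, only mirrors the constraints stated in the hparams comments.
    """
    problems = []
    product = reduce(mul, upsample_factors, 1)
    if product != hop_length:
        problems.append(
            "upsample factors multiply to %d, expected hop_length=%d" % (product, hop_length)
        )
    if seq_len % hop_length != 0:
        problems.append(
            "seq_len=%d is not a multiple of hop_length=%d" % (seq_len, hop_length)
        )
    return problems


# Values copied from the hparams in this issue.
hop_length = 276
issues = check_vocoder_hparams(hop_length, (2, 6, 23), hop_length * 4)
print(issues or "hparams look consistent")
```

With the posted values the list comes back empty, so these two settings by themselves do not explain the crash.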
System:
PyTorch: 1.7.0+cu101
OS: Ubuntu 18.04
GPU: GTX 1080
Complete Traceback+Error:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [4,0,0], thread: [96,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [4,0,0], thread: [97,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [4,0,0], thread: [98,0,0] Assertion `t >= 0 && t < n_classes` failed.
"Similar errors with different block dimensions"
Traceback (most recent call last):
  File "train_wavernn.py", line 161, in <module>
    main()
  File "train_wavernn.py", line 87, in main
    voc_train_loop(paths, voc_model, loss_func, optimizer, train_set, test_set, lr, total_steps)
  File "train_wavernn.py", line 128, in voc_train_loop
    loss.backward()
  File "wavernn/lib/python3.6/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "wavernn/lib/python3.6/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
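The `t >= 0 && t < n_classes` assertion in the log fires in the NLL loss kernel when a target class index lies outside the number of output classes. Because CUDA kernels run asynchronously, the later `cublasCreate` failure may simply be a knock-on effect of that first assert; running with `CUDA_LAUNCH_BLOCKING=1` usually makes the failing call easier to locate. Assuming a RAW-mode softmax over `2 ** bits` classes (512 for `bits = 9`), one thing worth ruling out is quantised audio that was produced at a different bit depth than the model now expects. This is an assumption, not a confirmed diagnosis. A minimal sketch of such a range check in plain Python, with a hypothetical helper name:

```python
def out_of_range_labels(labels, bits):
    """Return the labels that are not valid class indices for a 2**bits-way softmax.

    Hypothetical helper for illustration; `labels` is any iterable of ints.
    """
    n_classes = 2 ** bits
    return [t for t in labels if not 0 <= t < n_classes]


# Made-up example labels: two fit a 9-bit softmax, two do not.
example_labels = [0, 511, 512, 40000]
bad = out_of_range_labels(example_labels, bits=9)
print(bad)  # [512, 40000]
```

On a real batch the same check can be done on the target tensor with its `.min()` and `.max()` before the loss is computed, ideally on the CPU so the error surfaces as a normal Python exception.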