syang1993 / gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

Throws "data must be floating-point" exception after 1k steps

ishandutta2007 opened this issue

Running on the LJ Speech dataset. This is the line where it breaks:

audio.save_wav(waveform, os.path.join(log_dir, 'step-%d-audio.wav' % step))

Starting new training run at commit: None
Generated 32 batches of size 32 in 39.301 sec
Step 1 [43.557 sec/step, loss=0.84572, avg_loss=0.84572]
Step 2 [23.415 sec/step, loss=0.85437, avg_loss=0.85004]
........
........
Step 998 [2.387 sec/step, loss=0.14099, avg_loss=0.14424]
Step 999 [2.387 sec/step, loss=0.14100, avg_loss=0.14422]
Step 1000 [2.380 sec/step, loss=0.14311, avg_loss=0.14418]
Writing summary at step: 1000
Saving checkpoint to: /media/iedc-beast/Disk 1/test/gst-tacotron-master/logs-tacotron/model.ckpt-1000
Saving audio and alignment...
Exiting due to exception: data must be floating-point
Traceback (most recent call last):
File "train.py", line 115, in train
audio.save_wav(waveform, os.path.join(log_dir, 'step-%d-audio.wav' % step))
File "/media/iedc-beast/Disk 1/test/gst-tacotron-master/util/audio.py", line 16, in save_wav
librosa.output.write_wav(path, wav.astype(np.int16), hparams.sample_rate)
File "/usr/local/lib/python3.5/dist-packages/librosa/output.py", line 223, in write_wav
util.valid_audio(y, mono=False)
File "/usr/local/lib/python3.5/dist-packages/librosa/util/utils.py", line 159, in valid_audio
raise ParameterError('data must be floating-point')
librosa.util.exceptions.ParameterError: data must be floating-point
2018-11-24 16:41:57.082342: W tensorflow/core/kernels/queue_base.cc:277] _0_datafeeder/input_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1292, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
[[{{node datafeeder/input_queue_enqueue}} = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/input_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/linear_targets_0_2)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/media/iedc-beast/Disk 1/test/gst-tacotron-master/datasets/datafeeder.py", line 75, in run
self._enqueue_next_group()
File "/media/iedc-beast/Disk 1/test/gst-tacotron-master/datasets/datafeeder.py", line 97, in _enqueue_next_group
self._session.run(self._enqueue_op, feed_dict=feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 887, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1286, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
[[{{node datafeeder/input_queue_enqueue}} = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/input_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/linear_targets_0_2)]]

Caused by op 'datafeeder/input_queue_enqueue', defined at:
File "train.py", line 153, in
main()
File "train.py", line 149, in main
train(log_dir, args)
File "train.py", line 58, in train
feeder = DataFeeder(coord, input_path, hparams)
File "/media/iedc-beast/Disk 1/test/gst-tacotron-master/datasets/datafeeder.py", line 46, in init
self._enqueue_op = queue.enqueue(self._placeholders)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 339, in enqueue
self._queue_ref, vals, name=scope)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3978, in queue_enqueue_v2
timeout_ms=timeout_ms, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3259, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1747, in init
self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Enqueue operation was cancelled
[[{{node datafeeder/input_queue_enqueue}} = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/input_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/linear_targets_0_2)]]

@ishandutta2007 Hi, I guess this is caused by the librosa version. You can modify how the wav is written to match your environment.
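For context, the traceback shows librosa.util.valid_audio rejecting the data because it is not floating-point, i.e. the wav.astype(np.int16) cast in util/audio.py is what trips the check in this librosa version. A minimal sketch of one way to rewrite save_wav, assuming the function scales the waveform to the ±32767 range first (as the upstream Tacotron code does) and keeping whatever hparams import util/audio.py already has, is to write 16-bit PCM with scipy instead of librosa:

import numpy as np
import scipy.io.wavfile

def save_wav(wav, path):
  # Rescale to the int16 range, then write 16-bit PCM directly with scipy.
  # scipy.io.wavfile.write accepts integer data, so librosa's floating-point
  # check is never involved.
  wav = wav * (32767 / max(0.01, np.max(np.abs(wav))))
  scipy.io.wavfile.write(path, hparams.sample_rate, wav.astype(np.int16))

This should keep the saved audio at the same level as before; only the writer changes.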

Thanks a lot @syang1993 for answering. I have been trying to reach you on multiple platforms about models that people have already trained; I'm not sure people look at older threads. It would be great if you could share at least the 200k-step model (the one whose outputs you shared) so we can continue training on top of it.

Hi, I'm sorry, but I'm now doing an internship at a company and cannot retrieve the pre-trained model (I trained it several months ago while doing visiting research in Singapore). You can train it yourself; it should take about 3 days to reach 200k steps.

Well, on our GTX 1080 my estimate is that it will take longer (maybe twice that). And it is not only about time: nowadays people in the ML world burn huge amounts of compute hours and money unnecessarily when sharing could solve a lot of it.
Can you share your email/LinkedIn/Twitter? You seem to be really deep into speech synthesis, and keeping in touch may be useful for both of us.

I trained it on a P40, which may be faster. Yes, you are right, sharing can solve a lot. Maybe that is the purpose of GitHub :)

I'm not so familiar with LinkedIn, so I don't know how to share my ID; here is the link: https://www.linkedin.com/in/yang-shan-182987119

So in China do you use Ushi or Mamai? Let's see if I can connect via them too. :)

Thanks @syang1993 for connecting. I have started the run on our GTX 1080; it will take a month or so to get to 500-600 iterations. We need to get close to Google's performance, or it will be unusable for real-life scenarios. If you have access to more powerful GPUs, it would be a great favour if you could train for more iterations and share the model with the community. So far there is no properly trained Tacotron with style transfer on the internet; this would be the first one.

@ishandutta2007 Usually, Google tends to use a lot of GPUs to train such a model, and they use about 200 hours of data to get their performance, so I think it's hard to reproduce. By the way, one of my friends has started training a model using this repo; I can share it when it's finished.

No wonder Elon Musk fears Google colonising the world :D

@syang1993 what's the best communicator/instant messenger to keep in touch with you? We shouldn't be discussing things unrelated to this thread; I will switch over to the models thread for further updates on this.

Do let me know what's the best way to reach you. Don't hesitate even if I need to install WeChat or something. In India we use:

@ishandutta2007 We mostly use WeChat in China, and my WeChat ID is ys_think. I also use LinkedIn (not often) and Gmail: syang.mix@gmail.com

Just hit the same error. @ishandutta2007 how did you get around this?

I solved it by changing save_wav() in util/audio.py from:

librosa.output.write_wav(path, wav.astype(np.int16), hparams.sample_rate)

to

librosa.output.write_wav(path, wav, hparams.sample_rate)
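Note: dropping the astype(np.int16) cast satisfies librosa's floating-point check, but if save_wav has already scaled the waveform up to the ±32767 range (as the upstream Tacotron code does), the written float samples will sit far outside the nominal [-1, 1] range and may clip on playback. Dividing by 32767 before writing, or switching to the scipy.io.wavfile approach sketched above, keeps the output at the intended level.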